Architectural Issues around Text to Speech

Johnny Taylor johnny at
Fri Apr 19 12:45:09 EDT 2013


"How do we make a selection" with the keyboard alone? That's a great question. My instinct is Shift + arrow keys, but how will the browser know where to start? Firefox lets a user navigate within a page using the cursor keys, and holding Shift while pressing the arrow keys in that mode does select text.

That's pretty powerful, especially in this context. Given this, my question is: could you mimic this behaviour programmatically, to bring the feature to browsers that don't have it, and turn it on when the text-to-speech function is enabled in UIO?
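One way this might be approached is sketched below. It only covers the key-mapping half of the problem: translating arrow keys (with or without Shift) into the arguments that a browser's `Selection.modify(alter, direction, granularity)` call accepts. It assumes the target browser provides `Selection.modify`, which is not guaranteed everywhere, and it does not answer where the caret should start. The names `KeyLike`, `ModifyArgs` and `keyToModify` are hypothetical.

```typescript
// Structural subset of a keyboard event that this sketch relies on.
interface KeyLike {
    key: string;
    shiftKey: boolean;
    ctrlKey: boolean;
}

// Arguments in the shape accepted by Selection.modify(alter, direction, granularity).
interface ModifyArgs {
    alter: "extend" | "move";
    direction: "forward" | "backward";
    granularity: "character" | "word" | "line";
}

// Map an arrow key press to caret movement (no Shift) or selection extension (Shift).
// Returns null for keys this sketch does not handle.
function keyToModify(e: KeyLike): ModifyArgs | null {
    const alter: ModifyArgs["alter"] = e.shiftKey ? "extend" : "move";
    const horizontal: ModifyArgs["granularity"] = e.ctrlKey ? "word" : "character";
    switch (e.key) {
        case "ArrowRight":
            return { alter, direction: "forward", granularity: horizontal };
        case "ArrowLeft":
            return { alter, direction: "backward", granularity: horizontal };
        case "ArrowDown":
            return { alter, direction: "forward", granularity: "line" };
        case "ArrowUp":
            return { alter, direction: "backward", granularity: "line" };
        default:
            return null;
    }
}

// Example: Shift + ArrowRight should extend the selection forward by one character.
const shiftRight = keyToModify({ key: "ArrowRight", shiftKey: true, ctrlKey: false });
```

In a browser, a keydown handler could pass the result to `window.getSelection().modify(...)`, but only after some starting caret position has been established, for example at the beginning of the section being read.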

Not only would this be nice, I can't imagine another, more intuitive or natural way to do it.


On 2013-04-19, at 10:39 AM, Justin Obara <obara.justin at> wrote:

Yesterday afternoon we started talking and tasking out the work for implementing the text-to-speech feature in UI options. The notes from that meeting are up on the wiki.

In looking through the designs, several architectural issues have arisen. In the current designs, text-to-speech will read out the contents of a predesignated section of the page, likely the contents of the <article>. The user will be able to play and pause the reading, and also select a portion of the text, via keyboard or mouse, to start reading from. This raises three high-level questions.

How do we make a selection?
How do we start reading from that selection?
How do we know when a selection was made?

How do we make a selection?

Mouse
This is straightforward, and should be supported on any system that supports a mouse.

Touch
This is also likely handled by any current OS that supports touch.

Keyboard
We should be able to make use of the browser's built-in caret navigation, although this may require the user to enable it in the browser's settings. Safari and Chrome (tested on Mac OS X) seem to behave the same: you first have to double-click on a word before you can use the keyboard to modify the selection. This interaction is not ideal, as the user would still have to use the mouse to start the selection.

How do we start reading from that selection?

This question was particularly nebulous. We would have to know what was selected, what DOM node that selection was from, and where in that DOM node the selection came from. 

Example 1:

<p> A fool thinks himself to be wise, but a <strong>wise man</strong> knows himself to be a fool.</p>

In Example 1, suppose we select "a <strong>wise". There are at least two potential issues: 1) starting in the middle of a DOM node, and 2) crossing DOM nodes.
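To make the two issues concrete, here is a minimal sketch that works on a simplified stand-in for the DOM rather than on real nodes. It flattens the text nodes of a subtree in document order and converts a (text node, offset) pair into an offset in the flattened text, so reading can start mid-node and continue across element boundaries. The types `TextLike`, `ElementLike` and the helper names are hypothetical; real DOM nodes would need an adapter.

```typescript
// Hypothetical minimal model of a DOM subtree: text nodes carry `text`,
// element nodes carry `children`.
interface TextLike { kind: "text"; text: string; }
interface ElementLike { kind: "element"; tag: string; children: NodeLike[]; }
type NodeLike = TextLike | ElementLike;

// Collect all text nodes under `root` in document order.
function textNodes(root: NodeLike): TextLike[] {
    if (root.kind === "text") {
        return [root];
    }
    const out: TextLike[] = [];
    for (const child of root.children) {
        out.push(...textNodes(child));
    }
    return out;
}

// Offset of (node, offset) within the concatenated text of `root`,
// or -1 if `node` is not a text node under `root`.
function absoluteOffset(root: NodeLike, node: TextLike, offset: number): number {
    let total = 0;
    for (const t of textNodes(root)) {
        if (t === node) {
            return total + offset;
        }
        total += t.text.length;
    }
    return -1;
}

// Text to read aloud, starting at the given position and running to the end of `root`.
function textFrom(root: NodeLike, node: TextLike, offset: number): string {
    const abs = absoluteOffset(root, node, offset);
    if (abs < 0) {
        return "";
    }
    return textNodes(root).map((t) => t.text).join("").slice(abs);
}

// Example 1, modelled by hand: the selection starts at the final "a" of the first text node.
const first: TextLike = { kind: "text", text: " A fool thinks himself to be wise, but a " };
const strong: ElementLike = { kind: "element", tag: "strong", children: [{ kind: "text", text: "wise man" }] };
const last: TextLike = { kind: "text", text: " knows himself to be a fool." };
const paragraph: ElementLike = { kind: "element", tag: "p", children: [first, strong, last] };

const selStart = first.text.lastIndexOf("a");
const toRead = textFrom(paragraph, first, selStart);
```

Here `toRead` runs from the selected "a" through the contents of the <strong> element to the end of the paragraph, without the reader needing to know that an element boundary was crossed.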

Example 2:

<p> Give every man thy ear, but few thy voice. </p>

In Example 2, suppose we selected "thy". Since the word "thy" is contained within the text for the node multiple times, how would we know which one was correct?

One possible approach would be to make use of window.getSelection(). This will provide us with a selection object that we can use to get the selected text as well as the node(s) that the selection starts and ends in. We should also be able to determine where in the DOM node the text selection is, making it possible to distinguish between multiple occurrences of the same text.
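A rough sketch of that idea follows. It reads the first range of a selection and reports the start and end nodes together with their offsets, which is what distinguishes the two occurrences of "thy" in Example 2. The interfaces `RangeLike` and `SelectionLike` are hypothetical structural subsets of the browser's Range and Selection objects (only `rangeCount`, `isCollapsed`, `getRangeAt`, `toString` and the range's start/end properties are used), so the function can be exercised here with a hand-built stand-in instead of a live page.

```typescript
// Structural subsets of the browser's Range and Selection objects.
interface RangeLike {
    startContainer: unknown;
    startOffset: number;
    endContainer: unknown;
    endOffset: number;
}

interface SelectionLike {
    rangeCount: number;
    isCollapsed: boolean;
    getRangeAt(index: number): RangeLike;
    toString(): string;
}

// What the reader would need in order to know where to start.
interface SelectionInfo {
    text: string;
    startNode: unknown;
    startOffset: number;
    endNode: unknown;
    endOffset: number;
}

// Returns null when nothing is selected (no ranges, or only a caret).
function describeSelection(sel: SelectionLike | null): SelectionInfo | null {
    if (!sel || sel.rangeCount === 0 || sel.isCollapsed) {
        return null;
    }
    const range = sel.getRangeAt(0);
    return {
        text: sel.toString(),
        startNode: range.startContainer,
        startOffset: range.startOffset,
        endNode: range.endContainer,
        endOffset: range.endOffset,
    };
}

// Example 2, with a hand-built stand-in for the selection of the second "thy".
const quote = "Give every man thy ear, but few thy voice.";
const quoteNode = { text: quote }; // stands in for the paragraph's text node
const secondThy = quote.lastIndexOf("thy");

const fakeSelection: SelectionLike = {
    rangeCount: 1,
    isCollapsed: false,
    getRangeAt: () => ({
        startContainer: quoteNode,
        startOffset: secondThy,
        endContainer: quoteNode,
        endOffset: secondThy + 3,
    }),
    toString: () => "thy",
};

const info = describeSelection(fakeSelection);
```

In a page, the call would presumably be `describeSelection(window.getSelection())`. The selected text alone ("thy") is ambiguous, but the start offset differs between the first and second occurrence.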

There is a question of browser support, particularly for IE 8 and below, but we might be able to find a polyfill to help with that.

How do we know when a selection was made?

There don't seem to be any specific selection events that we could listen to. However, we could probably use mouse presses, key presses and touch events to trigger a check of the selection object (see above).
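A minimal sketch of that polling approach is below. It registers the same check for "mouseup", "keyup" and "touchend", and only reports a selection when the selected text is non-empty and differs from the last check. The choice of those three event types, and the names `EventTargetLike` and `watchSelection`, are assumptions made for illustration; the small `fakeTarget` object only exists so the sketch can run outside a browser.

```typescript
// Structural subset of an event target (for example, the document).
interface EventTargetLike {
    addEventListener(type: string, listener: () => void): void;
    removeEventListener(type: string, listener: () => void): void;
}

// Check the selection after mouse, keyboard and touch interactions.
// `readSelection` returns the currently selected text ("" if none).
// Returns a function that stops watching.
function watchSelection(
    target: EventTargetLike,
    readSelection: () => string,
    onSelect: (text: string) => void
): () => void {
    let previous = "";
    const check = () => {
        const text = readSelection();
        if (text !== "" && text !== previous) {
            onSelect(text);
        }
        previous = text;
    };
    const types = ["mouseup", "keyup", "touchend"];
    for (const type of types) {
        target.addEventListener(type, check);
    }
    return () => {
        for (const type of types) {
            target.removeEventListener(type, check);
        }
    };
}

// Stand-in event target so the sketch can be run without a browser.
const registered: { [type: string]: Array<() => void> } = {};
const fakeTarget = {
    addEventListener(type: string, listener: () => void): void {
        (registered[type] = registered[type] || []).push(listener);
    },
    removeEventListener(type: string, listener: () => void): void {
        registered[type] = (registered[type] || []).filter((l) => l !== listener);
    },
    fire(type: string): void {
        for (const l of (registered[type] || []).slice()) {
            l();
        }
    },
};

let currentText = "";
const reported: string[] = [];
const stopWatching = watchSelection(fakeTarget, () => currentText, (t) => reported.push(t));

currentText = "thy";
fakeTarget.fire("mouseup"); // new selection: reported
fakeTarget.fire("keyup");   // unchanged selection: not reported again
```

In a page, the target would presumably be `document` and `readSelection` something like `() => String(window.getSelection())`. Comparing only the text would miss a move from one "thy" to the other, so a fuller version would likely compare nodes and offsets as well.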

fluid-work mailing list - fluid-work at
To unsubscribe, change settings or access archives,
