Premise: Current HTML tags are geared toward text-based, visual consumption. Audio or audio/video content requires embedded media files, which may not match the typed content of the HTML page. The text itself may not be enunciated as typed, or may otherwise be inaccessible in any precise way to visually impaired individuals. Although other methods exist for invoking spoken text from HTML pages, they are inconsistent with one another. Furthermore, when a global “speak” or page-based TTS solution is used, context or even content may be lost to markup added later, such as interstitial marketing ads or inline links to suggested additional reading. A dedicated tag would precisely delineate what is to be spoken.
Suggestion: I suggest introducing a new tag called “speak”. This tag would instruct the browser to speak the enclosed text aloud, using TTS (text-to-speech) or other functionality determined by the browser itself.
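As a rough illustration of the delineation described above, the markup might look like the sketch below. The speak tag is the proposed element and does not exist in HTML today; the surrounding elements and the sample text are only placeholders.

```html
<!-- Hypothetical markup: <speak> is the proposed tag, not an existing HTML element. -->
<p><speak>Mix the flour and water, then let the dough rest for an hour.</speak></p>

<!-- Not wrapped in <speak>, so under this proposal it would not be read aloud. -->
<aside>Suggested reading: ten more bread recipes</aside>
```

Only the sentence inside the tag would be spoken, so the suggested-reading block inserted nearby would not interrupt it.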
Additional suggestions:
text="" to define alternate text; if present, it is the spoken text. The visible text between the tags remains the same.
type=onclick, onload, etc. to define the browser interaction that triggers speech.
language=English, French, etc. to define the spoken language if it is not the browser default.
highlight=word, letter, wordunderline, letterunderline, etc. to define how the word or letter currently being spoken is highlighted.
pip=short, long to define the length of the audio pip (cue) played when the tag is in the hover state.
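The suggested attributes could be combined roughly as in the following sketch. The attribute names and values are the ones proposed above; the sample sentences and the exact quoting of values are assumptions made for illustration, and none of this is existing HTML.

```html
<!-- Hypothetical markup throughout: neither <speak> nor these attributes exist in HTML today. -->

<!-- text: the spoken text differs from the visible text, which stays as typed. -->
<speak text="Doctor Smith lives on Elm Street">Dr. Smith lives on Elm St.</speak>

<!-- type, language, highlight: speak on click, in French, highlighting each word as it is spoken. -->
<speak type="onclick" language="French" highlight="word">Bonjour, comment allez-vous ?</speak>

<!-- pip: play a short audio cue when the pointer hovers over the tag. -->
<speak type="onclick" pip="short">I would like a glass of water.</speak>
```

In the first line a sighted reader still sees the abbreviations, while a listener would hear them expanded, which is the kind of precise enunciation the premise asks for.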
Example use case(s) and intended audience:
Speaking page contents: Blind individuals, who otherwise cannot use browser gestures to locate and, more precisely, select and play content (text).
Click to speak, for direct consumption: Children learning to read. Foreign language learning.
Click to speak, for communication to others: Individuals with autism who experience selective mutism. Individuals with other disabilities that require selecting words to be spoken.
Additional uses: Spoken notation, citation, or pronunciation of selected visible text.
Thanks for your consideration! Respectfully, Brian A. Newbold