Speech Viewer Logs of Lies
When sighted users test with a screen reader it is common to rely on the visual output — checking to see where focus goes, confirming that controls behave, watching the spoken output in a text log.
The problem is that what you see in those speech viewers is not always what you hear.
Consider how an emoji is represented in a log versus how it is announced. While visually it may be simple, it can be an audible assault for a screen reader user. 👏 is much easier to skim past when looking at a tweet, but harder when its name is announced aloud every time it appears.
The same is true for other extended characters, as in the following example…
When we rely on the display of the speech instead of listening to it, we as developers or testers or authors are unaware that ┏ is announced as "box drawings heavy down and right" in VoiceOver. Similarly, the related ━, which may look like an em-dash, is announced as "box drawings heavy horizontal".
If I had just relied on the VoiceOver speech viewer, those captions would have been much easier to type and also completely inaccurate.
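One way to get an early hint that content may be noisier than it looks is to list the Unicode names of its non-ASCII characters. What follows is a minimal Python sketch using the standard unicodedata module; the helper name unicode_names and the sample string are made up for illustration. Unicode names are not necessarily what any given screen reader will say, since each has its own dictionaries and settings, so treat the result as a list of things to go listen to and not as a substitute for listening.

```python
import unicodedata


def unicode_names(text):
    """Return (character, Unicode name) pairs for each non-ASCII character.

    Hypothetical helper: the names come from the Unicode Character Database,
    not from any screen reader, so they only hint at a possible announcement.
    """
    pairs = []
    for char in text:
        if ord(char) > 127:
            # Fall back to the code point when a character has no name.
            name = unicodedata.name(char, f"U+{ord(char):04X}")
            pairs.append((char, name.lower()))
    return pairs


if __name__ == "__main__":
    # Sample string is an assumption, not content from the original post.
    for char, name in unicode_names("┏━ Great post 👏"):
        print(f"{char}\t{name}")
```

Plain ASCII text returns an empty list, which is its own reminder: the mundane content further down would pass this check and still be announced in unexpected ways.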
Obviously these characters are exceptions. Using extended characters always comes with risk, but sometimes even the most mundane text can have an unexpected announcement. I made a quick demo with common content I see on the web:
Here is how that content is recorded in the JAWS Speech History window:
Here is how that same content sounds when spoken — I have added captions that reflect the full announcement, so please enable them and read along as you listen:
Demo vertical bar ess are Announcements heading level one
First Name colon
First Name colon edit type in text
Submit greater button
list of one items
visited link why dash ARIA one point one
NASA and it got together on eight slash four slash twenty twenty at ten AM
white up pointing backhand index what he said
I want to impress upon you, dear reader, that I am only using JAWS because it was quickest for me. Each screen reader will have slightly different heuristics for how these are announced. I simply do not have time to record them all. This post is about assumptions regardless of screen reader.
Do not use this as justification to try to override what a screen reader says. I noted this on Stack Overflow, back when I was wasting too much time there. It has come up on the WebAIM list many times. I have to explain it to clients on the regular.
If you read my post Stop Giving Control Hints to Screen Readers then you already know trying to tailor content for screen reader users can be problematic. Similarly, forcing your preferred pronunciation or phrasing is a bad idea and should never be done (unless backed up by user research and a darn good use case).
With all that out of the way, I see two risks from being unaware of what a screen reader is actually saying:
- QA teams rely on the output matching a speech log;
- verbose interfaces (that sometimes hide important cues).
I have worked with QA teams who send something back to developers because what they hear does not exactly match a provided screen reader log. Often that log is from a different screen reader / browser pairing or a customized set-up not used by QA.
Similarly, if as a developer all I do is ensure the extended characters (emoji, math symbols, etc.) show up in the speech viewer but do not confirm how they are announced (if at all), I can create an unusable interface.
Screen readers offer settings that let you control how verbose they are when encountering punctuation, certain characters, abbreviations, acronyms, or even words. A skilled screen reader user may take advantage of those features, but you cannot count on users doing it and you cannot suggest that as a fix.
What to Do?
Pay attention to times, dates, currency, punctuation, special characters, emoji, math symbols, common (and uncommon) abbreviations & acronyms, and so on. Be aware of how they are announced across every screen reader and browser pairing that will be used to read them. When testing:
- run the screen reader with default settings (barring special cases);
- do not test a screen reader with your audio muted;
- do not rely on the speech viewer / log;
- be wary of any automated test that provides you with a log of screen reader output.
If you are working with multilingual content (truly multilingual, not just the word panini in a description of your lunch) then be sure the lang attribute and the appropriate values are in place. This will help limit incorrect pronunciation from incorrectly marked-up content.
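A quick way to see which language declarations a page actually carries is to list every lang attribute in its markup. This is a minimal sketch using Python's standard html.parser module; the names LangCollector and declared_languages, along with the sample markup, are hypothetical and only there for illustration. It reports what is declared, nothing more. Whether the declared language matches the text, and how it sounds, still needs a person listening.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect (tag, lang value) pairs for every element with a lang attribute."""

    def __init__(self):
        super().__init__()
        self.langs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lowercases names.
        for name, value in attrs:
            if name == "lang":
                self.langs.append((tag, value))


def declared_languages(markup):
    """Return the lang declarations found in a string of HTML, in document order."""
    collector = LangCollector()
    collector.feed(markup)
    collector.close()
    return collector.langs


# Sample markup is an assumption made up for this sketch.
SAMPLE = """<!doctype html>
<html lang="en">
<body>
<p>The sign read <span lang="fr">Défense de fumer</span>.</p>
</body>
</html>"""

if __name__ == "__main__":
    for tag, value in declared_languages(SAMPLE):
        print(tag, value)
```

An empty result for a full page means there is no lang on the html element at all, which is worth fixing before worrying about individual phrases.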
Update: Later that afternoon…
The problem with hyperbolic headlines is they can lead to overly-broad statements on Twitter, such as when I first shared this and suggested that you might be doing it wrong if not listening to the output. While Steve gave it the appropriate wary emoji, the following response drives home the point:
Hmm. I rely on the text output, but I'm also deaf, so…
One of the benefits of the de facto peer review in accessibility is when people point out a flaw in your over-simplified (character-constrained) conclusions. Making sure comments are enabled on your posts and replies allowed on your tweets is a good way to be certain you are not lost in an echo chamber of one.
Similarly, folks can identify other use cases that you failed to mention:
Also, the text log is faster than audio announcements (depending on speed of audio). So some announcements may show up in the log, but the user will never hear them due to live regions and other announcements that mess with the announcement queue.
Considering I have seen this happen (QA signs off on a thing, even though the thing is never announced due to an unnecessary live region interrupting it), it should have occurred to me to include it in the post.
Update: 24 August 2020
This is a great point about JAWS, and one that likely applies to the other screen readers (namely that text viewers are not QA tools; they are there to support end users).
The Text Viewer and History Mode in JAWS were not designed for QA. The Text Viewer was designed for teachers and parents and the History Mode was for the JAWS user. As Adrian said, users can change a lot of settings like punctuation, number processing, etc. Have real users test.
There is a solution to these kinds of problems. Certainly all web pages should go through QA, and improvements should be made as far as possible from their input. But unless you have an expert accessibility consultant on the team (which is not usual), the next step is to send the page to an external accessibility audit consultant for them to test.
They will use screen readers on it and listen to what the reader actually says; they won’t look at logs, which as pointed out are not helpful. (You also need to invest in having them test on both desktop screen readers and screen readers on mobile devices, as they produce different results.) An experienced audit consultant will also find any other accessibility defects on the page, to help users with disabilities other than blindness. They have the expertise to do the whole job for you: they do this kind of work full time, not just as a small bit-part among all the other kinds of testing to be done.