Stop Giving Control Hints to Screen Readers
TL;DR: for standard HTML controls and standard ARIA widget patterns, you do not need to add instructions telling screen reader users what they are or how to use them.
When a screen reader encounters an element on the page that invites interaction beyond reading, it typically provides the user with instructions on how to interact with it. As users get more familiar with their tools, they can skip some of those instructions or disable them altogether.
If you add your own instructions on top of these screen reader defaults, they may conflict with the defaults, they may be redundant, they may not apply to the user's form factor, and they will be verbose. If you put them in an `aria-label`, they will also not be translated (except for some Chrome users).
This also extends to telling users what a control is and, in some cases, telling them its state. A checkbox with the `checked` attribute tells a user it is checked. A `<button aria-pressed="true">` tells a user it is pressed.
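As a minimal sketch (the exact wording varies by screen reader and browser pairing), the state attributes alone are enough — no hidden hint text required:

```html
<!-- Announced as something like "Subscribe, checkbox, checked" -->
<label>
  <input type="checkbox" checked>
  Subscribe
</label>

<!-- Announced as something like "Mute, toggle button, pressed" -->
<button type="button" aria-pressed="true">Mute</button>

<!-- No need for extra text such as:
     <span class="visually-hidden">checked, press Space to toggle</span> -->
```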
Examples I have seen, heavily simplified:
```html
<button aria-label="Click button to mark">[icon]</button>

<fieldset>
  <legend aria-label="Use arrow keys to choose one.">Choose one</legend>
  [… radio buttons …]
</fieldset>

<th aria-sort="ascending">
  Date<span class="visually-hidden">, sorted ascending</span>
</th>

<li role="presentation">
  <a href="#Reviews" role="tab" id="Tab1" onclick="[…]">
    Reviews<span class="visually-hidden"> Tab</span>
  </a>
</li>
```
When you are using native HTML for its intended purpose, a screen reader will give appropriate instructions. When you are using ARIA patterns (see widget roles) as defined (without folding in your own customizations), a screen reader will give appropriate instructions.
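For contrast, here is a sketch of how those same examples could drop the redundant hints (assuming the surrounding radio-group and tab scripting already follows the ARIA patterns):

```html
<!-- The accessible name describes the action; the role supplies "button". -->
<button aria-label="Mark">[icon]</button>

<!-- The visible legend is the accessible name; no arrow-key instructions. -->
<fieldset>
  <legend>Choose one</legend>
  [… radio buttons …]
</fieldset>

<!-- aria-sort already conveys the state; no hidden text needed. -->
<th aria-sort="ascending">Date</th>

<!-- The tab role already conveys "tab"; no hidden text needed. -->
<li role="presentation">
  <a href="#Reviews" role="tab" id="Tab1" onclick="[…]">Reviews</a>
</li>
```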
Similarly, avoid naming a specific pattern. Different browser and screen reader pairings can call the same control different things. For example, no user knows what a combobox is (nor, interestingly, do half the developers I speak to), so probably do not use that word with users.
To simplify how confusing this can be:

- You say scroll, screen reader says […],
- You say sort button, screen reader says sort button button,
- You say press Enter, user is on a touch display,
- You say open combobox, user is a human.
All of this, of course, goes out the window once you start building custom patterns, modifying existing patterns, or rolling your own custom elements. Then you have to not only provide everything but potentially override the defaults.
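As a rough sketch of how much a custom control has to recreate (the names and handler here are my own illustration), even a fake button needs the role, the keyboard focus, and the keyboard activation that `<button>` gives you for free:

```html
<!-- A real <button> provides all of this by itself. -->
<div role="button" tabindex="0" id="fake-button">Save</div>

<script>
  const fakeButton = document.getElementById('fake-button');

  function activate() {
    // Whatever the button is supposed to do.
  }

  fakeButton.addEventListener('click', activate);
  fakeButton.addEventListener('keydown', (event) => {
    // Native buttons fire on both Enter and Space; we must script both.
    if (event.key === 'Enter' || event.key === ' ') {
      event.preventDefault();
      activate();
    }
  });
</script>
```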
It’s pretty cool what standard HTML gives you for free.
I put together a page with some of the more common patterns that I see turned into verbose gibberish, but I have excluded the gibberish. If you have a screen reader, play around with them yourself. Hint: you have a screen reader; there is one built into the device you are using right now.
Following this I have videos across some screen reader and browser combinations demonstrating how each is announced.
Using the example patterns above I recorded some basic interactions across seven screen reader and browser combinations. These are meant to demonstrate how each is announced by default. I am not just pressing the Tab key; in the tables I use table navigation commands, for example.
If you find you are getting different announcements on the example page, that may be a function of verbosity settings (I leave mine at defaults), a different screen reader version, a different browser version, or potentially browser plug-ins that manipulate the DOM.
You may see in the videos another section that I do not navigate and that is not in the sample above. I am saving that for a follow-up post after more testing.
How would you consider a grid of cards (container with role="grid")? Would adding keyboard navigation instructions ("use arrows to navigate cards") as a description, not as a label, be redundant according to the above argument?
Neil, visible plain text instructions would be necessary in that model as it affects all users. However, note that arrow keys are intercepted by screen readers for virtual cursor navigation, so you may find that your pattern does not work well for screen reader users anyway.
As an aside, so far the only valid use case I have found for the `grid` role is to create a spreadsheet (like Microsoft Excel or Google Sheets), and those also use an `application` role on an ancestor, something I would not recommend for a card interface.
Let me elaborate a bit about the scenario I am facing: a long list of cards, where each card has a heading, a short paragraph, and a CTA button.
I think the challenge I am facing here is to balance good keyboard usability — allowing quick card navigation — with good screen reader usability — communicating the presentation order through the grid semantics, probably at the cost of overriding arrow navigation for Windows screen reader users.
So, my following questions would be:
#1 Would the above affect all users or all keyboard users? If keyboard users only, I thought I might resolve this by displaying the hidden description on first-item focus (as a tooltip). This way all keyboard users (sighted and blind) would be able to perceive the relevant information for keyboard navigation.
#2 Regarding the conflict with screen readers, considering the use case detailed above, wouldn't space-separated values of the heading id and the paragraph id in the aria-labelledby attribute on the card compensate for this "conflict"? More importantly, wouldn't the grid semantics trigger focus mode?
Considering the use of the dialog role for modals, the potential benefit of adding grid semantics could be the mere indication of presentation order for screen reader users.
You describe the content of a typical card. I am still assuming you mean `role="grid"` on the container. This does not correspond to the typical use of a grid role. If you are trying to communicate the presentation order, there may be a greater risk there, since the DOM order should be the presentation order for keyboard and screen reader users.
Remember that a screen reader user can navigate the content natively with Tab to traverse the interactive controls, the h key to traverse headings, and arrow keys to traverse the content.
1. What you propose would affect anyone who uses a keyboard. Separately, do not rely on a tooltip or other focus-only display, as that excludes touch and voice users.
2. No and no. You would create an overly verbose control (for browsers that support `aria-labelledby` on non-interactive elements), and grid semantics on their own do not add keyboard support; you still have to script it.
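A sketch of what that scripting entails (a roving tabindex; the ids and structure here are my own illustration, not a drop-in implementation — a real grid also needs vertical navigation, Home/End, and more):

```html
<div role="grid" id="card-grid">
  <div role="row">
    <div role="gridcell" tabindex="0">Card 1</div>
    <div role="gridcell" tabindex="-1">Card 2</div>
    <div role="gridcell" tabindex="-1">Card 3</div>
  </div>
</div>

<script>
  // Roving tabindex: one cell is tabbable at a time;
  // arrow keys move focus between cells.
  const grid = document.getElementById('card-grid');
  const cells = Array.from(grid.querySelectorAll('[role="gridcell"]'));

  grid.addEventListener('keydown', (event) => {
    const index = cells.indexOf(document.activeElement);
    if (index === -1) return;
    let next;
    if (event.key === 'ArrowRight') next = index + 1;
    if (event.key === 'ArrowLeft') next = index - 1;
    if (next === undefined || !cells[next]) return;
    event.preventDefault();
    cells[index].tabIndex = -1;
    cells[next].tabIndex = 0;
    cells[next].focus();
  });
</script>
```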
I do not think the dialog role is a good analogue here, since it does not indicate presentation order either.
This is a little hard to discuss in comments, so if you have a sample drop me an email.
What do you think about hints for search boxes for autosuggest and autocomplete controls; i.e., what the WAI-ARIA Authoring Practices refer to as "Combo Box" widgets? These seem like examples where hints are almost always absolutely necessary.
JAWS gives instructions for these already if you follow the ARIA authoring patterns. I’d love it if the other screen readers would do the same.
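For reference, the combobox structure those instructions key off is roughly this (a skeleton only, with my own ids; the expand/collapse and option-selection behavior still has to be scripted per the pattern):

```html
<label for="fruit">Fruit</label>
<input id="fruit" type="text" role="combobox"
       aria-expanded="false" aria-controls="fruit-listbox"
       aria-autocomplete="list">
<ul id="fruit-listbox" role="listbox" hidden>
  <li role="option">Apple</li>
  <li role="option">Banana</li>
</ul>
```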
You say open combobox, user is a human.
Apparently emojis don’t work in your comments. That was supposed to be the quote followed by laughing emojis. :)
(I am testing if WordPress commenting can handle them — and no it cannot)
Not really getting the hint with the “built-in screen reader” .. where is that supposed to be located? Or are you intentionally ignoring the fact that there is more than just Windoze and Mac OS Argh on the desktop platform?
My statement "Hint: you have a screen reader; there is one built into the device you are using right now." is broad because the bulk of my traffic is from Windows and then macOS.
Windows has Narrator, macOS has VoiceOver, GNOME has Orca, iOS has VoiceOver, Android has TalkBack.
If you have an OS outside of that collection, then check your documentation. If you have one of the OSes I listed, check my old answer on Stack Overflow that has some links.