Where to Put Focus When Opening a Modal Dialog

TL;DR: blanket statements about where to put focus when opening a modal dialog are wrong, including this one.

This post is meant to help you, an intelligent and thoughtful and empathetic reader, figure out where you should set focus. The scenarios are non-exhaustive.

Messages

I’m artificially breaking these into three kinds of messages.

Short, Informational Messages

For messages that are simple, brief, devoid of interactive elements, not full of complex structure, and likely confirming an assumption, dropping focus on the close button is fine. You can even use aria-describedby to reference the message text (as in the demo).

The benefit here is the user can quickly dismiss it and get on with their day. Accidentally dismissing it is harmless for everyone.
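Here is a minimal sketch of that arrangement in TypeScript. The function name describeAndFocusClose and the MessageElementLike interface are made up for illustration; the interface lists only the DOM members the sketch touches, so real HTMLElement objects should satisfy it. It assumes the dialog is already open and the message element has an id.

```typescript
// Only the DOM members this sketch uses. The interface name is hypothetical;
// real HTMLElement objects have all three members.
interface MessageElementLike {
  id: string;
  setAttribute(name: string, value: string): void;
  focus(): void;
}

// Hypothetical helper: tie the dialog to its message text, then focus the close button.
function describeAndFocusClose(
  dialog: MessageElementLike,
  message: MessageElementLike,
  closeButton: MessageElementLike
): void {
  // The description is exposed as a flat string, so this suits short, plain text only.
  dialog.setAttribute("aria-describedby", message.id);
  // Focus lands on the close button so the user can dismiss right away.
  closeButton.focus();
}
```

If you are using a native dialog element, you would likely call this right after showModal(). Treat it as a starting point, not a rule.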

If that message is more complex or includes information the user may not expect, then focusing on the close button could be bad. Especially if the user cannot quickly or easily get back to the message (if at all).

The risk here is the user can accidentally dismiss the message, pulling them from their task, adding stress, and possibly resulting in more exciting problems.

Longer, Interactive Messages

For messages that are complex, longer, may have interactive elements, are riddled with complex structure, or may be surprising to the user, put the focus somewhere less risky. Perhaps the dialog itself or its (primary) heading. Complex text will not benefit from an aria-describedby reference (since it’s exposed as a flat string).
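Putting focus on the dialog or its heading takes one extra step, because neither is focusable from script by default. A minimal sketch in TypeScript follows; the names focusForOrientation and OrientationTarget are hypothetical, and the interface lists only the two DOM methods used, so a real heading or dialog element should fit.

```typescript
// Only the DOM methods this sketch uses. The interface name is hypothetical.
interface OrientationTarget {
  setAttribute(name: string, value: string): void;
  focus(): void;
}

// Hypothetical helper: make a non-interactive element (the dialog container
// or its primary heading) focusable from script, then move focus to it.
function focusForOrientation(target: OrientationTarget): void {
  // tabindex="-1" permits script focus without adding the element to the Tab order.
  target.setAttribute("tabindex", "-1");
  target.focus();
}
```

Whether the dialog or the heading is the better target depends on your content and how each is announced, so test both with your users.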

A terms of service or other legal-ish document is a case where you don’t want to drop the user onto the “I agree” button. The risk is the user may accidentally trigger it when the modal first appears. If you are doing that intentionally, I want you to consider if you’re building dark patterns.

Dropping focus there also adds interaction effort, as a reader may need to scroll back to the top to start reading. A screen reader user in particular has the extra hassle of finding the top (particularly with “novel” structures).

Action Required Messages

If you present users with messages where they have to make a choice, then focus should go on the thing that is least destructive. Essentially, the thing that, when activated, leaves the user no worse off.

You don’t want to put a user in a situation where they accidentally agree to sell their kidney. Nor do you want the user to accidentally decline a free kidney.

In many cases this would be the close button. Or, rather, the close button should always be the least destructive option.
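One way to make "least destructive" concrete is to score each choice and focus the lowest. This TypeScript sketch does that; the DialogAction shape and the harm scale are assumptions of mine, not an established convention, and assigning the scores still takes human judgment.

```typescript
// Hypothetical description of one choice offered in the dialog.
interface DialogAction {
  label: string;
  // Assumed scale: 0 means activating it leaves the user no worse off; higher is riskier.
  harm: number;
}

// Return the action with the lowest harm score; ties keep the earlier action.
function leastDestructive(actions: DialogAction[]): DialogAction {
  if (actions.length === 0) {
    throw new Error("A choice dialog needs at least one action.");
  }
  return actions.reduce((best, next) => (next.harm < best.harm ? next : best));
}

const choices: DialogAction[] = [
  { label: "Agree to sell my kidney", harm: 10 },
  { label: "Close", harm: 0 },
];
console.log(leastDestructive(choices).label); // prints "Close"
```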

Forms

I’m splitting this into two, which really isn’t enough for the nuance you need to consider.

Briefer Forms

A brief form may be a login form. Users familiar with the site, who have intentionally chosen to log in, are probably fine with putting focus onto the first field.

That is one fewer interaction step, since they already conveyed their intent. Popping the mobile keyboard immediately can be helpful (which generally happens when focusing a text input).

Note that I didn’t say the first interactive control. That could be the close button. I also didn’t say the username field. For all I know, your login requires the user to choose their region first.

Also note that I qualified this with both familiarity and intent. Unexpected marketing modals that drop focus into a field asking for the user’s email address are an example of a user-hostile dark pattern.

Longer Forms

Longer forms require a bit more thought. And maybe a discussion of why you’re putting long forms in dialogs at all.

In cases where the form is long, where the user may not be familiar with it, or where the user may not expect a form from whatever triggered the modal, maybe don’t put focus on a field.

Consider dropping focus onto the dialog or its primary heading. This can give the user a chance to orient.

If you drop focus onto the first field, mobile users in particular can be at a disadvantage because their soft keyboard will pop open (also true for tablets and keyboard-less desktop-class computers). They have to back out of that context just to get a chance to orient themselves to what’s being asked.

This can be a challenge for screen reader users as well, regardless of whether they use a desktop or mobile device.

Note that I qualified this with both familiarity and expectation. If the form is part of a flow with which a user is familiar, it may make sense to drop focus onto that first field. Those users likely don’t need to orient and simply want to continue their task.
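The two qualifiers for forms can be written down as a small decision function. This TypeScript sketch is a deliberate simplification: the names FormContext and chooseFormFocus are hypothetical, and reducing familiarity and expectation to two booleans is my assumption. Real cases will need more nuance.

```typescript
// Where focus might go when a form dialog opens.
type FormFocusTarget = "first-field" | "dialog-or-heading";

// Hypothetical inputs; both are judgment calls you make about your users.
interface FormContext {
  userIsFamiliar: boolean; // the user knows this site or flow
  formIsExpected: boolean; // the user asked for, or anticipates, this form
}

// Focus the first field only when the user is both familiar and expecting the form.
// Otherwise give them a chance to orient first.
function chooseFormFocus(ctx: FormContext): FormFocusTarget {
  return ctx.userIsFamiliar && ctx.formIsExpected
    ? "first-field"
    : "dialog-or-heading";
}

// A returning user who chose to log in:
console.log(chooseFormFocus({ userIsFamiliar: true, formIsExpected: true })); // prints "first-field"
// An unexpected marketing modal asking for an email address:
console.log(chooseFormFocus({ userIsFamiliar: true, formIsExpected: false })); // prints "dialog-or-heading"
```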

Questions

Possible questions to ask that can inform how you manage focus:

Examples

I’ve made some example modal dialogs for you to try (embedded below or as a cruft-free version). You can open each modal two ways — each with focus set on a different thing. I have split them into possibly good focus placement and probably less good focus placement.

These won’t satisfy your use cases (because everyone’s use case is special and unique and wonderful). These won’t satisfy your users’ experiences and expectations (because there are lots). These will, however, function. Probably.

See the Pen Untitled by Adrian Roselli (@aardrian) on CodePen.
