Five Automated Accessibility Checkers, and Where Each One Gets It Wrong

Open any website, run Lighthouse, and see Accessibility: 98. Close the tab with a sense of accomplishment. Now try to get through the same site with the keyboard alone, without touching the mouse. Or turn on VoiceOver and listen to how the main form is read out. It is a mess that Lighthouse never noticed.

Automated accessibility checkers are useful tools. But they check only what a machine can check: contrast, the presence of alt text, the correctness of ARIA attributes, the validity of markup. Everything that involves meaning, context, and real usage is left out. By various estimates, automation catches about a third of WCAG violations. The remaining two-thirds is where accessibility actually lives.

What follows is a review of five popular tools: what each does well, where it misleads, and in which situations its green badge cannot be trusted at all.

Why 100/100 is a trap, not a goal

The main problem is not the tools. The problem is how teams use them.

A typical scenario: the product manager sets the task "bring the accessibility score to 95+". The designer and the front-end developer fix whatever the checker highlights. The score grows. Everyone is happy. And a real user with a screen reader still cannot place an order, because focus jumps around unpredictably in the modal and the "Confirm" button is announced as "button-primary-2".

The green badge creates the illusion of a solved problem. That is worse than not checking at all, because it switches off further thinking.

What automation cannot really test:

  • Whether the alt text makes sense (alt="image" is formally valid)
  • Whether the reading order and the focus order are logical
  • Whether the order conveyed visually makes sense to a blind user
  • Whether a custom component works from the keyboard
  • Whether the interface breaks at 200% zoom or more
  • Whether dynamic updates (live regions) are announced correctly
  • Whether text contrast over a gradient or a photo is sufficient

If the team is stuck on "we're fine, Lighthouse is green", it is worth doing a live keyboard-only walkthrough at a retro, just once. The effect is sobering.

axe DevTools: the best of the machines, but still a machine

Deque's axe-core is the engine behind half of the industry (including Lighthouse's built-in accessibility audit). The axe DevTools extension for Chrome is the de facto standard for quick manual checks.

What it does well: it catches almost all formal violations of WCAG A and AA, gives clear references to the rules, points at the specific DOM node, and suggests a fix. False positives are rare: if axe says "violation", it most likely is one.

Where axe is wrong

ARIA. axe will verify that role="dialog" has aria-labelledby. But it does not check that the reference points to a meaningful title rather than an empty <span>. Technically valid. In practice, the screen reader user hears "dialog", then silence.

Custom components. A hand-rolled dropdown without a single ARIA attribute is simply ignored by axe: to it, this is a pile of divs with nothing to complain about. Paradoxically, the worse the component is built, the fewer reasons axe has to complain.

Order and focus. axe does not tab through the interface. If focus stays on the button underneath after a modal opens, axe will not see it.

How to use it without fooling yourself

  • Run axe on every PR, but only as a lower bar, not as a final test
  • For each component in the design system, write a manual keyboard test script
  • Once per sprint, walk through the critical flow with a screen reader (VoiceOver on macOS, NVDA on Windows)
  • When reviewing a PR with new ARIA attributes, check not "the attribute is there" but "what it announces"
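
One way to wire axe in as a lower bar in CI is a small gate over its results. The sketch below is illustrative, not a setup described in this article: the AxeViolation type mirrors only a subset of the fields axe-core reports for each violation (id, impact, help, nodes), while axeGate, BLOCKING_IMPACTS, and the policy "serious and critical block the merge" are assumptions made up for the example.

```typescript
// Subset of the axe-core violation shape; the real objects carry more fields.
type Impact = "minor" | "moderate" | "serious" | "critical";

interface AxeViolation {
  id: string; // rule id, for example "color-contrast"
  impact: Impact | null;
  help: string;
  nodes: { target: string[] }[];
}

interface GateResult {
  passed: boolean;
  blocking: string[]; // human-readable lines for the CI log
  manualReminder: string;
}

// Hypothetical policy: only these impact levels block the merge.
const BLOCKING_IMPACTS: ReadonlySet<Impact> = new Set<Impact>(["serious", "critical"]);

function axeGate(violations: AxeViolation[]): GateResult {
  const blocking = violations
    .filter((v) => v.impact !== null && BLOCKING_IMPACTS.has(v.impact))
    .map((v) => `${v.id} (${v.impact}): ${v.help} [${v.nodes.length} node(s)]`);

  return {
    passed: blocking.length === 0,
    blocking,
    // A green gate only means "no formal violations were found".
    manualReminder: "Still required: keyboard pass and screen reader pass of the changed flow.",
  };
}

// Made-up sample data in the shape described above.
const sampleViolations: AxeViolation[] = [
  {
    id: "color-contrast",
    impact: "serious",
    help: "Elements must meet minimum color contrast ratio thresholds",
    nodes: [{ target: [".hint"] }],
  },
  {
    id: "region",
    impact: "moderate",
    help: "All page content should be contained by landmarks",
    nodes: [{ target: ["div"] }],
  },
];

console.log(axeGate(sampleViolations));
```

The reminder string is part of the result on purpose: even a passing gate prints what still has to be checked by hand.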

Questions for code review that axe will not ask:

  • Where does focus go after this popup closes?
  • What will the user hear when they press this button?
  • Is the icon's meaning clear without a caption and without alt text?
  • If you turn off CSS, does the reading order still hold?

WAVE: a visual audit that lulls your vigilance

WebAIM's WAVE is an extension that overlays icons for errors, warnings, and structural elements directly on the page. It is convenient for a quick overview: you can see where the headings are, where the landmarks are, and where the trouble spots are.

The main advantage is visibility. You can grasp the page structure at a glance in a minute, without digging into DevTools. It works well for a client demo or for onboarding new team members: "Look, this is what a page without proper semantics looks like."

Where WAVE fails

The tool leans heavily on "alert" icons: yellow warnings that are not errors but need human attention. In practice they get ignored: "Yellow is not red, let's move on." Yet the yellow ones hold the most interesting findings: suspicious alt text, empty links, duplicate headings.

Another weakness is SPAs. WAVE checks what is rendered at the moment it runs. Dynamically loaded content, modals, and post-interaction states do not make it into the report. On a static landing page it works perfectly; on a complex application it shows half the picture.

A separate pain point is contrast. WAVE checks text color and background color as two values. Text over a photo, a gradient, or a translucent overlay is a blind spot.
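
For reference, the ratio a checker computes from those two values can be sketched as follows. The luminance and ratio formulas are the ones defined in WCAG 2.x; worstContrast and the sampled gradient stops are hypothetical additions, included only to show why one text color over a gradient has no single answer.

```typescript
type Rgb = [number, number, number]; // 0-255 per channel

// WCAG 2.x relative luminance of an sRGB color.
function relativeLuminance([r, g, b]: Rgb): number {
  const linear = (channel: number): number => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b);
}

// WCAG 2.x contrast ratio: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Hypothetical helper: worst-case ratio of one text color against
// several colors sampled from the real background.
function worstContrast(text: Rgb, backgroundSamples: Rgb[]): number {
  if (backgroundSamples.length === 0) {
    throw new Error("At least one background sample is required");
  }
  return Math.min(...backgroundSamples.map((bg) => contrastRatio(text, bg)));
}

// Made-up example: white text over a gradient from black to light gray.
const white: Rgb = [255, 255, 255];
const gradientStops: Rgb[] = [
  [0, 0, 0],
  [119, 119, 119],
  [200, 200, 200],
];

console.log(contrastRatio(white, gradientStops[0]).toFixed(2)); // darkest stop
console.log(worstContrast(white, gradientStops).toFixed(2)); // lightest stop decides
```

A tool that samples only the darkest stop reports the maximum 21:1, while the same text over the lightest stop falls well below 4.5:1. Which part of the gradient ends up behind the text is something only a person looking at the rendered page can tell.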

Short summary of the first two tools: axe and WAVE solve different problems. axe is for CI and formal rule checks; WAVE is for a quick visual inspection of structure. Neither replaces a manual pass with the keyboard and a screen reader. And neither answers the main question: is the interface understandable to a person who perceives it differently than we do?

Lighthouse: convenient, fast, superficial

Lighthouse is built into Chrome DevTools and plugs into any reasonable CI setup. You run it and get a nice score out of 100 with a green badge. And that is the problem: the team looks at the number, not the report.

Under the hood, Lighthouse runs a stripped-down set of axe-core rules plus a few checks of its own. "Stripped-down" is the key word: some axe rules are turned off in Lighthouse, some are simplified. So "Lighthouse: 100, axe DevTools: 12 violations" is a common occurrence, not an anomaly.

Where Lighthouse is wrong

Score-driven development. The green circle becomes the target. The team fixes exactly what keeps it from reaching 100 and does not touch what Lighthouse cannot check in principle. This is worse than not checking at all, because it creates false confidence.

Only the top layer of the page. Lighthouse audits the DOM it sees on page load. Sections hidden behind tabs, content under display: none, and states that appear after a click all slip past.

Contrast by the letter of the rule. It takes the computed style and calculates the ratio. Text over an image, shadows, blurred backgrounds, and the :hover and :focus states are blind spots, just as in WAVE.

How to build it in without doing harm

  • Lighthouse in CI as a lower threshold: "at least 90 in accessibility" instead of "must be 100"
  • Do not show the score on management dashboards as an accessibility KPI: that kills the point
  • For each PR with UI changes, a separate axe run, not just Lighthouse
  • Once a quarter, compare the list of rules Lighthouse checks with the current axe-core to understand what goes unchecked
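
A lower threshold can be expressed so that nothing in the pipeline rewards chasing 100. The sketch below assumes the Lighthouse JSON report, where the accessibility category score is a number from 0 to 1 (or null); the LighthouseReportSubset type, the checkAccessibilityFloor function, and the 0.9 floor are hypothetical names and values chosen for illustration.

```typescript
// Minimal subset of the Lighthouse JSON report: only the field read here.
interface LighthouseReportSubset {
  categories: { accessibility?: { score: number | null } };
}

// There is deliberately no "perfect" verdict: 0.90 and 1.00 are reported the same way.
type FloorVerdict = "fail" | "pass-floor";

function checkAccessibilityFloor(report: LighthouseReportSubset, floor = 0.9): FloorVerdict {
  const score = report.categories.accessibility?.score;
  if (score === undefined || score === null) {
    // A missing score means the audit did not run; treat it as a failure, not a pass.
    return "fail";
  }
  return score >= floor ? "pass-floor" : "fail";
}

console.log(checkAccessibilityFloor({ categories: { accessibility: { score: 0.92 } } }));
console.log(checkAccessibilityFloor({ categories: { accessibility: { score: 0.89 } } }));
```

Returning a verdict instead of the raw number is a design choice: the CI log says whether the floor holds and gives nobody a figure to put on a dashboard.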

Accessibility Insights: Closest to Manual Verification

A tool from Microsoft, built on the same axe-core, but with one important difference: FastPass mode plus guided, assisted manual checks. In other words, on top of automation it tells you what to test by hand and where to look.

Tab Stops visualizes the focus order right on the page. With arrows and numbers you can see where focus jumps in the wrong direction. This is no longer "the machine said OK"; it is a tool that helps the designer and the developer see the problem with their own eyes.

Where Accessibility Insights is Wrong

Learning curve. To benefit from the tool, you need to learn it. Teams that "just click audit" get exactly what they would get from axe. Tab Stops, Assessment, Needs Review: half of the users do not know these modes exist.

Assessment turns into a formality. A full WCAG assessment takes hours, and in real sprints it is run not for every feature but once per release. By the time a problem is found, the feature is already in production.

Tab Stops does not answer "why". It shows that focus jumps strangely. It does not show how to fix that: this is a question for the developer and the component architecture.

A routine for the design team

  • When designing a new component, draw a focus diagram right away: where Tab goes, where Shift+Tab goes, what Esc does
  • At the prototype review in the browser, turn on Tab Stops, walk through the flow visually, and screenshot the problem areas
  • Log the problems found not in the general backlog but in the component's card in the design system, so that the fix lives at the system level rather than on individual screens
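
A focus diagram can also be written down as data, so that the designer and the developer check the same thing. The sketch below assumes one specific case, a modal dialog that keeps focus inside itself while open and returns it to the trigger on Esc; DialogFocusSpec, nextFocusTarget, and the element ids are hypothetical names invented for the example.

```typescript
type FocusKey = "Tab" | "Shift+Tab" | "Escape";

interface DialogFocusSpec {
  triggerId: string; // element that opened the dialog
  focusOrder: string[]; // ids of focusable elements inside the dialog, in Tab order
}

// Returns the id that should hold focus after the key press.
function nextFocusTarget(spec: DialogFocusSpec, currentId: string, key: FocusKey): string {
  if (key === "Escape") {
    return spec.triggerId; // closing the dialog returns focus to the trigger
  }
  const order = spec.focusOrder;
  const index = order.indexOf(currentId);
  if (index === -1) {
    throw new Error(`"${currentId}" is not part of the dialog focus order`);
  }
  const step = key === "Tab" ? 1 : -1;
  // Wrap around so focus stays inside the dialog while it is open.
  return order[(index + step + order.length) % order.length];
}

// Made-up checkout dialog used as the example diagram.
const checkoutDialog: DialogFocusSpec = {
  triggerId: "open-checkout",
  focusOrder: ["close", "email", "confirm"],
};

console.log(nextFocusTarget(checkoutDialog, "confirm", "Tab")); // close
console.log(nextFocusTarget(checkoutDialog, "close", "Shift+Tab")); // confirm
console.log(nextFocusTarget(checkoutDialog, "email", "Escape")); // open-checkout
```

The same table can sit in the component card and serve as the expected result for a manual pass with Tab Stops turned on.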

Stark in Figma: Checking before it's too late to change anything

Stark is a plugin for Figma (also available for Sketch and the browser) that catches accessibility issues at the mockup stage: contrast, color blindness simulation, touch-target sizing, alt text generation. The main advantage is that it works where the designer actually lives, rather than as a separate step at the end.

Where Stark is wrong

A mockup is not an interface. Stark checks the contrast of a text layer against a background layer. In the live product that text will scroll over an image, appear in a dark theme, or sit in a disabled state. Stark does not see these contexts because they are not in the mockup.

Simulation is not experience. Running a design through a deuteranopia simulator is helpful, but it is still a sighted person's view through a filter. A real user with a color vision deficiency has lived with it all their life and compensates differently than we assume.

AI-generated alt text. A handy feature that generates a description from an image. And exactly the same trap as with any AI: the designer pastes the generated text without looking, and in production the cart icon ends up with alt "image of a black icon".

Designer's checklist before handing the mockup off to development

  • Contrast is checked for all states: default, hover, focus, disabled, error
  • The focus state is drawn explicitly, not "leave it to the browser"
  • Touch targets are at least 44×44 for all interactive elements on mobile
  • Icons without a caption have a text description in the component spec
  • The mockup has been run through a color blindness simulator: is the meaning preserved without color?
  • If the mockup has a dark theme, contrast is checked there as well

Questions for design review

  • What will a user who cannot tell red from green see on this validation form?
  • If you remove color, where is the user's attention drawn?
  • This gray placeholder: does it really pass contrast, or are we fooling ourselves?
  • Does the error state differ from the normal one only by the border color?

How to put it into the workflow

Five tools are not five checkboxes in a DoD. They are different checkpoints at different stages.

  • In Figma: Stark, before the mockup goes to development
  • In the PR: axe in CI as a mandatory gate, Lighthouse as a soft threshold
  • At manual feature testing: Accessibility Insights with Tab Stops, WAVE for a quick review of structure
  • Once per release: a walkthrough of the critical flow by a real person with a keyboard and a screen reader

The biggest mistake teams make is choosing one tool and thinking they have closed the issue. Automation catches formal violations, nothing more. Everything that has to do with meaning (where focus goes, what the screen reader announces, whether the icon is clear) remains manual work. And that work needs to be built into the process as systematically as a linter.

What to do with AI component generation

A separate story is when part of the UI is assembled not by hand but through an AI assistant: Figma Make, v0, Cursor with an MCP connection to Figma, component generation from a text description. Accessibility in such pipelines breaks quietly and in the same ways every time.

Typical failures of AI generation

  • A button is rendered as a <div> with a click handler: visually indistinguishable, but for a screen reader it does not exist
  • A modal without role="dialog" and without a focus trap, closable only by clicking the X with a mouse
  • A form without <label>, with a placeholder instead of a caption, because "it's shorter that way" in the mockup
  • An icon button without aria-label: the model generated the SVG and forgot the text name
  • Contrast "as in the design", but the design was a light theme and the dark one was generated automatically

What to build into the process if AI writes part of the code

  • The component generation prompt states accessibility requirements explicitly: semantic tags, focus states, ARIA roles where they are needed
  • Every AI-generated component passes axe and a manual Tab walkthrough before it enters the design system
  • A ban on "AI writes alt and aria-label" without review: it is the same trap as with Stark, only in code
  • In the component library, accessibility is enforced as part of the API: if IconButton has no label in its props, it does not build
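
The last point can be sketched with plain types. This is a framework-free illustration under assumptions, not the API of any particular component library: IconButtonProps, renderIconButton, and escapeAttribute are hypothetical names, and the component returns an HTML string only to keep the example self-contained.

```typescript
interface IconButtonProps {
  icon: string; // hypothetical icon identifier
  label: string; // required: the accessible name a screen reader announces
}

function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function renderIconButton(props: IconButtonProps): string {
  const label = props.label.trim();
  if (label.length === 0) {
    // The type only guarantees a string; an empty one is rejected at runtime.
    throw new Error("IconButton requires a non-empty label");
  }
  return (
    `<button type="button" aria-label="${escapeAttribute(label)}">` +
    `<svg aria-hidden="true" focusable="false"><use href="#${escapeAttribute(props.icon)}"></use></svg>` +
    `</button>`
  );
}

// renderIconButton({ icon: "cart" }); // does not compile: property 'label' is missing
const cartButtonHtml = renderIconButton({ icon: "cart", label: "Open cart" });
console.log(cartButtonHtml);
```

The compiler catches a missing label and the runtime check catches an empty one. Neither can tell whether "Open cart" is the right wording, so the review of the text itself stays with a person.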

Scenarios that are hardest to catch

Automation and plugins capture a static DOM in a single state. Anything dynamic or contextual requires a deliberate test.

Checklist for hidden failures

  • Toasts and snackbars: are they announced by the screen reader, and do they stay long enough to be heard
  • Modals: does focus return to the trigger button after closing
  • Infinite scroll: is there a keyboard alternative, and is the position preserved on return
  • Custom dropdowns and comboboxes: do arrows, Enter, Esc, and Home/End work as expected
  • Animations: is prefers-reduced-motion respected, and does anything flash more than three times a second
  • Dark theme: it is not "inverted light", so contrast and focus styles are tested separately
  • Loading states: what does the screen reader hear while the spinner spins

Anti-Patterns That Pass All Auto Tests

  • A button with aria-label="button": formally valid, meaningless in practice
  • alt="" on an illustration that carries meaning: axe stays silent, the user loses context
  • 4.5:1 contrast for text laid over video: a single frame passes, in motion it does not
  • Focus is visible, but it is the default blue outline on the brand's blue background: technically present, practically absent
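
Part of the first anti-pattern can be flagged for human review with a crude heuristic. The sketch below is an assumption-laden illustration: GENERIC_NAMES, TECHNICAL_PATTERN, and isSuspiciousAccessibleName are hypothetical, the word list is made up, and a "not suspicious" result says nothing about whether the text is actually helpful.

```typescript
// Hypothetical deny-list; extend it with whatever turns up in your own reviews.
const GENERIC_NAMES: ReadonlySet<string> = new Set([
  "image",
  "picture",
  "photo",
  "icon",
  "button",
  "link",
  "кнопка", // Russian for "button"
]);

// Looks like a class name ("button-primary-2") or a file name ("hero.png")
// rather than a phrase written for a person.
const TECHNICAL_PATTERN = /^[a-z0-9]+([-_][a-z0-9]+){2,}$|\.(png|jpe?g|svg|gif|webp)$/i;

function isSuspiciousAccessibleName(accessibleName: string): boolean {
  const normalized = accessibleName.trim().toLowerCase();
  if (normalized.length === 0) {
    // Empty alt is valid for decorative images; whether the image really is
    // decorative has to be judged by a person.
    return false;
  }
  return GENERIC_NAMES.has(normalized) || TECHNICAL_PATTERN.test(normalized);
}

console.log(isSuspiciousAccessibleName("button-primary-2")); // true
console.log(isSuspiciousAccessibleName("Confirm order")); // false
```

A check like this only narrows down where a reviewer should look. A label such as "image of a black icon" passes it untouched, which is exactly the gap between formal validity and meaning.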

How to explain this to your team and business

The main problem is not the tools, but that accessibility is perceived as "extra work at the end". Below are a few moves that work in real teams.

Translate it into product language

  • Not "WCAG 2.2 AA", but "35 percent of our users navigate with the keyboard at least once, and right now we are losing them at step X"
  • Not "contrast doesn't pass", but "this text is unreadable on a phone in sunlight: we checked it outdoors"
  • Not "we need semantics", but "voice assistants and in-page search won't find this button"

Build it into the DoD, not into a separate accessibility sprint

  • A design system component counts as done when it has a focus state, its keyboard shortcuts are documented, and contrast is tested in all themes
  • The PR template has an accessibility line: not an "I checked" checkbox, but specifics such as "tabbed through, checked with a screen reader, axe is green"
  • Accessibility bugs are filed with the same priority as functional ones; otherwise they will always be "later"

Questions for feature review

  • If the user has no mouse, can they complete the scenario?
  • If you turn off sound and color, what information is lost?
  • What will the screen reader announce on this screen in the first five seconds?
  • Where is the bottleneck here, and why did we decide not to fix it now?

Section summary

The tools do not make a product accessible. They show where exactly it is broken. The rest is the team's decision: which contexts we check, whom we explain it to, where we catch problems. Automation is the lower bar. Real accessibility begins with the question "how do you use this without what we assumed by default?"

What follows is what turns "the tools showed red/green" into a real team habit: short checklists, process pitfalls, and questions worth asking before a feature ships.

Checklist before merge

Not a "full accessibility audit", but a minimum that can realistically be done in 10-15 minutes for any feature.

Before PR

  • Walked through the whole flow with the keyboard alone, from beginning to end: Tab, Shift+