"We have AI captions" is the 2026 version of "we have a wheelchair ramp around back." Technically there. Functionally not. Legally questionable. Vibes-wise? Insulting.
Let's get into it.
The grift, freshly repackaged
Remember accessibility overlays? Those magical one-line-of-code widgets that promised to "make your site WCAG compliant" while disabled users screamed into the void that they were making things worse? Class action lawsuits are stacking up. Over 22% of all digital accessibility lawsuits in the first half of 2025 specifically targeted sites using overlays.
The disability community spent five years explaining, patiently and then less patiently, that you cannot solve civil rights with a JavaScript snippet.
Companies heard them. And then said, "Cool, what if we used AI instead?" Welcome to overlay 2.0. Same scam, shinier wrapper, infinitely more confident.
What AI captions actually do
Let's talk about what's happening when your platform proudly slaps "AI-generated captions" on a video and calls it accessibility.
AI captions miss names. They butcher technical terms. They drop entire sentences when audio gets complicated. They erase accents, rendering speakers unintelligible because the model wasn't trained on how they speak. They confidently hallucinate words nobody said.
That last one is the kicker. AI doesn't say "I didn't catch that." AI says "the speaker definitely said 'cardiovascular treatment'" when the speaker actually said "carcinoma treatment," and a Deaf viewer just got medical misinformation delivered with the smooth confidence of a TED talk.
This is not "almost the same experience." This is a different experience that happens to be wrong.
The alt text situation is somehow worse
If AI captions are a bad translator, AI alt text is a bad translator who's also lying to you.
Blind and low-vision users rely on alt text to know what's in an image. When that alt text is auto-generated by a model that's pattern-matching on visual features, you don't get "almost" the right description. You get fiction. You get a confident sentence about a "smiling woman in a blue dress" when the image is a chart. You get "a dog sitting on grass" when the image is a screenshot of a medical test result.
Sighted people don't notice because sighted people aren't reading the alt text. Blind users notice immediately, because they're being fed lies in a polite robotic voice and then expected to navigate the world based on those lies.
Companies deploying this and calling it accessibility are not making their products usable. They're making their products deceptive.
"But it's better than nothing"
Is it though?
This is the line every accessibility shortcut hides behind, and it deserves to be retired with prejudice. "Better than nothing" assumes the alternative is the void. The actual alternative is hiring a human captioner, paying a content creator to write real alt text, or building accessibility into your workflow so it isn't an afterthought.
"Better than nothing" also assumes wrong information is closer to right information than no information. It isn't. A Deaf user who sees no captions knows there are no captions. A Deaf user who sees AI captions thinks they know what was said. One of those users can ask a friend, request a transcript, or skip the content. The other one walks away with bad data and zero idea they have it.
That's not access. That's a trapdoor with a bow on it.
The pattern, in case you missed it
Every few years, the tech industry invents a new way to not do accessibility while claiming credit for doing it.
Early 2010s: "Our site works fine, just use the keyboard." (It didn't.)
Late 2010s: "We have an accessibility statement." (Linking to a 404.)
Early 2020s: "We installed an overlay widget." (Currently being sued.)
2026: "We have AI captions and AI alt text." (Hallucinating in real time.)
Notice the pattern? Every single one is a way to redirect responsibility away from the people building the product and onto a tool, a third party, or a vibe. Every single one fails the actual user. Every single one is marketed primarily to non-disabled decision-makers who will never personally encounter the failure.
Disabled people aren't fooled. They've never been fooled and have been telling you for years.
What you actually have to do
You knew this part was coming. Here it is:
AI is a tool. It is not a replacement. If you want to use AI to generate a first draft of captions that a human editor reviews, fixes, and signs off on — congratulations, you've built a workflow. If you want AI to flag images that need alt text and surface candidate descriptions for a human to verify — great, you've sped up a process.
If you want AI to generate captions and alt text that ship straight to users with no human in the loop, what you've built is a liability. You've also built something the disability community will roast you for, your competitors will out-accessible you on, and regulators will increasingly fine you for. The European Accessibility Act took effect in June 2025. The DOJ Title II rule is coming, delay or no delay. The FTC has already shown it will go after vendors making bogus accessibility claims.
The era of "we deployed AI, problem solved" is ending the same way the overlay era is ending: in court.
The checklist you actually wanted
Here's what good captions look like, whether you're using AI or not.
The non-negotiables (every caption, every time)
Speaker identification when more than one person talks
Verbatim accuracy on names, numbers, technical terms, and medical language. These are the things AI gets wrong most often and the things users need most
Sound effects and non-speech audio noted in brackets, like [door slams] or [ominous music swells]
Punctuation that reflects how the sentence actually sounds; questions get question marks, pauses get commas, interruptions get em-dashes
Reading speed no faster than 160 to 180 words per minute on screen; if your captions blow past that, viewers literally cannot read them in time
Two lines max per caption frame, three if you absolutely must
Captions positioned to never cover faces, lower-third graphics, or important visual information
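Two of the items above are pure arithmetic (reading speed and line count), so they can be checked by a script before a human ever opens the file. Here is a minimal Python sketch of that idea. The Cue shape, the function names, and the thresholds are assumptions made up for illustration, not any captioning tool's API; the 180 wpm ceiling is simply the top of the range above.

```python
from dataclasses import dataclass

MAX_WPM = 180   # assumed ceiling: top of the 160-180 wpm range
MAX_LINES = 3   # two lines preferred; three is the hard limit


@dataclass
class Cue:
    """One caption frame (hypothetical shape, not tied to a file format)."""
    start: float  # seconds
    end: float    # seconds
    text: str     # caption text, lines separated by "\n"


def words_per_minute(cue: Cue) -> float:
    """Reading speed a viewer needs to finish this cue while it is on screen."""
    duration = cue.end - cue.start
    if duration <= 0:
        return float("inf")  # a zero-length cue can never be read
    return len(cue.text.split()) / duration * 60


def lint_cue(cue: Cue) -> list[str]:
    """Return human-readable problems for one cue; empty list means it passes."""
    problems = []
    wpm = words_per_minute(cue)
    if wpm > MAX_WPM:
        problems.append(f"too fast: {wpm:.0f} wpm")
    line_count = len(cue.text.splitlines())
    if line_count > MAX_LINES:
        problems.append(f"too many lines: {line_count}")
    return problems
```

A check like this only catches the mechanical failures. It has no idea whether the words are right; that part still takes a human.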
If you're captioning without AI
Use a real captioning tool, not the YouTube auto-caption editor pretending it's one
Watch the video at full speed once before you start; context matters
Caption sound that conveys meaning, not every ambient noise — [laughter] yes, [refrigerator humming] probably no
Have a second person review before publishing; you will miss things you said yourself
For anything medical, legal, technical, or featuring multiple speakers, hire a professional captioner; this is not the place to save $200
If you're using AI as a tool (the only acceptable way)
Run the AI pass first, then have a human edit before anything ships
Budget real time for the human review — rule of thumb is 4-5x the video's runtime for the first edit
Pay special attention to: proper nouns, technical terminology, accented speech, overlapping dialogue, and anything quiet or muffled; these are AI's known failure modes
Never trust AI's confidence; it will deliver wrong words with the same certainty as right ones
Spot-check the final captions against the audio as a viewer would experience them, not as a reviewer reading a transcript
Document your review process; if you ever face a complaint, "we had a human verify" is a defense; "the AI did it" is not
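The workflow above boils down to three things: budget the review time, block publishing until a human signs off, and keep a record of who reviewed what. A rough Python sketch of that gate follows. Every name in it (CaptionTrack, sign_off, can_publish) is hypothetical, and a real system would hang this off whatever publishing pipeline you already have.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CaptionTrack:
    """A caption file plus its review state (hypothetical structure)."""
    video_id: str
    runtime_minutes: float
    ai_generated: bool = True
    reviewed_by: Optional[str] = None
    reviewed_at: Optional[datetime] = None
    review_log: list = field(default_factory=list)


def review_budget_minutes(track: CaptionTrack) -> tuple:
    """First-edit estimate using the 4-5x runtime rule of thumb."""
    return (track.runtime_minutes * 4, track.runtime_minutes * 5)


def sign_off(track: CaptionTrack, reviewer: str, notes: str) -> None:
    """Record that a named human reviewed the track, with a timestamped note."""
    track.reviewed_by = reviewer
    track.reviewed_at = datetime.now(timezone.utc)
    track.review_log.append(
        f"{track.reviewed_at.isoformat()} {reviewer}: {notes}"
    )


def can_publish(track: CaptionTrack) -> bool:
    """AI drafts never ship without a human sign-off."""
    return (not track.ai_generated) or track.reviewed_by is not None
```

The review_log list is the documentation step in miniature: if a complaint ever arrives, it shows who verified the captions and when.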
Red flags that mean stop and start over
Captions out of sync by more than a second or two
Names spelled differently in different captions
Whole sentences that read like word salad
Anywhere the AI "filled in" silence with hallucinated speech
Technical terms that sound vaguely right but aren't ("hyper-tension" vs. "hypertension," "carcinoma" vs. "cardiovascular")
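Some of these red flags can be surfaced automatically so the human reviewer knows where to look first. As one example, here is a small Python sketch that flags capitalized words spelled almost, but not quite, the same way across captions. It uses the standard library's difflib; the function names and the 0.8 similarity threshold are assumptions chosen for illustration, and the capitalized-word pattern is a crude stand-in for real name detection, so expect false positives.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def capitalized_words(captions):
    """Collect capitalized words as a rough proxy for names."""
    words = set()
    for text in captions:
        words.update(re.findall(r"\b[A-Z][a-z]+\b", text))
    return words


def suspicious_name_pairs(captions, threshold=0.8):
    """Return pairs of distinct capitalized words that look like variant spellings."""
    names = sorted(capitalized_words(captions))
    pairs = []
    for a, b in combinations(names, 2):
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((a, b))
    return pairs
```

Anything this returns is a question for a person, not a verdict: the script can tell you two spellings disagree, but only someone checking the audio or the speaker list knows which one is correct.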
The one-question test
Before you publish, ask: would a Deaf viewer understand this video as well as a hearing viewer? If the answer is anything other than yes, the captions aren't done. "Mostly" doesn't pass. "Better than nothing" doesn't pass. Yes is the only passing grade.
The closing thought
Accessibility isn't a feature you bolt on. It isn't a problem you outsource to a model. It isn't a checkbox you tick by paying $49/month for the kind of widget that showed up in 22.6% of digital accessibility lawsuits in the first half of 2025.
Accessibility is what happens when you decide disabled users are users: not edge cases, not afterthoughts, not optics for your DEI report.
If your accessibility strategy is "we plugged in an AI," your strategy is a wheelchair ramp around back. Technically there. Functionally not. And everyone who actually needs it knows.
Let's do better.