I’m a Professional Fact-Checker: AI Makes More Mistakes Than You Realize

Almost half of Americans say they use AI to find information and generate ideas. It’s easy to understand why. As social media deteriorates into chaos, and Google becomes little more than a glorified directory for Reddit threads and content farms, many of us crave something dependable. Moreover, chatbots are undeniably useful, right? The first time I used one, I asked whether it knew it was a significant drain on resources. Half an hour later, I had a new recipe for vegan cream cheese.
I never ended up trying the recipe. Instead, I discovered a human-created version that the LLM might have borrowed from. That’s essentially how these models operate. They repurpose shared knowledge into something that seems customized for you. This may work fine for alternatives to dairy (unless you happen to be a vegan blogger). However, when it comes to global matters and truth—the key focus of my role as a fact-checker at WIRED—the stakes escalate dramatically.
In the past year, a growing number of people have looked at me with considerable pity. Surely, a fact-checker at a magazine cannot survive in this AI-enhanced landscape. Call me naive, but I’m not particularly concerned. I’ve concluded that very little of humanity’s shared knowledge exists online. Furthermore, my findings suggest that AI is even more inaccurate than most people realize.
Tom Wolfe apparently viewed fact-checkers, according to writer Colin Dickey, as a “cabal of women and middling editors collaborating to scrutinize and undermine the prose of the Great Writer.” Not a terrible characterization, although many of my colleagues and my boss are men. What can I say? Our duty, unlike AI’s, is to be irritating.
WIRED’s fact-checking team adheres to traditional methods: careful line-by-line annotations, primary sources whenever possible, and extensive ethical and legal reviews. We challenge fundamental assumptions, seek new or contradictory information, and contact people directly, all in the service of accuracy. It’s a rapid form of peer review that strives to match the pace of news reporting.
From what I can discern, AI hasn’t targeted this approach yet. What it has focused on is “post hoc” fact-checking, the Snopes-style examination of the factuality of claims after they arise. In the UK, a project named Full Fact has developed its own AI tools to help combat misinformation. These tools, used in over 40 countries, process vast amounts of data, from social media posts to podcast transcripts, then identify specific claims that humans can investigate further. “You definitely need a human being,” says Mark Frankel, Full Fact’s head of public affairs.
The reasoning is straightforward: AI continues to make errors. As a fact-checker, I would love to pinpoint exactly how often that occurs. That turns out to be difficult. Since 2018, nearly 17,000 papers about LLMs have appeared on arXiv, many specifically addressing their reliability. Still, it’s worth trying to establish a working figure.
In any article that comes to WIRED’s fact-checking desk, there’s usually a considerable amount of “b-matter”: statistics, news events, quotations, anything that helps provide context. Fact-checkers typically Google this basic information, and that process, by way of the search engine’s all-too-familiar AI Overviews, is my main interaction with AI. In my professional opinion, it’s inaccurate about a third of the time.
This might be a generous estimate, though. A March 2025 study from the Tow Center for Digital Journalism indicated that over 60 percent of responses from AI-driven search engines were incorrect. A BBC study puts the inaccuracy rate of chatbots closer to 45 percent, a figure I see cited more frequently. To put it bluntly: AI could be wrong nearly half of the time.
