AI Studies that misled us in 2025

Everyone’s worried about AI slop, AI hallucinations, and misinformation.

But the slop and misinformation that actually moved markets and changed business decisions this year came from peer-reviewed journals, MIT and Harvard researchers, and books from major publishers with fact-checking teams.

Consider: a “New York Times Bestselling” book on AI’s environmental impact – published by Penguin Random House, the world’s largest publisher, written by an MIT-trained engineer, favourably reviewed by The New York Times and The Economist, fact-checked by a team thanked in the acknowledgements – claimed a single data centre would use 1,000 times more water than a city of 88,000 people.

The actual figure? About 0.22 times as much.

The claim was off by a factor of roughly 4,500.

Nobody caught it until a data scientist with a Substack did the maths.

That’s the world we’re in. Not AI making things up – humans at prestigious institutions making things up, and the entire apparatus of editorial review nodding along.

Earlier this year, a client came to me with concerns about their AI training programme.

They’d read an MIT study about how ChatGPT causes “cognitive atrophy” and weren’t sure they wanted to risk their team’s critical thinking skills.

The study came from MIT, so it seemed credible. We talked it through, and ultimately they went ahead with the training. But the fact that a flawed study from a prestigious institution created that hesitation at all – that’s a problem.

I spent some time tracking down the sources on some of the most shared AI studies of the year.

There is a credibility crisis.

Data fabrication, poor methodology, and headline-chasing conclusions have made it nearly impossible to know what’s real.

This isn’t AI at fault. This is good old-fashioned human error at best, or deliberate agenda-setting at worst.

Here’s what actually happened with five publications that shaped the AI conversation this year.

1. MIT: “Miracle Productivity” Fabricated Results

The claim: AI made scientists 44% more productive overnight.

The study: “Artificial Intelligence, Scientific Discovery, and Product Innovation”

Source: MIT Department of Economics

Date: Published late 2024; Retracted May 2025

Link: MIT Economics Statement on Retraction

This paper claimed to track over 1,000 materials scientists at a major R&D company, finding that those given access to AI tools boosted their patent output and discovery rates almost instantly. It was cited by financial outlets for months. I saw it referenced in board presentations and investment pitches.

The Reality Check:

An internal MIT investigation concluded the data had been fabricated. The results were statistically impossible – in real-world behavioural studies, data is noisy. This study showed perfect, linear productivity gains with almost zero variance. The paper was withdrawn from consideration at The Quarterly Journal of Economics, and MIT issued a statement of “no confidence” in the findings:

“While student privacy laws and MIT policy prohibit the disclosure of the outcome of this review, we are writing to inform you that MIT has no confidence in the provenance, reliability or validity of the data and has no confidence in the veracity of the research contained in the paper.” – MIT statement, May 16th, 2025

Why this matters: Businesses make real decisions based on fake numbers. If you promised your leadership team 44% productivity gains from AI tools, you were working from fiction.

Personally, I have seen far more than 44% productivity gains on specific individual tasks – but nothing on that scale sustained over a working month. I wrote more about the realities of AI, productivity and ROI here.

2. MIT: “Your Brain on ChatGPT” Paper

The claim: Using AI tools like ChatGPT weakens your brain and destroys critical thinking.

The study: “Your Brain on ChatGPT”

Source: MIT Media Lab

Date: Late 2024 / Early 2025

Link: Arxiv: Your Brain on ChatGPT

Researchers monitored students writing essays under three conditions: unaided, using Google, and using GPT-4. They reported that AI users showed 55% lower cognitive engagement and produced work that human graders described as “soulless.”

The lead researcher explicitly warned that reliance on these tools leads to “cognitive atrophy.”

The Reality Check:

Misleading Terminology: The paper coins the term “Cognitive Debt” but frames it alongside “cognitive atrophy” – a serious clinical condition implying tissue damage.

Wrong Tool for the Diagnosis: The study used EEG sensors, which measure electrical activity, not physical brain structure. They claimed “weaker neural connectivity” equates to a deficit, but EEG cannot detect the physical tissue degeneration seen in dementia.

Subjective Metrics: The viral “soullessness” rating was a descriptive quote from human graders, not a reproducible scientific metric.

Tiny Sample Size: The “55% lower engagement” statistic came from a cohort of just 18 people using ChatGPT – far too small for the sweeping neuroscientific generalisations that followed.

Why this matters: It shifted public debate away from real issues and real AI harms – like dependency and attention control – toward medicalised alarmism that the data simply didn’t support. Businesses delayed AI adoption based on fear rather than evidence.

3. Penguin Random House: The “Thirsty AI” Water Myth

The claim: AI is draining the world’s drinking water. Every ChatGPT query consumes half a bottle.

The sources: Li et al. (UC Riverside); data misinterpreted in the book Empire of AI (2025)

Links: Andy Masley’s Debunking | Original Study (Li et al.)

This is where the institutional failure gets absurd.

Empire of AI was one of the most talked-about tech books of the year. Published by Penguin Random House – the largest publisher in the world. The author has a mechanical engineering degree from MIT with a minor in energy studies. She was a senior AI editor at MIT Technology Review. The book was recommended by Time, The New York Times, The New Yorker, The Economist, and the Financial Times. She even thanks a large fact-checking team in the acknowledgements.

The book claimed that a Google data centre in Chile would use “more than one thousand times the amount of water consumed by the entire population of Cerrillos, roughly eighty-eight thousand residents, over the course of a year.”

Think about that claim for a moment. A single building using 1,000 times more water than 88,000 people. That would mean one data centre consuming more water than a city of 88 million – more than four times the entire population of Chile.

The Reality Check:

Data scientist Andy Masley did the maths. The book’s figure implies each resident of Cerrillos uses 0.2 litres of water per day. That’s a fifth of a water bottle. Adults need 2-4 litres just to stay alive. The average Chilean uses around 180 litres per day from municipal water systems.

What appears to have happened: the source reported water usage in cubic metres. The book recorded it in litres. That’s a 1,000x error built into the central water claim. The actual data centre would have used about 3% of the municipal water system – significant, but ~4,500 times less alarming than reported.
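
The claim falls apart under a back-of-the-envelope check. Here’s a short Python sketch using only the round figures quoted above – my own illustrative numbers, not the book’s or the study’s raw data:

```python
# Back-of-the-envelope sanity check using only the round figures quoted above.
# Illustrative numbers, not the book's or the study's raw data.

LITRES_PER_CUBIC_METRE = 1_000          # the unit mix-up at the heart of the error

population = 88_000                     # residents of Cerrillos
litres_per_person_per_day = 180         # typical municipal usage cited above

city_litres_per_year = population * litres_per_person_per_day * 365
print(f"City: ~{city_litres_per_year / 1e9:.1f} billion litres/year")

# The book's claim: one data centre = 1,000x the city's annual consumption.
claimed_litres = 1_000 * city_litres_per_year
print(f"Claimed data centre: ~{claimed_litres / 1e12:.1f} trillion litres/year")

# The same claim expressed as people - the 'city of 88 million' check.
print(f"Equivalent to a city of {1_000 * population:,} people")

# A cubic-metres figure misread as litres inflates everything by exactly this much.
print(f"m3 read as litres = {LITRES_PER_CUBIC_METRE}x overstatement")

# Gap between the claimed ratio (1,000x the city) and the actual one (~0.22x).
print(f"Error factor: ~{1_000 / 0.22:,.0f}x")
```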

When Masley contacted the author, she acknowledged the numbers “initially seemed strange” to her team. They asked for clarification, didn’t get it, and published anyway.

Look, mistakes happen in books. I get it. But this wasn’t a typo or a minor slip – it was a 4,500x error in a central claim, and it passed through a fact-checking team, an editor at the world’s largest publisher, and reviewers at every major publication without anyone noticing.

Nearly 1,000 Amazon reviews. And a solo blogger was the first person to notice that no building on Earth uses 1,000 times as much water as a city.

The “half a bottle per query” claim has similar problems. The viral statistic confused withdrawal (water cycled through cooling and returned) with consumption (water actually lost). The error overstated impact by roughly 4,300x. The actual water cost of a query is about 5-10 ml – a single sip.
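
To make the withdrawal/consumption distinction concrete, here’s a toy sketch. The numbers are made up purely to show the mechanism – they are not the study’s figures – but they illustrate how quoting withdrawal per query instead of consumption per query turns a sip into half a bottle:

```python
# Toy numbers, invented for illustration - not the Li et al. figures.
queries = 1_000_000
withdrawn_litres = 250_000.0   # water cycled through cooling and returned to source
consumed_litres = 7_500.0      # water actually lost (mostly to evaporation)

ml_per_query_withdrawal = withdrawn_litres / queries * 1_000
ml_per_query_consumption = consumed_litres / queries * 1_000

print(f"{ml_per_query_withdrawal:.0f} ml/query if you quote withdrawal")    # 250 ml - "half a bottle"
print(f"{ml_per_query_consumption:.1f} ml/query if you quote consumption")  # 7.5 ml - "a single sip"
```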

Why this matters: If you’ve been factoring water consumption into your AI sustainability calculations, your numbers are possibly wrong by three orders of magnitude (1,000x).

AI does have environmental costs. But decisions based on numbers this wrong aren’t decisions, they’re theatre.

4. Apple: The “AI Can’t Do Reasoning” Paper

The claim: AI cannot do maths or logic – it relies entirely on memorised patterns and fails simple riddles.

The studies:

Apple Machine Learning Research’s “GSM-Symbolic” paper (October 2024) showed that slightly changing names or numbers in maths problems caused AI performance to collapse.

A separate paper from MIT, Harvard, and UChicago researchers used trick questions to argue LLMs rely on “syntactic templates” rather than true logic. They borrow the metaphor of a “Potemkin village” – a facade built to look real but hiding emptiness.

The idea: LLMs can look like they understand a concept (pass an exam, define it accurately), yet behind the scenes lack true conceptual grasp.

Links: Apple Paper | Potemkin Understanding Paper

The Reality Check:

Both studies identified something real: AI reasoning is fragile. Language Models can be tripped up by surface-level changes that wouldn’t fool a human.

But the conclusions overreached. Both studies restricted how models could respond – forcing snap judgements rather than allowing the step-by-step “thinking” that modern reasoning models use. When models are given enough tokens to work through problems (Chain of Thought), the fragility often disappears. Testing reflexes isn’t the same as testing reasoning capacity.

If I gave you a pen and paper and asked you to “calculate 1,534,315 × 404.496”, you probably could, given sufficient time. If I took away the pen and paper and asked you to answer me in 5 seconds, does that mean you can’t do maths?
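
To make that concrete, here’s a minimal sketch of the difference between testing a snap answer and testing reasoning capacity. It assumes the OpenAI Python SDK and a placeholder model name – the point is the output budget and the instruction, not the vendor:

```python
from openai import OpenAI

client = OpenAI()       # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4o-mini"   # placeholder; any chat model illustrates the point

question = "Calculate 1,534,315 x 404.496."

# "Snap answer": a tiny output budget and an answer-only instruction,
# roughly what a benchmark gets if it forbids working through the problem.
snap = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": question + " Reply with the number only."}],
    max_tokens=16,
)

# "Pen and paper": room to work step by step before committing to an answer.
worked = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": question + " Work through it step by step, then give the final answer."}],
    max_tokens=1000,
)

print("Snap answer:   ", snap.choices[0].message.content)
print("Worked answer: ", worked.choices[0].message.content)
```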

Why this matters: “AI reasoning is brittle” and “AI cannot reason” are different claims requiring different responses. If you dismissed AI for analytical work based on these papers, you made that call on incomplete evidence.

My other point here is that the “A” in “AI” stands for “Artificial”. It’s fair to say that these systems aren’t truly intelligent or reasoning in the same way humans are – but if they have the appearance of true intelligence, how much does that matter?

5. Nature: “AI Model Collapse” Warning

The claim: AI will degrade and “eat itself” if trained on AI-generated data.

The study: “AI models collapse when trained on recursively generated data”

Source: Nature (Ilia Shumailov et al., Oxford/Cambridge/Toronto)

Date: July 2024

Link: Read the paper in Nature

The Reality Check:

The peer-reviewed paper showed that when a model is trained only on its own raw output, it loses quality fast.

That’s 100% true – and completely unlike how AI companies actually train models.

Real-world training mixes synthetic data with human-written data at controlled ratios and curates it for quality. Under those conditions, the collapse the paper warned about is entirely avoidable.
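
As a rough sketch of what that curation looks like (the quality threshold, ratio, and function names here are my own placeholders, not any lab’s actual pipeline):

```python
import random

def curate_training_mix(real_docs, synthetic_docs, quality_score,
                        min_quality=0.7, synthetic_fraction=0.2, seed=0):
    """Toy illustration of synthetic-data curation: quality-filter the
    synthetic documents, then cap them at a fixed fraction of the final mix."""
    rng = random.Random(seed)

    # 1. Keep only synthetic documents that pass a quality filter.
    filtered = [doc for doc in synthetic_docs if quality_score(doc) >= min_quality]

    # 2. Cap synthetic data so it never exceeds `synthetic_fraction` of the mix.
    max_synthetic = int(len(real_docs) * synthetic_fraction / (1 - synthetic_fraction))
    filtered = rng.sample(filtered, min(max_synthetic, len(filtered)))

    # 3. Blend and shuffle - human-written data remains the bulk of training.
    mix = real_docs + filtered
    rng.shuffle(mix)
    return mix
```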

Why this matters: This is a valid theoretical concern for AI researchers to monitor, but absolutely not an imminent crisis for businesses to panic about. We don’t complain when a photocopy of a photocopy of a photocopy doesn’t come out right – we just look for the original.

What This All Means for Your Business

If you believed the cognitive atrophy study, you might have banned AI tools and fallen behind competitors who didn’t.

If you believed the productivity study, you might have promised your board results that were literally made up.

If you believed the water figures, your sustainability reporting is off by orders of magnitude.

If you believed the reasoning papers without reading the critiques, you might have dismissed AI for work it can actually do well.

The cost of bad information isn’t confusion. It’s delayed decisions, missed opportunities, and resources allocated to problems that don’t exist at the scale reported.

What to Watch For in 2026

Three red flags when you see AI research making headlines:

Medical metaphors without clinical evidence. “Atrophy,” “addiction,” “damage” – these terms carry weight. Check whether the study actually measured what the language implies.

Restricted compute. Did the study let the model think step-by-step, or force a snap answer? This changes what the results mean entirely.

Unit confusion. Withdrawal vs. consumption. Permitted maximum vs. actual usage. Cubic metres vs. litres. The difference between alarming and mundane is often a unit conversion someone didn’t check.

My View

We spent 2025 worrying about AI slop, misinformation, and hallucinations while peer-reviewed journals published fabricated data, prestigious researchers used clinical terms without clinical evidence, and a fact-checked NYT bestselling book from the world’s largest publisher got a central claim wrong by a factor of 4,500.

The problem isn’t AI vs. traditional sources. The problem is that verification has always been necessary, and the institutions we trusted to do it for us aren’t reliable.

When anyone sends me a viral AI study now, I ask four questions before I read past the abstract:

  1. Who collected the data?
  2. What were the constraints?
  3. What’s the sample size?
  4. Do the numbers pass a basic sanity check?

These questions take five minutes.

They would have stopped boards from expecting 44% productivity gains that were fabricated.

They would have caught a claim that implied a single building uses more water than a city of 88 million people. They would have saved a client from hesitating over a study that proved nothing about cognitive atrophy.

Look, I’m not saying AI is amazing.

LLMs make things up (that’s how they work, literally).

Generative AI – especially video – uses a lot of electricity and water (but as you’ve seen, 4,500 times less than many now think).

There are going to be massive job losses due to AI, and it’s already having a real detrimental impact on education that we need to resolve.

And I’m not saying ignore academic research papers. What I’m saying is the same scrutiny we’re told to apply to AI-generated content should apply to everything – including the sources warning us about AI.

Let’s have a bit more nuance in the AI conversation. Less hype. Less throwing the baby out with 4,500x the bathwater.

I’d love to hear your thoughts on this – leave a comment below, and if you like the post, please share it with others.

Written by Alastair McDermott

I help leadership teams adopt AI the right way: people first, numbers second. I move you beyond the hype, designing and deploying practical systems that automate busywork – the dull bits – so humans can focus on the high-value work only they can do.

The result is measurable capacity: cutting processing times by 92% and unlocking €55,000 per month in extra productivity.
