Boring AI research breakthroughs will change your work (but won’t make headlines)
I’ve been reading AI research papers for months. Most cover incremental improvements that won’t matter for ages.
But when I step back and look at the patterns, something becomes clear.
Researchers are solving the boring, practical problems that determine whether AI actually helps with real work. Things like reading long documents without missing key details, not making stuff up, and doing exactly what you ask it to do.
These aren’t flashy capabilities that look good in demos. They’re the fundamentals that determine whether AI saves you time or creates more work.
The problems researchers are fixing
Context windows (long documents that actually work)
Here’s a secret about AI: many models claim they can handle massive documents, but in reality they ignore huge chunks of what you give them.
If you hand someone a 200-page report and ask them to summarise it under time pressure, they’ll skim the first few pages, glance at the middle, maybe check the conclusion, then write a summary. That’s essentially what many current AI models do with long documents.
Researchers have been tackling this from multiple angles. Liu et al. changed how models learn to pay attention to earlier parts of long texts (arXiv:2404.12822). Instead of forgetting what they read 50 pages ago, models now get rewarded for referencing earlier content.
Chen et al. figured out how to split documents and process chunks simultaneously (arXiv:2404.18610). Rather than reading everything sequentially, the model processes multiple sections at once, then combines the insights.
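To make that concrete, here's a minimal sketch of the split-and-combine pattern in Python. The `summarise` helper is a placeholder for a real model call, and the fixed-size splitting is deliberately naive; this illustrates the approach, not the authors' implementation.

```python
# Minimal sketch of the split-and-combine idea. `summarise` stands in
# for any LLM call; it is a placeholder, not the paper's actual code.
from concurrent.futures import ThreadPoolExecutor

def summarise(text: str) -> str:
    # Placeholder for a real model call, e.g. an API request.
    return text[:200]

def split_into_chunks(document: str, chunk_size: int = 4000) -> list[str]:
    # Naive fixed-size split; real systems split on section boundaries.
    return [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]

def summarise_long_document(document: str) -> str:
    chunks = split_into_chunks(document)
    # Process every chunk in parallel instead of reading sequentially.
    with ThreadPoolExecutor() as pool:
        partial_summaries = list(pool.map(summarise, chunks))
    # Combine the per-chunk insights with one final pass.
    return summarise("\n".join(partial_summaries))
```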
Wang et al. developed attention mechanisms that work across 256,000 tokens – roughly a 400-page book (arXiv:2405.08559). They combine focused attention on nearby text with selective attention on distant parts. Zhang et al. pushed this to 512,000 tokens using hierarchical processing (arXiv:2405.14731).
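Here's roughly what that hybrid pattern looks like as an attention mask, sketched with NumPy. The window size and global-position choices are illustrative assumptions, not either paper's configuration, and real implementations compute these patterns sparsely rather than materialising a full mask at these lengths.

```python
import numpy as np

def sparse_attention_mask(seq_len: int, window: int = 128,
                          global_every: int = 1024) -> np.ndarray:
    """Boolean mask: True where attention is allowed.

    Combines focused local attention (a sliding window) with selective
    global attention (periodic positions that see, and are seen by,
    everything). Illustrative only; don't build this dense for 256k tokens.
    """
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    idx = np.arange(seq_len)
    # Local: each token attends to neighbours within the window.
    mask |= np.abs(idx[:, None] - idx[None, :]) <= window
    # Global: periodic positions attend everywhere and are visible to all.
    global_idx = idx % global_every == 0
    mask[global_idx, :] = True
    mask[:, global_idx] = True
    return mask
```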
The pattern is clear. Advertised context windows will become real capabilities in commercial models, not just marketing numbers.
Making stuff up less often
Hallucinations remain the biggest practical problem with AI. It’s like having a brilliant assistant who occasionally invents facts with complete confidence.
This happens because AI models are prediction engines. They’re trained to produce text that looks right, not text that is right. When they don’t know something, they often guess rather than admit uncertainty.
Multiple research teams have been working on this. Chen et al. split generation into two steps – extract facts first, then write responses (arXiv:2404.17503). It’s like requiring someone to gather all their sources before writing, rather than making claims and hoping they’re correct.
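A minimal sketch of that two-step pipeline, assuming a generic `call_model` helper that stands in for any chat-completion API (the prompts are mine, not the paper's):

```python
# Hedged sketch of the "extract facts first, then write" pipeline.

def call_model(prompt: str) -> str:
    raise NotImplementedError("wire up your model API here")

def answer_with_grounding(question: str, source_text: str) -> str:
    # Step 1: extract only verifiable facts from the source.
    facts = call_model(
        "List, as bullet points, only facts stated in this text that are "
        f"relevant to the question.\n\nText:\n{source_text}\n\n"
        f"Question: {question}"
    )
    # Step 2: write the answer using nothing beyond the extracted facts.
    return call_model(
        f"Answer the question using ONLY these facts:\n{facts}\n\n"
        f"Question: {question}\n"
        "If the facts are insufficient, say so instead of guessing."
    )
```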
He et al. built systems that inject supporting facts when the model seems uncertain (arXiv:2405.09464). Think of it as having a fact-checker sitting next to the AI, jumping in with verified information when the model starts to guess.
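In code, the idea might look something like this sketch. `generate_with_logprobs` and `retrieve_facts` are hypothetical helpers, and the confidence threshold is an illustrative number, not one from the paper.

```python
import math

CONFIDENCE_THRESHOLD = 0.6  # illustrative value, not from the paper

def generate_with_logprobs(prompt: str) -> tuple[str, list[float]]:
    raise NotImplementedError  # model API returning token log-probabilities

def retrieve_facts(query: str) -> str:
    raise NotImplementedError  # retrieval over a verified knowledge source

def answer(question: str) -> str:
    draft, token_logprobs = generate_with_logprobs(question)
    # Average token probability as a crude uncertainty signal.
    avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    if avg_prob >= CONFIDENCE_THRESHOLD:
        return draft
    # Model looks unsure: fetch verified facts and regenerate.
    facts = retrieve_facts(question)
    grounded, _ = generate_with_logprobs(
        f"Using these verified facts:\n{facts}\n\nAnswer: {question}"
    )
    return grounded
```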
Liu et al. developed better ways to spot made-up content in summaries. Their system breaks documents into pieces, flags potentially invented content, then combines results. It outperformed larger models whilst running faster.
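The break-flag-combine loop itself is simple to sketch. Here a crude lexical-overlap heuristic stands in for a proper entailment model, and the threshold is illustrative:

```python
# Sketch of the break-flag-combine idea for spotting invented content.

def support_score(sentence: str, chunk: str) -> float:
    # Crude lexical-overlap proxy for "is this sentence supported?"
    s_words = set(sentence.lower().split())
    return len(s_words & set(chunk.lower().split())) / max(len(s_words), 1)

def flag_unsupported(summary_sentences: list[str],
                     document_chunks: list[str],
                     threshold: float = 0.5) -> list[str]:
    flagged = []
    for sentence in summary_sentences:
        # A sentence is suspect if no chunk supports it well enough.
        if max(support_score(sentence, c) for c in document_chunks) < threshold:
            flagged.append(sentence)
    return flagged
```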
This is important because reducing hallucinations isn’t just about accuracy. It’s about whether you’ll be able to rely on AI output for decisions that matter.
Following instructions properly
Getting AI to do exactly what you ask sounds simple. It’s surprisingly difficult.
Models are trained on vast amounts of text where “following instructions” meant different things to different people. Academic writing follows different rules than marketing copy. Legal documents have different constraints than creative writing.
Shen et al. enhanced training with structured prompts that encode specific constraints (arXiv:2404.18504). Instead of hoping models understand your requirements, they build the constraints into the training process.
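As a rough illustration of the idea at prompt time (the paper bakes constraints into training, but the structure is the same), a constraint block might look like this; the schema is my assumption, not the paper's format:

```python
# Encoding constraints as explicit structure rather than free-form prose.
constraints = {
    "format": "bullet list",
    "max_items": 5,
    "tone": "formal",
    "forbidden": ["marketing language", "first person"],
}

prompt = (
    "Summarise the attached report.\n"
    "Constraints (all mandatory):\n"
    + "\n".join(f"- {k}: {v}" for k, v in constraints.items())
)
```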
Xu et al. developed a clever approach that works with existing models – generate multiple responses and pick the best one based on how well it follows instructions (arXiv:2405.14247). It’s like asking several people to complete a task, then choosing the response that best meets your criteria.
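A minimal sketch of that best-of-n selection, with `call_model` and `follows_instructions_score` as hypothetical stand-ins (the scorer would typically be another model acting as a judge):

```python
# Generate several candidates, keep the one that best follows instructions.

def call_model(prompt: str, temperature: float = 0.9) -> str:
    raise NotImplementedError  # any sampling-capable model API

def follows_instructions_score(prompt: str, response: str) -> float:
    raise NotImplementedError  # e.g. an LLM judge scoring 0 to 1

def best_of_n(prompt: str, n: int = 5) -> str:
    candidates = [call_model(prompt) for _ in range(n)]
    # Keep the response that best satisfies the original instructions.
    return max(candidates, key=lambda r: follows_instructions_score(prompt, r))
```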
But here’s something interesting. Li et al. found that asking models to “think out loud” sometimes makes them worse at following strict instructions. When you ask for step-by-step reasoning, models sometimes get so focused on showing their work that they forget your original requirements.
Efficiency gains that matter
Kong et al. created systems where models stop processing when they’re confident about answers (arXiv:2404.17489). Instead of running every calculation to completion, they exit early when possible. This cut compute costs by 40% with minimal accuracy loss.
Think of it like a multiple-choice exam. Some questions you know immediately. Others require more thought. Rather than spending the same time on every question, you allocate effort based on difficulty.
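The control flow is just this sketch. Real systems attach small classifier heads to intermediate transformer layers; here each `layer` is a stand-in callable and the threshold is illustrative.

```python
# Sketch of confidence-based early exit (assumes at least one layer).

EXIT_THRESHOLD = 0.95  # illustrative, not from the paper

def forward_with_early_exit(x, layers, classify):
    for i, layer in enumerate(layers):
        x = layer(x)
        prediction, confidence = classify(x, layer_index=i)
        # Stop as soon as an intermediate layer is confident enough,
        # skipping the remaining computation.
        if confidence >= EXIT_THRESHOLD:
            return prediction
    return prediction  # fall through: use the final layer's answer
```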
This is important because efficiency will affect what becomes economically viable to run at scale.
What this actually means for you
These aren’t breakthrough moments. They’re the steady progress that will gradually shift what AI can do reliably.
If you use AI for work, here’s what to watch for:
- AI will be able to read and understand longer documents, and more of them, without missing details (longer context windows)
- AI will flag or fact-check what it’s uncertain about
- AI will follow instructions better
- AI will keep getting cheaper to run thanks to more efficient processing
- AI's reasoning will improve without sacrificing accuracy
The bigger picture
This research will end up in the commercial products we use daily and become baseline functionality.
When multiple research teams start solving the same practical problems from different angles, commercial applications follow.
We’re seeing this convergence around reliability, efficiency, and instruction-following. The boring fundamentals that determine whether AI actually helps with real work.
This research will take months to filter into products you can use. But the direction is clear.
AI will get quietly better at the things that matter most for practical applications.