
AI Will Quietly Keep Getting Better

Published 4 June 2025 · Last updated 15 April 2026

Boring AI research breakthroughs will change your work (but won't make headlines)

I've been reading AI research papers for months. Most cover incremental improvements that won't matter for ages.

But when I step back and look at the patterns, something becomes clear.

Researchers are solving the boring, practical problems that determine whether AI actually helps with real work. Things like reading long documents without missing key details, not making stuff up, and doing exactly what you ask it to do.

These aren't flashy capabilities that look good in demos. They're the fundamentals that determine whether AI saves you time or creates more work.

The problems researchers are fixing

Context Windows (long documents that actually work)

Here's a secret about AI: many models claim they can handle massive documents, but in reality they ignore huge chunks of what you give them.

If you hand someone a 200-page report and ask them to summarise it under time pressure, they'll skim the first few pages, glance at the middle, maybe check the conclusion, then write a summary. That's essentially what many current AI models do with long documents.

Researchers have been tackling this from multiple angles. Liu et al. changed how models learn to pay attention to earlier parts of long texts (arXiv:2404.12822). Instead of forgetting what they read 50 pages ago, models now get rewarded for referencing earlier content.

Chen et al. figured out how to split documents and process chunks simultaneously (arXiv:2404.18610). Rather than reading everything sequentially, the model processes multiple sections at once, then combines the insights.
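The general shape of that idea is a map-reduce over chunks: split, summarise each piece in parallel, then combine. Here's a minimal sketch of the pattern, not the paper's actual method - `summarise_chunk` is a toy stand-in for a real model call, and all the names are mine:

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(text, chunk_size=200):
    """Split a long document into fixed-size chunks of words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def summarise_chunk(chunk):
    """Stand-in for a model call: here we just keep the first sentence."""
    return chunk.split(".")[0].strip() + "."

def summarise_document(text):
    """Summarise each chunk in parallel, then combine the results."""
    chunks = split_into_chunks(text)
    with ThreadPoolExecutor() as pool:
        partial_summaries = list(pool.map(summarise_chunk, chunks))
    # A real system would pass these back through the model for a final pass.
    return " ".join(partial_summaries)
```

The point is structural: once sections are independent, they can be processed simultaneously, and only the combination step has to see everything.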

Wang et al. developed attention mechanisms that work across 256,000 tokens - roughly a 400-page book (arXiv:2405.08559). They combine focused attention on nearby text with selective attention on distant parts. Zhang et al. pushed this to 512,000 tokens using hierarchical processing (arXiv:2405.14731).
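The "focused nearby, selective far away" idea can be pictured as a sparse attention mask: each position sees its close neighbours plus a handful of periodic long-range positions. This is a toy illustration of the concept, not the architecture from either paper, and the `window`/`stride` parameters are made up:

```python
def sparse_attention_mask(seq_len, window=4, stride=8):
    """Boolean mask: position i may attend to position j if j is nearby
    (within `window`) or a periodic long-range anchor (every `stride`
    tokens). A toy version of local-plus-selective attention."""
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            if abs(i - j) <= window or j % stride == 0:
                mask[i][j] = True
    return mask
```

The payoff is that each token attends to far fewer than `seq_len` positions, which is what makes six-figure context lengths computationally feasible.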

The pattern is clear. Context windows will become real capabilities in commercial models, not just marketing numbers.

Making stuff up less often

Hallucinations remain the biggest practical problem with AI. It's like having a brilliant assistant who occasionally invents facts with complete confidence.

This happens because AI models are prediction engines. They're trained to produce text that looks right, not text that is right. When they don't know something, they often guess rather than admit uncertainty.

Multiple research teams have been working on this. Chen et al. split generation into two steps - extract facts first, then write responses (arXiv:2404.17503). It's like requiring someone to gather all their sources before writing, rather than making claims and hoping they're correct.

He et al. built systems that inject supporting facts when the model seems uncertain (arXiv:2405.09464). Think of it as having a fact-checker sitting next to the AI, jumping in with verified information when the model starts to guess.

Liu et al. developed better ways to spot made-up content in summaries. Their system breaks documents into pieces, flags potentially invented content, then combines results. It outperformed larger models whilst running faster.

This is important because reducing hallucinations isn't just about accuracy. It's about whether you'll be able to rely on AI output for decisions that matter.

Following instructions properly

Getting AI to do exactly what you ask sounds simple. It's surprisingly difficult.

Models are trained on vast amounts of text where "following instructions" meant different things to different people. Academic writing follows different rules than marketing copy. Legal documents have different constraints than creative writing.

Shen et al. enhanced training with structured prompts that encode specific constraints (arXiv:2404.18504). Instead of hoping models understand your requirements, they build the constraints into the training process.

Xu et al. developed a clever approach that works with existing models - generate multiple responses and pick the best one based on how well it follows instructions (arXiv:2405.14247). It's like asking several people to complete a task, then choosing the response that best meets your criteria.
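The best-of-N pattern is simple enough to sketch in a few lines. Assume you already have several candidate responses; you score each against your constraints and keep the winner. The scoring checks below are illustrative placeholders, not anything from the paper:

```python
def follows_constraints(response, constraints):
    """Count how many constraint checks a response satisfies."""
    return sum(1 for check in constraints if check(response))

def best_of_n(candidates, constraints):
    """Pick the candidate that satisfies the most constraints."""
    return max(candidates, key=lambda r: follows_constraints(r, constraints))
```

For example, with constraints like "at most five words" and "must mention a summary", the selector picks whichever generation happens to comply - no retraining required.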

But here's something interesting. Li et al. found that asking models to "think out loud" sometimes makes them worse at following strict instructions. When you ask for step-by-step reasoning, models sometimes get so focused on showing their work that they forget your original requirements.

Efficiency gains that matter

Kong et al. created systems where models stop processing when they're confident about answers (arXiv:2404.17489). Instead of running every calculation to completion, they exit early when possible. This cut compute costs by 40% with minimal accuracy loss.

Think of it like a multiple-choice exam. Some questions you know immediately. Others require more thought. Rather than spending the same time on every question, you allocate effort based on difficulty.
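The control flow behind early exit is straightforward: run the layers in order, and stop as soon as some confidence signal clears a threshold. This sketch is my own simplification, with each "layer" returning a made-up `(state, confidence, prediction)` triple rather than anything from the paper's architecture:

```python
def early_exit_predict(layers, x, threshold=0.9):
    """Run layers in sequence, stopping once a prediction is confident
    enough. Returns the prediction and how many layers were used."""
    prediction = None
    for i, layer in enumerate(layers):
        x, confidence, prediction = layer(x)
        if confidence >= threshold:
            return prediction, i + 1  # exited early after i + 1 layers
    return prediction, len(layers)  # ran the full stack

def make_layer(conf, pred):
    """Build a toy layer with a fixed confidence and prediction."""
    return lambda x: (x, conf, pred)
```

Easy inputs clear the threshold in the first few layers and skip the rest; only hard inputs pay for the full computation.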

This is important because efficiency will affect what becomes economically viable to run at scale.

What this actually means for you

These aren't breakthrough moments. They're the steady progress that will gradually shift what AI can do reliably.

If you use AI for work, here's what to watch for:

  • AI will be able to read and understand longer documents, and more of them, without missing details (longer context windows)
  • AI will flag or fact-check what it's uncertain about
  • AI will follow instructions better
  • AI will continue to fall in cost thanks to more efficient processing
  • AI will have improved reasoning without losing accuracy

The bigger picture

This research will end up in the commercial products we use daily and become baseline functionality.

When multiple research teams start solving the same practical problems from different angles, commercial applications follow.

We're seeing this convergence around reliability, efficiency, and instruction-following. The boring fundamentals that determine whether AI actually helps with real work.

This research will take months to filter into products you can use. But the direction is clear.

AI will get quietly better at the things that matter most for practical applications.
