New Method Catches AI Hallucinations in Real-Time

Researchers unveil a scalable technique to detect fabricated entities in long-form AI generations.

A new research paper introduces a cost-effective and scalable method for real-time detection of hallucinated entities in large language models. This approach focuses on specific factual inaccuracies, offering a significant improvement over existing detection techniques.

By Sarah Kline

September 12, 2025

4 min read

Key Facts

  • A new method detects hallucinated entities in long-form AI generations in real-time.
  • The approach targets entity-level hallucinations (e.g., fabricated names, dates, citations).
  • The method scales effectively to 70-billion-parameter models.
  • Classifiers achieved an AUC of 0.90 for Llama-3.3-70B, outperforming baselines.
  • The detection method also effectively identifies incorrect answers in mathematical reasoning tasks.

Why You Care

Have you ever wondered if the AI chatbot you’re talking to is making things up? What if that AI is giving medical advice or legal guidance? The potential for AI to ‘hallucinate’—inventing facts or details—is a major concern. This new research directly addresses that problem, making AI outputs more trustworthy for you.

What Actually Happened

Researchers Oscar Obeso, Andy Arditi, and their team have introduced a significant advancement in AI reliability. They’ve developed a method for the real-time detection of hallucinated tokens in long-form AI generations, according to the announcement. Hallucinated tokens refer to fabricated entities like names, dates, or citations that AI models sometimes generate. This new approach targets these specific entity-level hallucinations, rather than broader claim-level inaccuracies, as detailed in the blog post. This focus allows for streaming detection, meaning errors can be caught as the AI generates text. The team successfully scaled this method to 70-billion-parameter models, indicating its broad applicability. Existing methods often fall short, being either too limited for long texts or too expensive for practical use, the research shows.
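To make "streaming detection" concrete, here is a minimal sketch of the general idea: a lightweight linear probe reads each newly generated token's hidden state and flags tokens whose score crosses a threshold, while decoding continues. The model name, probe layer, threshold, untrained probe weights, and the stream_with_flags helper are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch (not the authors' code): flag risky tokens as they are generated.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.3-70B-Instruct"  # assumed; any causal LM works
THRESHOLD = 0.5                                    # assumed decision threshold

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
probe_layer = model.config.num_hidden_layers // 2  # assumed mid-depth layer

# Linear probe over hidden states; weights here are untrained placeholders.
# In practice it would be trained offline on entity-level hallucination labels.
probe = torch.nn.Linear(model.config.hidden_size, 1)

def stream_with_flags(prompt: str, max_new_tokens: int = 64):
    """Greedy-decode token by token, yielding (token, flagged) pairs."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        with torch.no_grad():
            out = model(input_ids, output_hidden_states=True)
        hidden = out.hidden_states[probe_layer][0, -1]   # newest token's state
        risk = torch.sigmoid(probe(hidden)).item()       # probe's hallucination score
        next_id = out.logits[0, -1].argmax().view(1, 1)  # greedy next token
        yield tokenizer.decode(next_id[0]), risk > THRESHOLD
        input_ids = torch.cat([input_ids, next_id], dim=-1)
```

Because the probe only needs the hidden state of the token just produced, flagged spans could be surfaced to the user or a downstream filter without pausing generation.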

Why This Matters to You

This development is crucial for anyone relying on large language models (LLMs) for information. Imagine you’re using an AI to draft a report or research a complex topic. This new detection method means you’re less likely to encounter made-up facts or references. The paper states that their classifiers consistently outperform baselines on long-form responses, including more expensive methods like semantic entropy, reaching an AUC of 0.90 versus 0.71 for Llama-3.3-70B. That translates to much higher accuracy in spotting fabricated content.
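To make that AUC comparison concrete, here is a toy illustration of how such a number is computed: each detector assigns a score to every token, and roc_auc_score measures how reliably hallucinated tokens are ranked above correct ones. The labels and scores below are invented toy data, not the paper's results.

```python
# Toy AUC comparison with made-up numbers (not the paper's data).
from sklearn.metrics import roc_auc_score

labels          = [1, 0, 0, 1, 0, 1, 0, 0]                    # 1 = hallucinated entity token
probe_scores    = [0.9, 0.2, 0.1, 0.8, 0.3, 0.7, 0.2, 0.4]    # hypothetical probe detector
baseline_scores = [0.6, 0.5, 0.4, 0.5, 0.6, 0.7, 0.3, 0.5]    # hypothetical baseline detector

print("probe AUC:   ", roc_auc_score(labels, probe_scores))    # higher is better
print("baseline AUC:", roc_auc_score(labels, baseline_scores))
```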

Here’s how this could impact your daily interactions with AI:

  1. Increased Trust: You can have greater confidence in the factual accuracy of AI-generated content.
  2. Safer Applications: For high-stakes uses like medical or legal advice, the risk of harmful misinformation decreases significantly.
  3. Efficient Content Creation: If you’re a content creator, less time will be spent fact-checking AI outputs.

“Large language models are now routinely used in high-stakes applications where hallucinations can cause serious harm, such as medical consultations or legal advice,” the team revealed. This underscores the urgent need for reliable detection. How much more would you trust AI if you knew it was constantly checking itself for factual errors?

The Surprising Finding

Here’s the twist: while the researchers specifically trained their system to detect entity-level hallucinations, it showed unexpected versatility. The study finds that despite being trained only with entity-level labels, their probes effectively detect incorrect answers in mathematical reasoning tasks. This indicates a generalization beyond just identifying fabricated names or dates. It challenges the assumption that highly specialized detection methods are limited to their exact training domain. This broader applicability suggests the method could be more useful than initially conceived, potentially catching a wider range of factual errors, even those not directly related to named entities.

What Happens Next

This research points to a promising future for more reliable AI. The team publicly released their datasets, making it easier for others to build upon their work. You can expect to see these detection methods integrated into commercial AI models within the next 12-18 months. For example, future versions of AI writing assistants might flag potentially hallucinated sentences in real-time as you type, providing immediate feedback that improves your workflow and the quality of your output. The industry implications are vast, leading to more dependable AI tools across various sectors. The researchers report that annotated responses from one model can even be used to train effective classifiers on other models. This suggests a cost-effective path to widespread adoption. This work suggests a promising new approach for scalable, real-world hallucination detection, as mentioned in the release.
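The cross-model transfer claim can be pictured with a small, hedged sketch: fit a simple classifier on features labeled from one model's annotated responses, then score a second model's outputs with it. The synthetic data below, including the shared signal direction and the fake_probe_data helper, is purely illustrative and is not the authors' dataset or pipeline.

```python
# Hedged toy sketch of cross-model transfer: train on data labeled via model A,
# evaluate on data from model B. Synthetic features stand in for hidden states,
# under the illustrative assumption that both models share a "hallucination" direction.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
shared_direction = rng.normal(size=64)

def fake_probe_data(n):
    """Stand-in for per-token features and entity-level hallucination labels."""
    X = rng.normal(size=(n, 64))
    y = (X @ shared_direction + 0.5 * rng.normal(size=n) > 0).astype(int)
    return X, y

X_source, y_source = fake_probe_data(1000)  # labeled via model A's annotated responses
X_target, y_target = fake_probe_data(400)   # generations from model B

clf = LogisticRegression(max_iter=1000).fit(X_source, y_source)
print("transfer AUC:", roc_auc_score(y_target, clf.predict_proba(X_target)[:, 1]))
```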
