AI Hallucinations: New Method Combats Misinformation in Q&A

Researchers combine LLMs and Knowledge Graphs to boost accuracy, especially in critical domains.

A new research paper introduces an innovative method to significantly reduce AI hallucinations in question-answering systems. By integrating Large Language Models (LLMs) with Knowledge Graphs (KGs), the approach aims to improve accuracy and reliability, particularly in sensitive fields like biomedicine.

By Sarah Kline

November 14, 2025

4 min read

Key Facts

  • The research combines Large Language Models (LLMs) and Knowledge Graphs (KGs) to reduce AI hallucinations.
  • A query checker within the system ensures the syntactic and semantic validity of LLM-generated queries.
  • The method was evaluated using a new benchmark dataset of 50 biomedical questions.
  • GPT-4 Turbo performed best in query generation, but llama3:70b showed promise with prompt engineering.
  • A user-friendly web interface and source code are available for accessibility and experimentation.

Why You Care

Ever asked an AI a question only to get a confidently wrong answer? It’s frustrating, right? This problem, known as AI hallucination, can be more than just annoying; it can be dangerous. What if your medical questions were answered incorrectly? A new approach is here to tackle this head-on, making AI responses much more trustworthy for you.

What Actually Happened

Researchers Larissa Pusch and Tim O. F. Conrad have introduced a novel method to combat AI hallucinations, according to the announcement. Their paper details combining Large Language Models (LLMs) with Knowledge Graphs (KGs) to enhance the accuracy of question-answering systems. LLMs are AI models that understand and generate human-like text. KGs are structured databases of facts and relationships. The team built their method using the LangChain framework.

This new system includes a query checker. This checker ensures that LLM-generated queries are both syntactically and semantically valid. These validated queries then extract information from a Knowledge Graph. This process, the research shows, substantially reduces errors like hallucinations. They specifically applied this to a biomedical Knowledge Graph as an example. The goal is to make digital information systems more accessible and reliable.
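The announcement doesn't spell out the checker's internals, but the idea is straightforward to sketch. Below is a minimal, hypothetical Python version, not the authors' implementation, assuming a Neo4j-backed Knowledge Graph: Neo4j's `EXPLAIN` prefix plans a query without executing it (a syntax check), and the labels and relationship types the query references are compared against the graph's actual schema (a rough semantic check). The connection details and regex extraction are purely illustrative.

```python
import re
from neo4j import GraphDatabase

# Hypothetical connection details; replace with your own deployment.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def schema_terms(session) -> set:
    """Collect the node labels and relationship types present in the graph."""
    labels = {r["label"] for r in session.run("CALL db.labels()")}
    rels = {r["relationshipType"] for r in session.run("CALL db.relationshipTypes()")}
    return labels | rels

def check_query(session, cypher: str) -> bool:
    """Validate an LLM-generated Cypher query before executing it."""
    try:
        # EXPLAIN plans the query without running it, so malformed
        # Cypher fails here instead of at answer time.
        session.run(f"EXPLAIN {cypher}").consume()
    except Exception:
        return False  # syntactically invalid
    # Naive semantic check, for illustration only: every label or
    # relationship type the query mentions should exist in the schema.
    used = set(re.findall(r":`?(\w+)", cypher))
    return used <= schema_terms(session)

with driver.session() as session:
    llm_query = "MATCH (d:Drug)-[:TREATS]->(x:Disease) RETURN d.name, x.name LIMIT 5"
    if check_query(session, llm_query):
        rows = session.run(llm_query).data()
```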

Why This Matters to You

Imagine you’re a healthcare professional or a researcher. Your work relies on accurate information, and incorrect AI responses could lead to serious consequences. This new method directly addresses that critical need, aiming to provide reliable and intuitive question-answering systems. The approach effectively handles common issues such as data gaps and hallucinations, as mentioned in the release. This means you can place more trust in the information you receive.

For example, consider a doctor using an AI to quickly retrieve drug interaction information. If the AI hallucinates, it could suggest a dangerous combination. This new system helps prevent such errors. It ensures the AI’s output is grounded in data. How much more confident would you feel knowing your AI assistant cross-references its answers with a structured knowledge base?
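To make “grounded in data” concrete, here is a hedged sketch of what a validated query against such a graph might look like. The drug names, node labels, and relationship type are invented for illustration; the paper’s biomedical graph may model interactions differently. The key point is that the matched path itself comes back as inspectable evidence rather than free-form generated text.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Hypothetical schema: Drug nodes linked by INTERACTS_WITH relationships.
# The answer is read off the graph, and the matched path is returned
# so a user can inspect the evidence behind it.
cypher = """
MATCH p = (a:Drug {name: $drug_a})-[:INTERACTS_WITH]-(b:Drug {name: $drug_b})
RETURN p
"""

with driver.session() as session:
    paths = session.run(cypher, drug_a="warfarin", drug_b="aspirin").data()
    if paths:
        print("Interaction found; matched path:", paths[0]["p"])
    else:
        print("No interaction recorded in the knowledge graph.")
```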

Here’s how the new system compares to traditional LLM approaches:

| Feature      | Traditional LLM              | LLM + Knowledge Graph         |
|--------------|------------------------------|-------------------------------|
| Accuracy     | Prone to hallucinations      | Significantly reduced errors  |
| Reliability  | Can generate misinformation  | Grounded in factual data      |
| Transparency | Black box                    | Verifiable paths for accuracy |
| Data Gaps    | Struggles with specific data | Addresses gaps effectively    |

Larissa Pusch and Tim O. F. Conrad state that their approach offers “a reliable and intuitive approach for question answering systems.” This means that for your essential information needs, you’ll get answers you can depend on.

The Surprising Finding

Here’s the twist: while models like GPT-4 Turbo performed well, open-source models showed significant promise. The study finds that GPT-4 Turbo excels at generating accurate queries, but models like llama3:70b can achieve strong results with careful prompt engineering. This challenges the assumption that only the largest proprietary LLMs can deliver high accuracy, and it suggests that accessible, open-source AI could serve you just as well with the right prompting.

This is surprising because many assume top-tier performance requires top-tier, often costly, closed-source models. The research indicates that strategic prompt engineering can level the playing field. This could democratize access to highly accurate, hallucination-reduced AI. It opens doors for smaller organizations or individual developers. They can now build systems without needing massive budgets.
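The paper’s actual prompts aren’t reproduced in the announcement, so the following is only a sketch of the kind of prompt engineering that can close the gap, assuming a local Ollama server with the llama3:70b model pulled. The system message pins down the output format and feeds the model an illustrative graph schema, and a few-shot example demonstrates the expected Cypher shape.

```python
import ollama  # assumes a local Ollama server with llama3:70b pulled

# Few-shot prompt engineering: constrain the output format and ground the
# model in the graph's schema so it emits runnable Cypher instead of prose.
SYSTEM = (
    "You translate biomedical questions into Cypher.\n"
    "Graph schema (illustrative): (:Drug)-[:TREATS]->(:Disease), "
    "(:Drug)-[:INTERACTS_WITH]-(:Drug).\n"
    "Reply with a single Cypher query and nothing else."
)

EXAMPLES = [
    {"role": "user", "content": "Which drugs treat hypertension?"},
    {
        "role": "assistant",
        "content": "MATCH (d:Drug)-[:TREATS]->(x:Disease {name: 'hypertension'}) "
                   "RETURN d.name",
    },
]

def question_to_cypher(question: str) -> str:
    messages = [{"role": "system", "content": SYSTEM}, *EXAMPLES,
                {"role": "user", "content": question}]
    response = ollama.chat(model="llama3:70b", messages=messages)
    return response["message"]["content"].strip()

print(question_to_cypher("Which drugs interact with warfarin?"))
```

A query generated this way would then pass through a checker like the sketch above before it ever touches the graph.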

What Happens Next

The researchers have already developed a user-friendly web-based interface. This interface allows users to input natural language queries. You can view the generated and corrected Cypher queries. What’s more, you can verify the resulting paths for accuracy, according to the announcement. The source code is publicly available on their Git repository. This means developers can start experimenting with this hybrid approach immediately.

We might see this system integrated into industry-specific AI tools within the next 12 to 18 months. Imagine, for example, a legal research system using this method: it could surface highly accurate case precedents for attorneys while minimizing the risk of misinterpretation. The actionable advice for developers is clear: explore combining LLMs with KGs. The industry implications are vast, promising more reliable AI assistants across many sectors and, potentially, a new standard for trustworthy AI interactions.
