LLMs Struggle with Interviewing, New Study Reveals

A new dataset exposes significant 'ground gaps' in how large language models conduct informational interviews.

Large Language Models (LLMs) often fail at complex conversational tasks like journalistic interviews, a new study indicates. Researchers developed 'NewsInterview,' a dataset of 40,000 interviews, to evaluate LLMs. The findings show LLMs struggle with strategic dialogue and information extraction, highlighting a need for better multi-turn planning in AI.


By Mark Ellison

February 14, 2026

3 min read


Key Facts

  • Researchers created 'NewsInterview,' a dataset of 40,000 two-person informational interviews from NPR and CNN.
  • LLMs are significantly less likely than human interviewers to use acknowledgements and pivot to higher-level questions.
  • Interviewer LLMs struggle with recognizing when questions are answered and engaging persuasively.
  • The study indicates a fundamental deficit in LLMs' multi-turn planning and strategic thinking.
  • The research was accepted at ACL 2025.

Why You Care

Ever wonder if an AI could truly interview someone like a human journalist? Could it ask the right follow-up questions? A new study suggests the answer is a resounding ‘not yet.’ The research highlights a fundamental ‘ground gap’ in Large Language Models (LLMs) that limits their ability to engage in meaningful, multi-turn conversations. Why should you care? Because if AI can’t master a structured interview, what does that mean for your future interactions with AI assistants?

What Actually Happened

Researchers have unveiled a new dataset called ‘NewsInterview,’ built to evaluate how well LLMs perform in informational interviews. The team curated an extensive collection of 40,000 two-person informational interviews from major news outlets, NPR and CNN. Their goal was to pinpoint specific weaknesses in LLMs’ conversational abilities, especially in grounding language (signaling shared understanding with the other speaker) and strategic dialogue (planning ahead across a conversation). The work also provides a ‘playground’ – a simulated interview environment – for developing more capable AI agents. The study was accepted at ACL 2025, according to the announcement.
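
To make the dataset’s shape concrete, here is a minimal sketch of how one interview transcript might be represented and inspected. The JSON-style field names are assumptions for illustration, not the dataset’s published schema:

```python
# Hypothetical record layout for a NewsInterview-style transcript.
# Field names here are illustrative; the released dataset's actual
# schema may differ.
record = {
    "source": "NPR",                     # outlet the transcript came from
    "program": "All Things Considered",  # hypothetical program field
    "turns": [
        {"role": "interviewer", "utterance": "What first drew you to this story?"},
        {"role": "source", "utterance": "I stumbled on it while covering city hall..."},
    ],
}

def count_turns_by_role(transcript: dict) -> dict:
    """Tally how many turns each speaker role takes in one interview."""
    counts: dict = {}
    for turn in transcript["turns"]:
        counts[turn["role"]] = counts.get(turn["role"], 0) + 1
    return counts

print(count_turns_by_role(record))  # {'interviewer': 1, 'source': 1}
```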

Why This Matters to You

This research has practical implications for anyone interacting with or building AI. If you’re a content creator, imagine using an AI to conduct preliminary interviews for your podcast. The study reveals that current LLMs would likely miss key details. For example, an LLM might not recognize when a source has fully answered a question. This leads to suboptimal information extraction, as the research shows. Your AI interviewer might just move on without digging deeper. The study highlights that “LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions.” This means less nuanced and less complete information for your projects. Think about your own experiences: have you ever felt an AI wasn’t truly ‘listening’ to your full response? This research explains why. How might this impact your trust in AI tools designed for complex interactions?

Here’s a quick look at the identified ‘ground gaps,’ with a short code sketch after the table:

LLM Weakness                     Human Interviewer Strength
Fewer acknowledgements           Uses acknowledgements often
Struggles to pivot questions     Pivots to higher-level questions
Suboptimal info extraction       Extracts comprehensive info
Lacks multi-turn planning        Strategic multi-turn planning
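
To make the first row concrete, here is a minimal sketch of how one might flag grounding acknowledgements at the start of interviewer turns. The marker list is a rough heuristic of my own, not the study’s annotation scheme:

```python
# Rough heuristic for spotting acknowledgements at the start of an
# interviewer turn. The marker list is illustrative, not the study's
# actual annotation scheme.
ACK_MARKERS = ("right", "i see", "that makes sense", "interesting",
               "got it", "thanks for that")

def starts_with_acknowledgement(utterance: str) -> bool:
    """Return True if the turn opens with a grounding acknowledgement."""
    opening = utterance.strip().lower()
    return opening.startswith(ACK_MARKERS)  # startswith accepts a tuple

turns = [
    "Right, and how did the council respond?",   # acknowledges, then pivots
    "What is your budget for next year?",        # jumps straight to a new topic
]
for t in turns:
    print(starts_with_acknowledgement(t), "->", t)
```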

The Surprising Finding

Here’s the twist: while LLMs acting as ‘source personas’ (the interviewee) can mimic human behavior in sharing information, their performance as ‘interviewer LLMs’ is remarkably poor. The study finds that “interviewer LLMs struggle with recognizing when questions are answered and engaging persuasively.” This is surprising because many assume LLMs, with their vast knowledge, would excel at asking follow-up questions. However, the data indicates a fundamental deficit in their multi-turn planning and strategic thinking. It challenges the assumption that simply having access to information translates to effective conversational strategy. This suggests that the problem isn’t just about knowledge, but about the process of inquiry itself.
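
The source-persona versus interviewer split suggests a simple simulation loop: one model plays the interviewer, another plays the source, and the transcript grows turn by turn. Here is a skeleton of that loop with stub agents standing in for LLM calls; the function names and structure are assumptions, not the paper’s actual playground API:

```python
# Skeleton of an interviewer-vs-source simulation loop, in the spirit
# of the paper's 'playground'. Both agents are stubs; in practice each
# would wrap an LLM call.
def interviewer_agent(history: list) -> str:
    # Placeholder: a real agent would plan a follow-up from the history.
    return f"Follow-up question #{len(history) // 2 + 1}?"

def source_persona(history: list) -> str:
    # Placeholder: a real persona would answer (or deflect) in character.
    return "An answer that may or may not fully address the question."

def run_interview(max_turns: int = 3) -> list:
    """Alternate interviewer and source turns, returning the transcript."""
    history: list = []
    for _ in range(max_turns):
        question = interviewer_agent(history)
        history.append(question)
        history.append(source_persona(history))
    return history

for line in run_interview():
    print(line)
```

The study’s point is that the hard part lives inside `interviewer_agent`: deciding whether the last answer was complete, and whether to acknowledge, dig deeper, or pivot.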

What Happens Next

The findings underscore a clear need for enhancing LLMs’ strategic dialogue capabilities, according to the team. We can expect to see advancements in AI models focusing on conversational planning in the next 12-18 months. Developers will likely integrate longer-horizon rewards into their training, as mentioned in the release. Imagine future AI tools that can conduct nuanced customer service interviews or even assist in journalistic investigations. For example, a future AI might be able to guide a complex troubleshooting conversation with a user, ensuring all angles are covered. For you, this means improved AI assistants that truly understand and adapt to your conversational needs. The industry will likely prioritize developing agents that can engage persuasively and recognize conversational cues. Look for updates from major AI labs through 2026 focusing on these specific improvements.
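
The phrase ‘longer-horizon rewards’ points at scoring whole conversations rather than single turns. Here is a minimal sketch of a discounted multi-turn return; the per-turn scores are made up for illustration, and how ‘information gained’ would actually be measured is an open design choice:

```python
# Discounted return over a whole interview, rather than rewarding each
# turn in isolation. Per-turn scores below are invented for illustration.
def discounted_return(turn_rewards: list, gamma: float = 0.95) -> float:
    """Sum of per-turn rewards, each discounted by gamma per step."""
    return sum(r * gamma**t for t, r in enumerate(turn_rewards))

# A turn that builds rapport scores low now but sets up a big reveal
# later; greedy per-turn scoring would undervalue it.
patient = [0.1, 0.2, 0.9]   # builds rapport, then extracts key info
greedy  = [0.5, 0.3, 0.1]   # front-loads easy questions, stalls later

print(discounted_return(patient) > discounted_return(greedy))  # True
```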
