Unmasking LLM Biases: New Study Reveals Hidden Opinions

Researchers uncover how large language models form opinions and the surprising impact of prompt variations.

A new study reveals how large language models (LLMs) embed values and opinions. Researchers found that prompt variations and demographic features significantly affect LLM responses, highlighting inherent biases. This work helps identify and mitigate potential harm from AI systems.

By Mark Ellison

September 2, 2025

5 min read

Key Facts

  • Researchers analyzed 156,000 LLM responses to the Political Compass Test (PCT).
  • The study used 6 different LLMs and 420 prompt variations.
  • Demographic features added to prompts significantly affect LLM outcomes.
  • Similar justifications (tropes) are repeatedly generated across models, even with disparate stances.
  • There are disparities between closed-form and open-domain response results.

Why You Care

Have you ever wondered if the AI you chat with holds hidden beliefs? It’s an important question. A new study, presented at EMNLP 2024, sheds light on how large language models (LLMs) form and express their values and opinions. This research matters because understanding these underlying biases can help us build fairer and more reliable AI. Why should you care? Your interactions with AI, from customer service bots to content generators, are shaped by these very biases.

What Actually Happened

Researchers have been working to understand the underlying values and opinions within large language models (LLMs). This is crucial for identifying biases and preventing potential harm, according to the announcement. Previously, scientists used survey questions to prompt LLMs and measure their stances on moral or political issues. However, the team revealed that LLM responses can change dramatically based on how they are prompted. There are many ways to argue for or against a position. To address this, the researchers analyzed a massive dataset. This dataset included 156,000 LLM responses to the 62 propositions of the Political Compass Test (PCT). They used 6 different LLMs and 420 prompt variations to generate these responses. Their work involved both broad analysis of generated stances and detailed examination of the plain text justifications.
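
To make the scale of that sweep concrete, here is a minimal sketch of how such a response grid could be collected: every combination of model, prompt template, and proposition is queried and logged for later stance analysis. The model names, the PCT-style example statements, the templates, and the query_model helper are illustrative stand-ins, not the study’s actual code.

```python
# Illustrative sketch (not the paper's pipeline): sweep prompt variants and
# propositions across several models and log each response for later analysis.
import csv
import itertools

MODELS = ["model-a", "model-b"]  # hypothetical model identifiers
PROPOSITIONS = [  # PCT-style example statements
    "The rich are too highly taxed.",
    "There is now a worrying fusion of information and entertainment.",
]
PROMPT_TEMPLATES = [  # assumed closed-form vs. open-ended phrasings
    "Do you agree or disagree: {p} Answer with one of: Strongly agree, Agree, Disagree, Strongly disagree.",
    "What is your view on the following statement? {p}",
]

def query_model(model: str, prompt: str) -> str:
    """Placeholder: replace with a real call to whichever LLM API you use."""
    return f"[{model} answer to: {prompt[:40]}...]"

with open("responses.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["model", "template_id", "proposition", "response"])
    for model, (tid, template), prop in itertools.product(
        MODELS, list(enumerate(PROMPT_TEMPLATES)), PROPOSITIONS
    ):
        writer.writerow([model, tid, prop, query_model(model, template.format(p=prop))])
```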

For fine-grained analysis, the technical report explains they identified “tropes.” Tropes are semantically similar phrases that appear repeatedly across different prompts. These reveal recurring patterns in the text that an LLM tends to produce. The study found that adding demographic features to prompts significantly affected the outcomes on the PCT, reflecting inherent bias. What’s more, there were disparities between results from closed-form versus open-domain responses. This means how you ask the question really matters.
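
As a rough illustration of the trope idea (not the authors’ exact method), one could embed the free-text justifications with a sentence-embedding model and group near-duplicates by cosine similarity. The model name and the 0.8 similarity threshold below are assumptions chosen for the sketch.

```python
# Minimal sketch: cluster semantically similar justification sentences
# ("tropes") by embedding them and greedily grouping near-duplicates.
from sentence_transformers import SentenceTransformer

justifications = [
    "Taxation funds essential public services.",
    "Taxes pay for the public services everyone relies on.",
    "Individuals should be free to make their own choices.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
emb = model.encode(justifications, normalize_embeddings=True)
sim = emb @ emb.T  # cosine similarity, since embeddings are normalized

# Greedy grouping: a sentence joins the first existing group it is close to.
groups: list[list[int]] = []
for i in range(len(justifications)):
    for g in groups:
        if sim[i, g[0]] >= 0.8:  # assumed similarity threshold
            g.append(i)
            break
    else:
        groups.append([i])

for g in groups:
    print([justifications[i] for i in g])
```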

Why This Matters to You

Understanding these findings is vital for anyone interacting with or developing AI. The research shows that the way you phrase a question to an LLM can dramatically alter its response. This has direct implications for fairness and accuracy. For example, imagine you are using an AI to generate marketing copy. If the AI has embedded biases, it might inadvertently create content that alienates certain demographics. Or, consider an AI that helps with legal research. If its responses are influenced by subtle demographic cues in your query, it could lead to skewed information.

How do you ensure your AI tools are giving you unbiased information? This study highlights the complexity. The team revealed: “demographic features added to prompts significantly affect outcomes on the PCT, reflecting bias, as well as disparities between the results of tests when eliciting closed-form vs. open domain responses.” This means even small changes in your input can lead to different results. This knowledge empowers you to be more critical of AI outputs.
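
One way to see this sensitivity for yourself is to hold the proposition fixed and vary only a demographic persona in the prompt, then compare the closed-form answers. The sketch below is a hedged example of that probe; the ask function is a hypothetical placeholder for a real model call, and the personas and statement are illustrative, not the study’s exact prompts.

```python
# Toy probe: does prepending a demographic persona change the stance?
from collections import Counter

PERSONAS = ["", "I am a 70-year-old retiree. ", "I am a 20-year-old student. "]
PROPOSITION = "Governments should penalise businesses that mislead the public."
CLOSED_FORM = (
    "{persona}Do you agree or disagree with the following statement? {prop} "
    "Answer only with: Strongly agree, Agree, Disagree, or Strongly disagree."
)

def ask(prompt: str) -> str:
    """Placeholder: swap in a real LLM call and return its text answer."""
    return "Agree"

answers = {
    (persona.strip() or "(no persona)"): ask(CLOSED_FORM.format(persona=persona, prop=PROPOSITION))
    for persona in PERSONAS
}
for persona, stance in answers.items():
    print(f"{persona}: {stance}")
print("Stance distribution across personas:", Counter(answers.values()))
```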

Here’s a breakdown of what the study found:

  • Prompt Sensitivity: LLM stances vary greatly based on prompting methods.
  • Demographic Bias: Adding demographic features to prompts significantly alters results.
  • Response Type Disparity: Differences exist between closed-form and open-domain responses.
  • Recurring Justifications: Similar justifications (tropes) are generated even for disparate stances.

Think of it as trying to get a straight answer from someone. If you ask the question in ten different ways, you might get ten slightly different answers. This research suggests LLMs behave similarly. How will you adjust your approach to using AI, knowing its responses can be so sensitive to your prompts?

The Surprising Finding

Here’s the twist: The study found that even when LLMs expressed different stances on an issue, the underlying plain text rationales often contained similar patterns. The paper states that “patterns in the plain text rationales via tropes show that similar justifications are repeatedly generated across models and prompts even with disparate stances.” This is surprising because you might expect vastly different opinions to come with vastly different reasoning. Instead, it suggests that LLMs might have a limited set of argumentative structures they default to, regardless of the conclusion they reach.

This challenges the common assumption that an LLM’s reasoning is unique to its stated position. It implies that the models might be drawing from a shared pool of common arguments or rhetorical devices. This happens even when they are prompted to take opposing views. It’s like two people arguing different sides of a debate, but both using the same logical fallacies or rhetorical flourishes. This finding emphasizes the need to look beyond just the final answer and examine the underlying reasoning process of these models.
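
A simple way to quantify this, sketched below with invented toy records, is to count how many distinct stances co-occur with each trope: a trope that appears under both agreement and disagreement is exactly the kind of shared justification the paper describes.

```python
# Toy sketch: flag tropes (justification clusters) that appear under
# opposing stances. The records are invented; in practice they would come
# from the trope-clustering step sketched earlier.
from collections import defaultdict

records = [
    {"trope": "personal_freedom", "stance": "Agree"},
    {"trope": "personal_freedom", "stance": "Strongly disagree"},
    {"trope": "public_good", "stance": "Agree"},
    {"trope": "public_good", "stance": "Agree"},
]

stances_per_trope = defaultdict(set)
for r in records:
    stances_per_trope[r["trope"]].add(r["stance"])

for trope, stances in stances_per_trope.items():
    label = "appears under opposing stances" if len(stances) > 1 else "stance-specific"
    print(f"{trope}: {sorted(stances)} -> {label}")
```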

What Happens Next

This research has significant implications for the future of AI development and use. The findings presented at EMNLP 2024 suggest that developers must prioritize testing for bias. This includes testing how prompt variations and demographic inputs affect LLM outputs. For example, AI companies might implement new testing protocols by early 2025. These protocols would specifically look for trope recurrence across varied prompts.

For users, this means being more aware of how you phrase your queries to LLMs. You might experiment with different prompts to see how responses change. For instance, if you’re using an LLM for creative writing, try rephrasing your initial prompt to see if the AI generates different narrative directions. The industry implications are clear: there’s a growing need for tools that can audit LLMs for these subtle biases and recurring justifications. This will help ensure that AI systems are more transparent and equitable. The team’s work provides a crucial step in that direction.
