New AI Method 'SALT' Secures Your Private Thoughts

Researchers introduce SALT to prevent Large Language Models from leaking sensitive data during internal reasoning.

A new method called SALT (Steering Activations towards Leakage-free Thinking) has been developed to enhance privacy in Large Language Models (LLMs). It tackles the critical issue of LLMs inadvertently exposing private user data through their internal thought processes, not just their final outputs. This innovation aims to balance privacy and the model's reasoning abilities.


By Mark Ellison

November 16, 2025

4 min read


Key Facts

  • SALT (Steering Activations towards Leakage-free Thinking) is a new method for enhancing LLM privacy.
  • It addresses privacy leakage in an LLM's internal reasoning processes (Chain of Thought), not just its final outputs.
  • SALT is a lightweight test-time intervention that injects steering vectors into hidden states.
  • The method reduces privacy leakage by 18.2%, 17.9%, and 31.2% across multiple LLMs.
  • Researchers identified specific 'high-leakage layers' within LLMs responsible for exposing sensitive data.

Why You Care

Ever worried your AI assistant might be thinking about your private data, even if it doesn’t say it out loud? What if its internal ‘thoughts’ could expose sensitive information? A new method called SALT aims to stop this quiet data leakage. That matters as Large Language Models (LLMs) become more integrated into your daily life: SALT helps ensure your conversations remain truly private, protecting your contextual privacy expectations.

What Actually Happened

Researchers have introduced a novel technique named SALT, short for Steering Activations towards Leakage-free Thinking. As the researchers describe it, SALT is a lightweight test-time intervention that targets privacy leakage within an LLM’s Chain of Thought (CoT), the model’s internal reasoning process, similar to a human thinking step by step. The team revealed that LLMs can inadvertently expose sensitive details through these reasoning traces, even when the final output appears safe. SALT works by injecting targeted steering vectors into the model’s hidden states, which mitigates privacy leakage without compromising the model’s ability to reason effectively.
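To make the mechanism concrete, here is a minimal sketch of test-time activation steering using PyTorch and Hugging Face Transformers. It illustrates the general idea rather than the authors’ implementation: the model name, target layer, steering vector, and strength alpha are placeholder assumptions, and it assumes a Llama-style model whose decoder blocks sit under model.model.layers.

```python
# Minimal sketch of test-time activation steering (NOT the SALT authors' code).
# Assumes a Llama-style causal LM; model name, layer index, steering vector,
# and alpha are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.eval()

# Hypothetical "leakage-free" direction in hidden-state space. In practice it
# would be estimated, e.g. by contrasting leaky vs. non-leaky reasoning traces.
steering_vector = torch.randn(model.config.hidden_size, dtype=model.dtype)
steering_vector = steering_vector / steering_vector.norm()
alpha = 4.0          # steering strength (placeholder)
target_layer = 18    # hypothetical "high-leakage" layer index

def steer_hidden_states(module, inputs, output):
    # Depending on the Transformers version, a decoder layer returns either a
    # tensor or a tuple whose first element is the hidden states. Returning a
    # value from the hook replaces the layer's output.
    if isinstance(output, tuple):
        hidden = output[0] + alpha * steering_vector.to(output[0].device)
        return (hidden,) + output[1:]
    return output + alpha * steering_vector.to(output.device)

hook = model.model.layers[target_layer].register_forward_hook(steer_hidden_states)

prompt = "My SSN is 123-45-6789. Think step by step about my loan options."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))

hook.remove()  # remove the intervention to restore normal behavior
```

The key point this sketch captures is that the intervention happens only at inference time, through a forward hook, with no retraining or weight changes; that is what makes this style of approach lightweight.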

Why This Matters to You

Your personal data is increasingly handled by AI, and this new method directly addresses a hidden vulnerability. Imagine you’re using an AI for financial advice. You expect the final advice to be secure, but the internal steps the AI took to reach that advice could still contain sensitive details. This is where SALT comes in: it protects those internal ‘thoughts’ from leaking your information while preserving the balance between privacy and utility. The research shows that SALT achieves significant reductions in leakage, for example 18.2%, 17.9%, and 31.2% across multiple LLMs. How confident are you that your current AI tools aren’t already ‘thinking’ about your sensitive data in a leaky way?

Here are some key benefits of SALT for your AI interactions:

  • Enhanced Privacy: Your sensitive data remains secure during the AI’s internal processing.
  • Contextual Protection: Prevents leakage from the model’s reasoning, not just its final output.
  • Maintained Utility: The AI’s reasoning capabilities are preserved, ensuring accurate results.
  • Lightweight Intervention: Easy to implement without major overhauls to existing LLMs.

As mentioned in the release, “LLMs often leak private information through their internal reasoning processes, violating contextual privacy expectations.” This means even if the AI gives you a safe answer, its internal steps could be exposing your secrets. SALT directly tackles this problem. It makes your interactions with AI more secure and trustworthy.

The Surprising Finding

Here’s the twist: previous privacy efforts focused mainly on the final output of LLMs. However, the study finds that the real vulnerability lies deeper, in the model’s internal reasoning processes, the ‘Chain of Thought.’ This is surprising because many assumed securing the output was enough. The team revealed that these “leaky thoughts” occur when models inadvertently expose sensitive details, even when final outputs appear safe, and the research identifies specific “high-leakage layers” responsible for this behavior. This challenges the common assumption that privacy is only about what the AI explicitly tells you, and it highlights a more subtle and pervasive privacy challenge.
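The article does not explain how those high-leakage layers were located. As a purely illustrative assumption (not the paper’s protocol), one way to look for them is to sweep over layers, apply the steering hook at each one in turn, and measure how often a known private string still appears in the generated reasoning. This sketch reuses the model, tokenizer, alpha, steering_vector, and steer_hidden_states hook from the sketch above.

```python
# Illustrative layer sweep (an assumed protocol, not the paper's): steer each
# decoder layer in turn and measure how often a known private string still
# shows up in the generated text.

def leakage_rate(prompts, secrets, max_new_tokens=128):
    """Fraction of prompts whose generated trace repeats the private string."""
    leaks = 0
    for prompt, secret in zip(prompts, secrets):
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=max_new_tokens)
        leaks += int(secret in tokenizer.decode(out[0], skip_special_tokens=True))
    return leaks / len(prompts)

# Toy evaluation set: each prompt embeds a private detail the model should not
# restate while reasoning. A real study would use a benchmark and a stronger
# leakage detector than exact string matching.
prompts = ["My SSN is 123-45-6789. Reason step by step about my loan options."]
secrets = ["123-45-6789"]

baseline = leakage_rate(prompts, secrets)
reduction_by_layer = {}
for idx, layer in enumerate(model.model.layers):
    hook = layer.register_forward_hook(steer_hidden_states)
    reduction_by_layer[idx] = baseline - leakage_rate(prompts, secrets)
    hook.remove()

# Layers where steering removes the most leakage are candidate high-leakage layers.
top = sorted(reduction_by_layer, key=reduction_by_layer.get, reverse=True)[:3]
print("Candidate high-leakage layers:", top)
```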

What Happens Next

The introduction of SALT marks a significant step forward in AI privacy. We can expect to see this method, or similar steering vector techniques, integrated into commercial LLMs within the next 12-18 months. Imagine future AI assistants that inherently protect your data during every internal reasoning step. For example, a medical AI could process your health records securely, ensuring no sensitive details are exposed during its diagnostic reasoning. The researchers report that SALT is a lightweight intervention, which could make adoption faster. This will likely set new industry standards for privacy in AI development. What’s more, developers will need to understand and implement these new security measures to ensure their AI applications are truly leakage-free. This shift will ultimately build greater trust in AI technologies for all users.
