New 'LoopLLM' Attack Exposes AI Vulnerabilities

Researchers uncover a novel method to overwhelm large language models, raising security concerns.

A new research paper introduces 'LoopLLM,' an attack framework that exploits large language models (LLMs) by forcing them into repetitive generation loops. This method significantly increases energy consumption and latency, outperforming previous attack techniques.

By Mark Ellison

November 16, 2025

4 min read

Key Facts

  • LoopLLM is a new energy-latency attack framework for LLMs.
  • It forces LLMs into repetitive generation loops.
  • LoopLLM achieves over 90% of the maximum output length, versus about 20% for baseline methods.
  • It improves cross-model transferability by about 40% to models like DeepSeek-V3 and Gemini 2.5 Flash.
  • The attack exploits autoregressive vulnerabilities in LLMs.

Why You Care

Ever wonder if someone could deliberately slow down or even crash the AI tools you rely on daily? What if a simple prompt could make your favorite AI assistant burn through massive amounts of energy and time? A recent discovery reveals a new vulnerability in large language models (LLMs) that does exactly that. This technique, detailed in a new paper, introduces an attack called LoopLLM. It highlights an essential security concern for anyone building, using, or depending on AI.

What Actually Happened

Researchers have unveiled LoopLLM, a novel energy-latency attack framework targeting large language models. The attack works by compelling LLMs to enter repetitive generation loops. Previous attack methods tried to prolong output by delaying the generation of termination symbols, but as the paper states, those methods became less effective as the output grew longer. LoopLLM overcomes this limitation by exploiting an LLM’s autoregressive vulnerability: its tendency to predict each next token from the tokens before it. Once a model is steered into a repetitive loop, it reliably generates text until it hits its maximum output limit. The team reports that this method significantly outperforms existing techniques.
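To see why a repetition loop defeats length limits in a way that termination-delaying attacks cannot, consider the toy simulation below. This is an illustration of the general mechanism, not the paper’s actual attack: the stand-in “model,” EOS symbol, and loop phrase are all made up for the sketch. The point is that a decoder stuck in a cycle never emits its termination symbol, so the only remaining stop condition is the hard token budget.

```python
import random

# Toy autoregressive decoder (an illustration, not the paper's method).
# EOS and LOOP_PHRASE are made-up stand-ins for a termination symbol and
# the repetitive cycle a compromised model falls into.
EOS = "<eos>"
LOOP_PHRASE = ["the", "answer", "is"]

def toy_next_token(step: int, looping: bool) -> str:
    """One decoding step of the stand-in model."""
    if looping:
        # In a repetition loop, the conditional distribution keeps favoring
        # the cycle, so the termination symbol effectively never appears.
        return LOOP_PHRASE[step % len(LOOP_PHRASE)]
    # Normal decoding: some chance of emitting EOS at every step.
    return EOS if random.random() < 0.05 else random.choice(["hello", "world"])

def generate(max_new_tokens: int, looping: bool) -> int:
    """Decode until EOS or until the hard output cap; return tokens produced."""
    produced = 0
    for step in range(max_new_tokens):
        if toy_next_token(step, looping) == EOS:
            break
        produced += 1
    return produced

random.seed(0)
print("normal decoding:", generate(4096, looping=False), "tokens")  # stops early
print("looped decoding:", generate(4096, looping=True), "tokens")   # hits the cap
```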

Why This Matters to You

This new attack isn’t just a theoretical concept; it has practical implications for your AI interactions. Imagine using an AI for essential tasks, only for it to get stuck in an endless loop, consuming resources and failing to deliver useful output. This could impact everything from customer service chatbots to AI assistants. The research shows that LoopLLM can achieve over 90% of the maximum output length, a significant jump from the 20% seen with baseline methods. What’s more, its transferability improved by around 40% to models like DeepSeek-V3 and Gemini 2.5 Flash, according to the announcement.

Consider a scenario where an organization uses an LLM for real-time content generation. A malicious actor could deploy a LoopLLM attack. This would cause the model to generate nonsensical, repetitive text, wasting computational resources and potentially leading to service outages. How might such an attack affect your business or personal use of AI?
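To put rough numbers on that resource waste: decoding latency and energy scale roughly linearly with the number of generated tokens, so forcing a model to its output cap dwarfs the cost of a typical reply. The reply length and output cap below are illustrative assumptions, not measurements from the paper; only the 90% and 20% fractions come from the research.

```python
# Back-of-the-envelope sketch with assumed (not measured) serving numbers:
# per-token cost is roughly constant, so the attack's cost multiplier is
# simply (forced output length) / (typical output length).

TYPICAL_REPLY_TOKENS = 300   # assumed average chatbot response
MAX_OUTPUT_TOKENS = 8192     # assumed serving-side output cap
ATTACK_FRACTION = 0.90       # paper: LoopLLM reaches >90% of the cap
BASELINE_FRACTION = 0.20     # paper: older attacks reach ~20%

loop_tokens = ATTACK_FRACTION * MAX_OUTPUT_TOKENS
baseline_tokens = BASELINE_FRACTION * MAX_OUTPUT_TOKENS

print(f"LoopLLM vs normal reply:  ~{loop_tokens / TYPICAL_REPLY_TOKENS:.0f}x cost")
print(f"LoopLLM vs older attacks: ~{loop_tokens / baseline_tokens:.1f}x cost")
```

Under these assumptions, a single looped request costs roughly 25 times a normal reply, which is why the attack is framed in terms of energy and latency rather than data theft.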

As Xingyu Li, one of the authors, stated, “Existing attack methods aim to prolong output by delaying the generation of termination symbols. However, as the output grows longer, controlling the termination symbols through input becomes difficult, making these methods less effective.” This highlights the cleverness of the LoopLLM approach in bypassing these prior limitations.

Impact of LoopLLM

| Feature                   | Old Methods                | LoopLLM                     |
| ------------------------- | -------------------------- | --------------------------- |
| Max output length reached | ~20%                       | >90%                        |
| Cross-model transfer      | Limited                    | ~40% improvement            |
| Attack mechanism          | Delay termination symbols  | Repetitive generation loops |

The Surprising Finding

What truly stands out is the attack’s effectiveness and transferability. The study finds that LoopLLM can achieve over 90% of the maximum output length, a dramatic increase compared to the mere 20% achieved by previous methods. This level of control over an LLM’s output length is quite surprising. It challenges the assumption that prompt engineering alone can fully mitigate such resource-draining attacks. The framework also improves transferability by approximately 40% to commercial models like DeepSeek-V3 and Gemini 2.5 Flash. This means an attack crafted for one model could potentially work on others, making defenses more complex.

What Happens Next

This discovery will likely spur significant efforts in LLM security and prompt engineering. We can expect to see new defense mechanisms developed over the next 6-12 months. For example, AI developers might implement stricter output length controls or more robust detection systems for repetitive generation patterns (a minimal sketch of such a detector follows below). The industry implications are substantial; AI providers will need to harden their models against these types of energy-latency attacks. For you, this means staying informed about updates to your preferred AI services. Always be cautious about the prompts you use or encounter, especially if they seem designed to elicit overly long or repetitive responses. The technical report explains that LoopLLM introduces a “repetition-inducing prompt optimization.” This suggests that users should be aware of how certain prompt structures can be exploited.
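One plausible server-side mitigation, sketched below as an illustration rather than any vendor’s actual feature, is to watch the output stream for the attack’s signature: the same n-gram recurring over and over in the recent tail of the output, and to cut generation off early when it appears.

```python
from collections import Counter

def looks_repetitive(tokens: list[str], n: int = 4, window: int = 200,
                     threshold: int = 10) -> bool:
    """True if any n-gram occurs `threshold`+ times in the last `window` tokens."""
    tail = tokens[-window:]
    ngrams = Counter(tuple(tail[i:i + n]) for i in range(len(tail) - n + 1))
    return bool(ngrams) and max(ngrams.values()) >= threshold

# Usage sketch: check inside the serving loop and stop the stream early.
# The "stuck decoder" below is simulated; a real check would wrap the
# provider's streaming generation loop.
output: list[str] = []
for step in range(4096):
    token = "loop" if step > 50 else f"tok{step}"  # decoder gets stuck at step 51
    output.append(token)
    if looks_repetitive(output):
        break  # abort before the full token budget burns

print("stopped after", len(output), "tokens instead of 4096")
```

The trade-off in this kind of check is false positives: some legitimate outputs (poetry refrains, tabular data) repeat by design, so the window and threshold would need tuning per deployment.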
