ChaosEater: LLMs Automate Software Resilience Testing

New research introduces ChaosEater, an AI system that fully automates Chaos Engineering using Large Language Models.

A new paper introduces ChaosEater, a system leveraging Large Language Models (LLMs) to fully automate Chaos Engineering. The goal is to make building resilient software systems more accessible and cost-effective. The system targets Kubernetes environments and covers the full cycle, from requirement definition to debugging.

By Sarah Kline

November 16, 2025


Key Facts

ChaosEater is a system that fully automates Chaos Engineering using Large Language Models (LLMs).
It aims to make building resilient software systems accessible and low-cost.
ChaosEater targets software systems built on Kubernetes.
The system automates the entire Chaos Engineering (CE) cycle, including requirement definition, code generation, testing, and debugging.
Evaluations show ChaosEater completes CE cycles at low time and monetary cost.

Why You Care

Ever worried about your favorite app crashing at the worst possible moment? What if software could proactively find and fix its own weaknesses before you even noticed a problem? New research from Daisuke Kikuta and his team introduces ChaosEater, a system designed to make software more resilient. This work could improve the stability of the digital tools you rely on daily.

What Actually Happened

Chaos Engineering (CE) is a technique for improving the resilience of distributed systems. It involves intentionally injecting faults to uncover weaknesses, according to the paper. Traditionally, planning these experiments and improving systems based on the results has been manual and labor-intensive. The paper proposes ChaosEater, a system that automates the entire CE cycle. It uses Large Language Models (LLMs), AI models capable of understanding and generating human-like text, to manage this complex process. ChaosEater specifically targets software systems built on Kubernetes, a widely used platform for managing containerized workloads.
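To make the idea of a CE cycle concrete, here is a minimal Python sketch of the general loop: state a steady-state hypothesis, inject a fault, check whether the hypothesis still holds, and improve the system if it does not. This is a toy simulation, not ChaosEater's implementation. It uses no Kubernetes and no LLM, and every name in it (ToySystem, steady_state_holds, run_ce_cycle) is hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ToySystem:
    """Hypothetical stand-in for a deployed service (not a real Kubernetes object)."""
    replicas: int = 1

    def healthy_replicas(self, killed: int) -> int:
        # Replicas still serving after `killed` of them are taken down.
        return max(self.replicas - killed, 0)


def steady_state_holds(system: ToySystem, killed: int) -> bool:
    """Steady-state hypothesis (assumed): at least one replica keeps serving."""
    return system.healthy_replicas(killed) >= 1


def run_ce_cycle(system: ToySystem, pods_to_kill: int = 1, max_rounds: int = 5) -> list:
    """Inject a fault, check the hypothesis, and improve the system until it passes."""
    log = []
    for round_no in range(1, max_rounds + 1):
        # Fault injection is simulated by passing the number of killed replicas.
        if steady_state_holds(system, pods_to_kill):
            log.append(f"round {round_no}: hypothesis holds with {system.replicas} replicas")
            return log
        log.append(f"round {round_no}: hypothesis violated with {system.replicas} replicas")
        # Improvement step: add redundancy, then rerun the experiment.
        system.replicas += 1
    log.append("stopped: hypothesis still violated after max_rounds")
    return log


if __name__ == "__main__":
    for line in run_ce_cycle(ToySystem(replicas=1)):
        print(line)
```

Starting from a single replica, the first round fails because killing one replica leaves nothing serving; the loop adds a replica and the second round passes. In a real setup, each of these steps would be far harder to carry out by hand, which is the manual work the paper aims to automate.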

Why This Matters to You

Imagine your favorite streaming service never buffering or your banking app always working, even during peak traffic. That’s the promise of more resilient software. ChaosEater automates tasks like defining requirements, generating code, testing, and debugging. This means developers can build stronger systems without needing extensive, specialized knowledge. Your digital experiences could become much smoother and more reliable.

For example, consider a small startup building a new online service. Without ChaosEater, they might lack the resources or expertise for thorough resilience testing. With this new LLM-powered Chaos Engineering approach, they can ensure their service is resilient from the start. This significantly lowers the barrier to entry for creating reliable applications.

“To address these challenges and enable anyone to build resilient systems at low cost, this paper proposes ChaosEater, a system that automates the entire CE cycle with Large Language Models (LLMs),” the team revealed. This automation offers significant benefits:

Reduced Time Costs: Faster identification and resolution of vulnerabilities.
Lower Monetary Costs: Less need for highly specialized, expensive human expertise.
Increased Accessibility: More developers can build resilient software.
Enhanced Reliability: Systems are proactively hardened against failures.

How much more reliable could your daily digital life become with systems like ChaosEater at work?

The Surprising Finding

What’s truly surprising is ChaosEater’s ability to consistently complete reasonable CE cycles at low time and monetary cost, the study finds. This challenges the common assumption that comprehensive resilience testing requires extensive human effort and specialized skills. The system’s cycles were also qualitatively validated by both human engineers and other LLMs. This dual validation supports the claim that the generated cycles are sound. It suggests that AI can take on highly complex engineering tasks that were once considered exclusively human domains. This could fundamentally change how software quality assurance is approached in the future.

What Happens Next

While the research is promising, the widespread adoption of LLM-powered Chaos Engineering will likely unfold over the next few years. The paper was accepted at the ASE 2025 NIER (New Ideas and Emerging Results) Track, which is reserved for early-stage, forward-looking work. We might see initial integrations into developer toolkits by late 2025 or early 2026. For example, a cloud provider could offer automated resilience testing as a service, powered by ChaosEater’s principles. Developers should start exploring how LLMs can assist in their testing pipelines now. The industry implications are vast, potentially leading to a new standard for software reliability. This could free up human engineers to focus on more creative and complex problem-solving, rather than repetitive testing tasks.
