New AI Method Evaluates Legal Text Style

CLASE combines linguistic features and LLM insights for accurate Chinese legalese evaluation.

A new hybrid evaluation method called CLASE has been developed to assess the stylistic quality of Chinese legal texts generated by large language models (LLMs). It aims to bridge the gap between factual accuracy and adherence to specialized legal writing norms, providing interpretable scores and improvement suggestions.

By Sarah Kline

February 16, 2026

4 min read


Key Facts

CLASE is a hybrid evaluation method for Chinese legalese stylistic evaluation.
It combines linguistic feature-based scores and experience-guided LLM-as-a-judge scores.
The method was tested on 200 Chinese legal documents.
CLASE achieves higher alignment with human judgments than traditional and pure LLM-as-a-judge methods.
It provides interpretable score breakdowns and suggestions for improvement.

Why You Care

Ever wondered why AI-generated legal documents often sound… off? Even if the facts are right, the tone and style can miss the mark. This isn’t just a minor issue; it can undermine trust and clarity in essential legal contexts. A new method called CLASE (Chinese LegAlese Stylistic Evaluation) aims to fix this for Chinese legal texts. Why should you care? Because if AI can’t master legal style, its utility in complex fields remains limited, impacting your future interactions with AI-powered services.

What Actually Happened

Researchers have introduced CLASE, a hybrid evaluation method for assessing the stylistic quality of Chinese legal text. According to the announcement, large language models (LLMs) often generate legal text that is factually accurate but fails to meet the specialized stylistic norms of legal writing. CLASE combines two scoring mechanisms: linguistic feature-based scores and experience-guided LLM-as-a-judge scores. Both components learn from authentic legal documents and their LLM-restored counterparts. The result, as the announcement details, is a transparent and reference-free way to evaluate legal style.
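To make the idea of a hybrid score concrete, here is a minimal sketch of how a feature-based score and an LLM-judge score could be blended into one number with a readable breakdown. It is not the CLASE implementation: the feature names, the equal weighting, the 0-to-1 scale, and the 0.6 threshold are all assumptions chosen for illustration, and the judge score is passed in as a plain number rather than produced by a real model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StyleReport:
    """Breakdown of a hypothetical hybrid style score (all values in 0..1)."""
    feature_scores: dict[str, float]
    judge_score: float
    combined: float
    suggestions: list[str] = field(default_factory=list)


def evaluate_style(feature_scores, judge_score, judge_weight=0.5, threshold=0.6):
    """Blend per-feature scores with a judge score (illustrative only).

    judge_weight and threshold are assumed values, not taken from CLASE.
    """
    if not feature_scores:
        raise ValueError("at least one feature score is required")
    if not 0.0 <= judge_weight <= 1.0:
        raise ValueError("judge_weight must be between 0 and 1")

    # Average the linguistic feature scores into one component.
    feature_avg = sum(feature_scores.values()) / len(feature_scores)

    # Weighted blend of the two components.
    combined = (1 - judge_weight) * feature_avg + judge_weight * judge_score

    # Flag every feature that falls below the threshold.
    suggestions = [
        f"Improve '{name}' (score {score:.2f})"
        for name, score in sorted(feature_scores.items())
        if score < threshold
    ]
    return StyleReport(dict(feature_scores), judge_score, combined, suggestions)


# Hypothetical feature names and scores, for illustration only.
report = evaluate_style(
    {"formal_vocabulary": 0.9, "sentence_structure": 0.4},
    judge_score=0.7,
)
print(f"combined: {report.combined:.3f}")
for tip in report.suggestions:
    print(tip)
```

Keeping the per-feature scores in the report, rather than only the blended number, is what makes this kind of output interpretable: a reader can see which component pulled the score down.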

Why This Matters to You

Imagine you’re a lawyer or a legal professional. You rely on AI tools for drafting documents. If those documents don’t sound professional, they reflect poorly on your work. CLASE addresses this by focusing on stylistic performance. The method provides interpretable score breakdowns. It also gives suggestions for improvement, as the paper states. This means you can understand why a text isn’t up to par and how to fix it.

Key Benefits of CLASE:

Higher Alignment: Achieves greater agreement with human judgments than traditional metrics.
Interpretability: Offers clear score breakdowns, showing specific areas for improvement.
Scalability: Provides a practical approach for professional stylistic evaluation.
Reference-Free: Does not require a ‘reference’ text for comparison.

For example, if an AI drafts a contract, CLASE could flag overly casual language or incorrect legal phrasing. This allows you to refine the AI’s output with precision. How much more confident would you be in AI-generated legal content if you knew it passed a rigorous stylistic check?

“CLASE provides interpretable score breakdowns and suggestions for improvements, offering a practical approach for professional stylistic evaluation in legal text generation,” the team revealed.

The Surprising Finding

Here’s the twist: evaluating legal style manually has traditionally been impractical, because implicit stylistic requirements are hard for experts to formalize into explicit rules. Existing automatic methods also fall short. Reference-based metrics mix semantic accuracy with stylistic fidelity, while LLM-as-a-judge evaluations can be opaque and inconsistent. Despite these challenges, CLASE achieves substantially higher alignment with human judgments than both traditional metrics and pure LLM-as-a-judge methods, according to experiments on 200 Chinese legal documents. This challenges the assumption that only human experts can truly grasp the nuances of legal style. It suggests a hybrid AI approach can learn these subtleties effectively.
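The article does not say how alignment with human judgments was measured. One common way to check whether an automatic metric ranks texts the way people do is a rank correlation, and the sketch below computes Spearman's coefficient in plain Python. The choice of Spearman is an assumption, and the scores are made-up illustrative numbers, not results from the CLASE experiments.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for five texts, for illustration only.
human = [4, 2, 5, 1, 3]
metric_a = [0.8, 0.3, 0.9, 0.1, 0.5]    # orders the texts exactly like the humans
metric_b = [0.5, 0.6, 0.4, 0.7, 0.55]   # orders them in reverse

print(f"metric_a vs human: {spearman(human, metric_a):.2f}")
print(f"metric_b vs human: {spearman(human, metric_b):.2f}")
```

A coefficient near 1 means the metric orders texts much as the human raters do; a value near 0 or below means it does not.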

What Happens Next

The introduction of CLASE marks a significant step for AI in legal tech. We can expect to see further development and adoption of such hybrid evaluation models. The code and data for CLASE are already available, which should encourage broader research and application. Within the next 12-18 months, similar methods might emerge for other languages and specialized domains. Imagine a future where AI not only drafts your initial legal documents but also refines them to match professional standards. For example, law firms could integrate CLASE-like tools into their document review processes to ensure higher quality and consistency. Our advice to you: keep an eye on these developments. Understanding these evaluation tools will be crucial for anyone working with AI-generated content in professional settings. This could reshape how legal documents are produced worldwide.
