New AI Method Boosts Text-to-SQL Accuracy for LLMs

Researchers introduce SPFT-SQL, a novel approach to enhance large language models' ability to convert natural language into database queries.

A new research paper details SPFT-SQL, a method that significantly improves how large language models (LLMs) translate everyday language into SQL queries. This innovation addresses a key challenge in AI, making database interactions more intuitive and accurate for users.

By Katie Rowan

September 10, 2025

4 min read

Key Facts

  • SPFT-SQL is a new self-play fine-tuning method for Text-to-SQL tasks.
  • It addresses challenges faced by previous methods like SPIN in Text-to-SQL.
  • SPFT-SQL uses a verification-based iterative fine-tuning approach.
  • It employs an error-driven loss method during self-play to improve accuracy.
  • The method was tested on six open-source LLMs and five benchmarks, outperforming state-of-the-art approaches.

Why You Care

Ever wished you could just ask your database a question in plain English and get a precise answer? Imagine telling your computer, “Show me all sales from last quarter for products over $100,” and it instantly delivers the exact data. This isn’t a distant dream anymore. New research is making this a reality. It promises to transform how we interact with vast amounts of information. What if you could unlock your data’s full potential just by speaking naturally?

What Actually Happened

Researchers have unveiled a new method called SPFT-SQL. It enhances large language models (LLMs) for the complex task of Text-to-SQL parsing: converting natural language questions into structured database queries (SQL). Previous methods, like self-play fine-tuning (SPIN), faced challenges in this setting, according to the announcement. SPIN struggled because it did not generate new information, and too many correct SQL queries from an 'opponent' model could actually reduce the main model's accuracy, the paper states.

SPFT-SQL tackles these issues head-on with a two-phase approach. First, a verification-based iterative fine-tuning step synthesizes high-quality training data, building a strong base for the model. Second, during self-play, an 'error-driven loss' method encourages the opponent model to produce incorrect outputs, which helps the main model learn to distinguish correct SQL from errors, as detailed in the blog post. Together, these steps improve the LLM's ability to generate accurate SQL.
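To make the first phase concrete, here is a minimal sketch of what execution-based verification of synthesized training data could look like. It assumes a SQLite database and a set of hypothetical candidate queries; the paper's actual pipeline also synthesizes data from the database schema and fine-tunes iteratively, which this sketch does not cover.

```python
import sqlite3

def verify_candidates(conn, question, candidate_sqls, gold_sql):
    """Keep only candidate SQL whose execution result matches the gold query.

    Illustrative sketch of execution-based verification: candidates that fail
    to run, or that return different rows than the gold query, are discarded
    rather than added to the training data.
    """
    gold_rows = set(conn.execute(gold_sql).fetchall())
    verified = []
    for sql in candidate_sqls:
        try:
            if set(conn.execute(sql).fetchall()) == gold_rows:
                verified.append((question, sql))
        except sqlite3.Error:
            continue  # malformed SQL is simply dropped
    return verified
```

For example, given a toy `sales` table, a candidate that returns the same rows as the gold query survives the filter, while an over-broad query and a syntactically broken one do not.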

Why This Matters to You

This development holds significant implications for anyone working with data. Think about how much time you spend crafting precise SQL queries. SPFT-SQL could drastically reduce this effort and make data access more democratic: you won't need to be a SQL expert to retrieve complex information. Imagine you are a marketing manager who needs to analyze customer demographics across different regions. Instead of asking an IT specialist, you could simply type your request, and the system would generate the correct SQL and pull the report instantly. This allows you to focus on analysis, not syntax. How much more efficient could your workday become with this capability?

Here’s how SPFT-SQL enhances LLMs for Text-to-SQL:

  • Verification-based Iterative Fine-Tuning: Synthesizes high-quality training data based on database schema and validation feedback. This creates a solid foundation before self-play begins.
  • Error-Driven Loss Method: Incentivizes incorrect outputs from an opponent model. This helps the main model learn to differentiate between correct and erroneous SQL.
  • Improved Accuracy: The overall process significantly boosts the LLM’s ability to generate accurate SQL queries.
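Putting the pieces above together, one self-play round could be sketched as the following skeleton. The interfaces here are hypothetical stand-ins: `main_generate` and `opponent_generate` represent the two models' decoding steps, and `is_correct` represents an execution-based check; the paper's actual training loop is more involved.

```python
def self_play_round(main_generate, opponent_generate, is_correct, questions):
    """Collect (question, correct SQL, erroneous SQL) pairs for training.

    Illustrative skeleton: the error-driven idea is that only opponent
    outputs that are actually wrong become negative examples, so the main
    model learns to tell correct SQL apart from plausible-looking errors.
    """
    training_pairs = []
    for q in questions:
        main_sql = main_generate(q)
        opp_sql = opponent_generate(q)
        if is_correct(q, main_sql) and not is_correct(q, opp_sql):
            training_pairs.append((q, main_sql, opp_sql))
    return training_pairs
```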

One of the authors, Yuhao Zhang, explained the core challenge. He said, “Despite the significant advancements of self-play fine-tuning (SPIN), which can transform a weak large language model (LLM) into a strong one through competitive interactions between models of varying capabilities, it still faces challenges in the Text-to-SQL task.” This new method directly addresses those challenges. It aims to make LLMs more reliable for database interactions. Your daily tasks involving data could become much simpler and faster.

The Surprising Finding

Here’s the twist: The researchers found that too many correct SQL queries during the self-play process could actually hinder the main model’s performance. This seems counterintuitive. You would expect more good examples to always be better. However, the study finds that “the large number of correct SQL queries produced by the opponent model during self-play reduces the main model’s ability to generate accurate SQL queries.” This is surprising because it challenges the common assumption that more examples lead to better learning. Instead, the team revealed that by introducing an ‘error-driven loss’ method, the main model learns to distinguish correct SQL from erroneous SQL. This makes the model more robust. It’s like learning to spot a fake by studying forgeries, not just originals. This approach helps the model become more discerning and ultimately more accurate.
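The contrastive idea behind an error-driven loss can be sketched with a simple pairwise formulation: push the main model to score the correct SQL above the opponent's deliberately erroneous SQL. This is a generic preference-style loss for illustration only, not the paper's exact objective; the inputs are assumed to be the main model's log-probabilities for each sequence.

```python
import math

def error_driven_pair_loss(logp_correct, logp_erroneous):
    """Pairwise contrastive loss (illustrative): -log sigmoid(margin).

    The loss shrinks as the main model assigns higher probability to the
    correct SQL relative to the opponent's erroneous SQL, so training on
    (correct, erroneous) pairs teaches the model to tell them apart.
    """
    margin = logp_correct - logp_erroneous
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

With a zero margin the loss is log 2, and it decreases monotonically as the correct query is scored ever higher than the erroneous one.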

What Happens Next

This research, presented at EMNLP 2025 Findings, suggests a promising future. We can expect to see these Text-to-SQL advancements integrated into various AI tools. Companies might begin incorporating SPFT-SQL-like techniques into their enterprise software by late 2025 or early 2026. For example, business intelligence platforms could allow users to generate complex reports using natural language. Database administrators might use these tools to automate routine query generation. For you, this means more intuitive data exploration. Keep an eye out for updates from major AI providers. They will likely adopt similar methodologies. This will make interacting with databases as simple as having a conversation. The industry implications are clear: more accessible data for everyone, regardless of their technical background. This could fundamentally change how businesses operate and make decisions.
