New 'Four Over Six' Method Boosts LLM Training Accuracy

Researchers introduce an NVFP4 quantization update for improved large language model performance.

A new method called 'Four Over Six' (4/6) significantly enhances the accuracy of NVFP4 quantization for large language models. This innovation prevents training divergence and improves inference, especially on NVIDIA Blackwell GPUs, making LLM development more efficient.

By Sarah Kline

December 2, 2025

4 min read

Key Facts

  • The 'Four Over Six' (4/6) method modifies the NVFP4 quantization algorithm.
  • NVFP4 is a low-precision numerical format used for speed and memory benefits in LLMs.
  • The 4/6 method evaluates two potential scale factors for each block of values to improve accuracy.
  • It prevents training divergence and improves downstream accuracy in LLMs.
  • The method can be efficiently implemented on NVIDIA Blackwell GPUs.

Why You Care

Ever wonder why your favorite AI chatbot sometimes struggles, or why training these complex models takes so much computing power? What if there were a way to make large language models (LLMs) both faster and more accurate? A new technique, called ‘Four Over Six’ (4/6), promises to do just that for LLM training and deployment, according to the announcement. This could mean more reliable AI experiences for you.

What Actually Happened

Researchers have introduced ‘Four Over Six’ (4/6), a significant modification to the NVFP4 quantization algorithm, as detailed in the blog post. NVFP4 is a low-precision numerical format. It’s often used to accelerate large language models, providing speed and memory benefits. However, using NVFP4 can lead to problems like training divergence and performance degradation during inference. The new 4/6 method addresses these issues directly. It evaluates two potential scale factors for each block of values, improving how data is represented.

This improvement is particularly relevant for matrix multiplication operations, which are crucial for both the forward and backward passes of LLM training. The team revealed that 4/6 can be efficiently implemented on NVIDIA Blackwell GPUs, making it a viable approach for current and future LLM development.
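The core idea described above can be sketched in a few lines of NumPy. This is a hypothetical illustration rather than the authors' implementation: the function names are invented, and the grid shown is the standard set of positive magnitudes representable in FP4 (E2M1). The sketch tries two candidate scale factors per block, one mapping the block maximum to 6 (the conventional choice) and one mapping it to 4, and keeps whichever gives the lower quantization error:

```python
import numpy as np

# Positive magnitudes representable in FP4 (E2M1)
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x, scale):
    """Divide a block by `scale`, round each value to the nearest
    FP4 magnitude (keeping sign), then rescale back."""
    sign = np.sign(x)
    mag = np.abs(x) / scale
    nearest = np.argmin(np.abs(mag[:, None] - FP4_GRID[None, :]), axis=1)
    return sign * FP4_GRID[nearest] * scale

def four_over_six_block(x):
    """Illustrative 4/6-style selection: evaluate two candidate scale
    factors per block and keep the one with lower squared error."""
    amax = np.abs(x).max()
    if amax == 0.0:
        return x.copy()
    best, best_err = None, np.inf
    for target in (6.0, 4.0):   # map block max to 6 (standard) or to 4
        scale = amax / target
        q = quantize_fp4(x, scale)
        err = np.sum((q - x) ** 2)
        if err < best_err:
            best, best_err = q, err
    return best
```

By construction, the selected result is never worse than always scaling the block maximum to 6, which matches the claim that 4/6 improves on standard NVFP4 rounding without changing the underlying format.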

Why This Matters to You

This technical advancement might sound complex, but its implications for you are straightforward. Imagine an AI assistant that understands your requests with fewer errors. Think of it as upgrading from a blurry photo to a crisp, high-definition image. The ‘Four Over Six’ method improves the underlying precision of how LLMs process information.

This means more stable training for developers. It also leads to more accurate and reliable AI models in the long run. The research shows that 4/6 prevents training divergence in several cases and brings training loss significantly closer to BF16 (a higher-precision format) compared to existing NVFP4 methods. How might more accurate and stable AI models change your daily interactions with technology?

“As large language models have grown larger, low-precision numerical formats such as NVFP4 have become increasingly popular due to the speed and memory benefits they provide,” the paper states. This new method makes those benefits more accessible without sacrificing quality. What’s more, the company reports that 4/6 can be easily incorporated into many different post-training quantization methods. It generally improves downstream accuracy for various applications.

Key Benefits of Four Over Six (4/6)

  • Prevents Training Divergence: Ensures more stable and successful LLM training.
  • Improves Inference Accuracy: Leads to more reliable predictions and responses from deployed models.
  • Efficient on Blackwell GPUs: Ready for implementation on modern NVIDIA hardware.
  • Broader Compatibility: Easily integrates with existing post-training quantization techniques.

The Surprising Finding

Here’s an interesting twist: the research team found that floating-point formats like FP4 suffer the most quantization error on near-maximal values within each data block. This challenges the intuitive idea that errors are spread evenly across all value ranges; instead, the study finds these near-maximal values are primarily responsible for downstream performance degradation. The team also discovered that scaling so the block maximum maps to a smaller FP4 value can make the distribution of representable values more uniform for some blocks, significantly improving the representation of those critical near-maximal values. This insight is central to the effectiveness of the 4/6 method.
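The uneven spacing that drives this finding is visible in the FP4 (E2M1) grid itself. A minimal check (illustrative only) shows that the gap between adjacent representable values widens toward the maximum, so values landing near the top of a block face the coarsest rounding:

```python
import numpy as np

# Positive magnitudes representable in FP4 (E2M1)
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

# Spacing between adjacent representable values:
# 0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 2.0
gaps = np.diff(FP4_GRID)

# The widest gap (2.0) sits between 4 and 6, so near-maximal values
# incur the largest worst-case rounding error in the block.
```

Mapping the block maximum to 4 instead of 6 avoids that widest interval entirely, which is one way to read the name ‘Four Over Six’.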

What Happens Next

The introduction of ‘Four Over Six’ signals a positive direction for LLM development. We can expect to see this method integrated into future AI training recipes within the next 6-12 months. For example, AI developers training new conversational agents or complex data analysis tools will likely adopt 4/6 to ensure higher accuracy and stability. The technical report explains that this could lead to more capable models across industries such as healthcare, finance, and customer service. The authors hope this work inspires future efforts in training and deploying models with NVFP4. For you, this means potentially faster, more reliable, and more intelligent AI applications appearing in your daily life. Keep an eye out for updates from major AI infrastructure providers, who will likely incorporate these advancements into their platforms soon.
