AI System ELM Automates Cancer Diagnosis, Saves 900 Hours

A new hybrid AI model significantly improves tumor classification in cancer registries, boosting efficiency.

A novel AI system, ELM, combines small and large language models to automate tumor group classification from pathology reports. This innovation has reduced manual review by 60-70% at the British Columbia Cancer Registry, saving an estimated 900 person-hours annually while maintaining high accuracy.

By Katie Rowan

March 21, 2026

Key Facts

ELM is a hybrid AI system for automated tumor group classification.
It combines small, encoder-only language models with large language models (LLMs).
ELM has been deployed at the British Columbia Cancer Registry.
It reduced manual review requirements by 60-70%, saving 900 person-hours annually.
The system achieved a weighted precision and recall of 0.94 on a test set of 2,058 reports.

Why You Care

Imagine a world where AI helps doctors diagnose cancer faster and more accurately. What if this system could also free up medical staff from tedious, time-consuming tasks? A new development in artificial intelligence (AI) is doing just that, directly affecting how cancer data is managed. It could mean quicker insights into cancer trends and better patient outcomes. Your health data, or that of your loved ones, could be processed with greater efficiency and precision.

What Actually Happened

Researchers have introduced ELM (Ensemble of Language Models), a hybrid AI system designed to automate tumor group classification. This system processes unstructured pathology reports, a task traditionally requiring extensive manual effort, according to the announcement. Population-based cancer registries (PBCRs) typically spend about 900 person-hours annually classifying 100,000 reports. Existing rule-based systems struggle with the complex language found in these medical documents. ELM combines smaller, specialized language models with larger, more general ones. The system uses an ensemble of six fine-tuned ‘encoder-only’ models, which are smaller AI programs focused on understanding specific parts of text. These models analyze different sections of each report. If at least five of the six models agree on a tumor group, that group is assigned. Otherwise, a large language model (LLM) steps in to make the final decision, guided by a carefully selected prompt.
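The routing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `classify_report`, the `llm_arbiter` callback, and the example labels are all hypothetical, and only the "five of six" threshold comes from the article.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# From the article: at least five of the six encoder-only models must agree.
AGREEMENT_THRESHOLD = 5


def classify_report(
    encoder_votes: Sequence[str],
    llm_arbiter: Callable[[], str],
    threshold: int = AGREEMENT_THRESHOLD,
) -> Tuple[str, str]:
    """Return (tumor_group, decided_by) for one pathology report.

    encoder_votes: one predicted tumor group per encoder-only model.
    llm_arbiter:   hypothetical callback that asks the LLM for a label;
                   it is only invoked when the ensemble does not agree.
    """
    if not encoder_votes:
        raise ValueError("at least one encoder vote is required")

    # Find the most common label and how many models voted for it.
    top_label, top_count = Counter(encoder_votes).most_common(1)[0]

    if top_count >= threshold:
        return top_label, "ensemble"

    # No sufficient agreement: defer the final decision to the LLM.
    return llm_arbiter(), "llm"


if __name__ == "__main__":
    agreed = ["lymphoma"] * 5 + ["leukemia"]
    split = ["lymphoma"] * 4 + ["leukemia"] * 2
    print(classify_report(agreed, lambda: "leukemia"))
    print(classify_report(split, lambda: "leukemia"))
```

Passing the LLM as a callback means the expensive model is consulted only for reports the ensemble cannot settle, which is the cost-saving idea behind the design as the article describes it.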

Why This Matters to You

This new approach offers significant practical implications for healthcare and data management. ELM has already been deployed at the British Columbia Cancer Registry, as mentioned in the release. The authors report that it has reduced manual review requirements by approximately 60-70%. This translates to substantial time savings, estimated at 900 person-hours each year. This efficiency gain is achieved while maintaining rigorous data quality standards. Think of it as having an expert assistant who can quickly sift through mountains of complex medical text. For example, if you or a family member has a cancer diagnosis, accurate and timely data classification is crucial for research and public health initiatives. This system helps ensure that vital information is categorized correctly and efficiently. How might these time savings allow medical professionals to focus more on patient care or research?

Here are some key performance improvements ELM delivered:

Weighted Precision and Recall: 0.94 (statistically significant improvement)
Leukemia F1-score: Improved from 0.76 to 0.88
Lymphoma F1-score: Improved from 0.76 to 0.89
Skin Cancer F1-score: Improved from 0.44 to 0.58

Lovedeep Gondara, one of the authors, stated, “ELM represents the first successful deployment of a hybrid small, encoder only models-LLM architecture for tumor group classification in a real-world PBCR setting.” This highlights the practical success of combining different AI model types.

The Surprising Finding

What’s particularly interesting about ELM is its hybrid architecture and its real-world success. It defies the common assumption that only the largest language models can handle complex medical text. The study finds ELM achieves high accuracy by strategically combining smaller, specialized models with a larger one for arbitration. This ‘ensemble’ approach allows the system to maximize text coverage, even with token limits (the maximum amount of text an AI can process at once). The team revealed that ELM achieved weighted precision and recall of 0.94 on a test set of 2,058 pathology reports. This is a statistically significant improvement (p<0.001) over encoder-only ensembles, which had an F1-score of 0.91. It also substantially outperforms older rule-based methods. This shows that a clever combination of AI tools can sometimes be more effective than relying on a single, massive model.
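For readers unfamiliar with the "weighted" metrics quoted above, here is a small sketch of how they are commonly computed. It assumes the standard support-weighted average (the same definition scikit-learn uses for `average="weighted"`); the article does not spell out the exact formula, and the labels below are toy data, not results from the study.

```python
from collections import Counter
from typing import Sequence, Tuple


def weighted_precision_recall(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[float, float]:
    """Average per-class precision and recall, weighted by class support."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    support = Counter(y_true)    # how many true examples each class has
    predicted = Counter(y_pred)  # how many times each class was predicted
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = 0.0
    recall = 0.0
    for label, n in support.items():
        # Precision is taken as 0 when a class is never predicted.
        class_precision = correct[label] / predicted[label] if predicted[label] else 0.0
        class_recall = correct[label] / n
        weight = n / total
        precision += weight * class_precision
        recall += weight * class_recall
    return precision, recall


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = ["lymphoma", "lymphoma", "lymphoma", "skin"]
    guess = ["lymphoma", "lymphoma", "skin", "skin"]
    print(weighted_precision_recall(truth, guess))
```

Because each class is weighted by how often it appears, common tumor groups dominate the overall score, which is why the per-group F1-scores listed earlier (for example, skin cancer) can sit well below the headline figure.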

What Happens Next

The successful deployment of ELM at the British Columbia Cancer Registry sets a precedent for other healthcare institutions. We can expect to see similar hybrid AI architectures adopted in other PBCRs over the next 12-24 months. The technical report explains that this model demonstrates how strategic combination of language models can achieve both high accuracy and operational efficiency. Imagine this system being applied to other areas of medical record keeping, such as classifying disease severity or treatment responses. This could further streamline administrative tasks and improve data quality across the board. For you, this means potentially faster processing of medical information, leading to more efficient healthcare systems. Our advice for organizations is to explore how hybrid AI models can address their specific data classification challenges. The industry implications are clear: AI is moving beyond single-model solutions towards more integrated, specialized systems for complex tasks.