Datumo Raises $15.5M to Challenge Scale AI in LLM Evaluation, Backed by Salesforce

A Seoul-based startup secures significant funding to expand its AI data evaluation services, aiming to address critical safety and responsibility concerns in generative AI.

Seoul-based Datumo has raised $15.5 million in funding, with Salesforce as a key backer, positioning itself to compete directly with industry leader Scale AI. The company focuses on evaluating Large Language Models (LLMs) to ensure safe and responsible generative AI deployment, a growing concern for organizations.

August 12, 2025

4 min read

Why You Care

If you're a content creator, podcaster, or anyone building with AI, the quality and safety of your AI models are paramount. Datumo's recent $15.5 million funding round, backed by Salesforce, signals a significant push to improve how Large Language Models (LLMs) are evaluated, directly impacting the reliability and ethical use of the AI tools you depend on.

What Actually Happened

Seoul-based Datumo, formerly known as SelectStar, has successfully raised $15.5 million in funding. This capital injection is intended to bolster its efforts in LLM evaluation, setting the stage for a direct challenge to established players like Scale AI. According to the announcement, Datumo's CEO, David Kim, a former AI researcher at Korea's Agency for Defense Development, founded the company in 2018 with five KAIST alumni. His initial frustration stemmed from the laborious nature of data labeling, leading him to conceptualize a reward-based app where individuals could label data for compensation. Even before the app was fully built, Datumo secured pre-contract sales totaling tens of thousands of dollars during a startup competition at KAIST (Korea Advanced Institute of Science and Technology).

The company reported surpassing $1 million in revenue within its first year and has since secured contracts with major Korean enterprises, including Samsung, LG Electronics, Hyundai, Naver, and SK Telecom. As clients began requesting services beyond simple data labeling, Datumo evolved into more complex AI evaluation work.

Why This Matters to You

For content creators and AI enthusiasts, this development is crucial because it addresses the growing need for reliable and safe generative AI. Many organizations, as reported in the source material, are not fully prepared to use generative AI responsibly. Datumo's focus on LLM evaluation means more reliable tools for assessing AI outputs, which translates to fewer biases, more accurate content generation, and ultimately, more trustworthy AI assistants for your creative workflows. Imagine an AI that generates podcast scripts or marketing copy that is not only coherent but also factually sound and ethically aligned. This funding could accelerate the development of better evaluation metrics and platforms, providing you with clearer insights into the performance and limitations of the LLMs you integrate into your work. It's about building confidence in AI, knowing that the underlying models have undergone rigorous, independent scrutiny. This can help mitigate risks associated with misinformation or biased outputs, which are significant concerns for anyone publishing content.

The Surprising Finding

One surprising aspect of Datumo's journey is its origin: a reward-based app for data labeling. CEO David Kim's initial idea was to gamify data labeling, allowing anyone to contribute and earn money. This grassroots approach to data collection and annotation, validated at a KAIST startup competition, contrasts with the typical top-down, enterprise-focused strategies often seen in the AI infrastructure space. The company's ability to secure significant pre-contract sales and surpass $1 million in revenue within its first year, even before the app was fully built, underscores the strong market demand for data labeling and, subsequently, for complex AI evaluation services. This organic growth from a seemingly simple concept into a multi-million-dollar venture challenging industry giants highlights the latent demand for flexible and scalable data solutions, a demand that has now evolved into complex LLM evaluation.

What Happens Next

With this $15.5 million injection, Datumo is poised to expand its operations and refine its LLM evaluation capabilities. We can expect increased competition in the AI evaluation market, potentially leading to more innovative and accessible tools for assessing AI model performance, safety, and ethical compliance. The involvement of Salesforce as a backer suggests a strategic interest from a major enterprise software provider in ensuring the quality and reliability of AI, which could lead to tighter integrations of evaluation tools within broader AI development platforms. For content creators, this means a future where the 'black box' of AI becomes a little more transparent, with clearer standards and benchmarks for what constitutes a 'good' or 'responsible' AI. Over the next 12-24 months, look for new partnerships, improved evaluation frameworks, and perhaps even more accessible tools that allow users to directly assess the quality of AI models before deployment. This push towards better evaluation is an essential step in making generative AI truly enterprise-ready and trustworthy for a wider range of applications.