AI Learns to Judge Art: Multimodal LLMs Grasp Aesthetics

New research shows AI can move beyond simple visual appeal to understand deeper artistic meaning.

Multimodal Large Language Models (MLLMs) are learning to evaluate art with human-like aesthetic judgment, according to new research. By suppressing 'hallucinations' with an evidence-based approach, MLLMs can provide in-depth artistic reasoning. This development could transform AI art tutoring and image generation.

By Katie Rowan

September 16, 2025

Key Facts

  • Multimodal LLMs (MLLMs) can reason about aesthetics in a zero-shot manner.
  • Without guidance, MLLMs tend to 'hallucinate' during aesthetic reasoning, producing subjective opinions and unsubstantiated interpretations.
  • An evidence-based approach, ArtCoT, suppresses these hallucinations.
  • ArtCoT helps MLLMs produce in-depth aesthetic reasoning aligned with human judgment.
  • Applications include AI art tutoring and reward models for image generation.

Why You Care

Ever wondered if an AI could truly ‘get’ art, not just generate pretty pictures? Can a machine appreciate beauty like you do? New research reveals that Multimodal Large Language Models (MLLMs) are making strides in understanding art’s deeper meaning. This could dramatically change how you interact with AI art and even how AI learns about creativity.

What Actually Happened

Researchers Ruixiang Jiang and Changwen Chen have explored how MLLMs can perform aesthetic judgment. Their paper, “Multimodal LLMs Can Reason about Aesthetics in Zero-Shot,” investigates aesthetic judgment that goes beyond mere visual appeal, a dimension that current computational methods often overlook. The team describes a method for eliciting aesthetic reasoning from MLLMs, models that combine different data types, such as text and images, to understand context. According to the paper, MLLMs can, when properly guided, produce in-depth aesthetic analysis that aligns with human judgment.

Why This Matters to You

This new capability for AI to reason about aesthetics has practical implications. Imagine an AI art tutor that doesn’t just correct brushstrokes but explains the emotional impact of your color choices. Think of it as having an art critic in your pocket, offering nuanced feedback. This could significantly enhance your creative process.

However, an essential challenge emerged during their research. “MLLMs exhibit a tendency towards hallucinations during aesthetic reasoning, characterized by subjective opinions and unsubstantiated artistic interpretations,” the paper states. In other words, the AI might invent reasons for its judgments. To counter this, the researchers developed ArtCoT, a baseline that promotes evidence-based reasoning. This principle helps MLLMs produce more objective and multifaceted evaluations.
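To make the idea of evidence-based reasoning concrete, here is a minimal sketch of a staged prompting loop in which the model must first record what is visible before it interprets or judges. This is not the paper's ArtCoT implementation: the stage wording, the function name `evidence_first_critique`, and the `ask_model` callable (a stand-in for whatever MLLM client you use) are all assumptions made for illustration.

```python
from typing import Callable, List

# Hypothetical stage instructions; the paper's actual prompts are not reproduced here.
STAGES: List[str] = [
    "Describe only what is visibly present: composition, color, line, texture. Do not judge.",
    "Using only the observations above, explain how each element affects the work's aesthetic quality.",
    "Give a final assessment, citing the specific observations that support it.",
]


def evidence_first_critique(ask_model: Callable[[str], str], artwork_ref: str) -> List[str]:
    """Run the stages in order, feeding each answer into the next prompt.

    `ask_model` is an assumed interface: it takes a prompt string and returns
    the model's reply. `artwork_ref` is any label or path identifying the image.
    """
    transcript: List[str] = []
    context = f"Artwork: {artwork_ref}"
    for instruction in STAGES:
        prompt = f"{context}\n\nTask: {instruction}"
        answer = ask_model(prompt)
        transcript.append(answer)
        # Later stages see every earlier answer, so interpretations
        # have concrete observations to point back to.
        context = f"{context}\n\nPrevious step: {answer}"
    return transcript
```

Because each later prompt carries the earlier observations, the final assessment has something specific to cite, which is the general intuition behind grounding a judgment in evidence rather than opinion.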

What kind of art could you create or analyze with such a tool?

Application Area | Benefit for You
AI Art Tutoring | Receive deeper, more meaningful feedback
Image Generation | Create AI art with more human-like aesthetic appeal
Art Analysis | Get objective, in-depth critiques of artwork
Creative Inspiration | Discover new perspectives on your creations

The Surprising Finding

Here’s the twist: while MLLMs can reason about aesthetics, they initially struggle with accuracy. The research shows that these models tend to ‘hallucinate’, generating subjective opinions or artistic interpretations without factual basis. This was a significant hurdle. However, the study finds that these hallucinations can be suppressed: by following an evidence-based, objective reasoning process, as implemented in the proposed ArtCoT baseline, MLLMs can provide more reliable aesthetic judgments. This challenges the common assumption that AI’s creative interpretations are inherently flawed or unreliable, and it suggests a path to more credible AI-driven art criticism.

What Happens Next

This research, presented at ACM MM 2025, paves the way for future developments. These improved MLLMs could be integrated into various applications within the next 12 to 18 months. For example, future AI image generators could use these aesthetic reasoning models as ‘reward models.’ This means the AI would learn to create images that are not just visually appealing but also possess deeper artistic impact, aligning with human aesthetic values. For you, this could mean AI-generated art that truly resonates. The researchers say the work aims for AI systems that can genuinely understand, appreciate, and contribute to art. Your future interactions with AI could involve much more artistic dialogue. This will likely influence how artists use AI tools and how we all perceive AI’s creative potential.
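One simple way a reward model can steer image generation is best-of-n selection: generate several candidates, score each, and keep the highest-scoring one. The sketch below illustrates only that selection step. The function name `best_of_n` and the `aesthetic_score` callable are hypothetical, and nothing here reflects how the researchers or any particular image generator would actually wire this up.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def best_of_n(candidates: Sequence[T], aesthetic_score: Callable[[T], float]) -> Tuple[T, float]:
    """Return the candidate with the highest score, together with that score.

    `aesthetic_score` is an assumed interface standing in for an aesthetic
    reasoning model that maps an image (or any candidate) to a number.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # Score every candidate once, then pick the pair with the largest score.
    scored = [(candidate, aesthetic_score(candidate)) for candidate in candidates]
    return max(scored, key=lambda pair: pair[1])
```

Reward models can also be used during training rather than only at selection time; reranking is shown here because it is the easiest version to read.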
