Why You Care
Ever feel like your AI assistant just isn’t getting it, even after you’ve explained things multiple times? What if most AI models aren’t as ‘smart’ at learning from you as we thought? New research has unveiled a significant challenge for Large Multimodal Models (LMMs): their ability to truly understand and integrate human feedback. This directly impacts how useful and intuitive your AI tools can be in the future.
What Actually Happened
Researchers have introduced a new framework called InterFeedback, according to the announcement. The framework is designed to autonomously assess the ‘interactive intelligence’ of LMMs, meaning an AI’s capacity to refine its responses based on user input and feedback. The team also built InterFeedback-Bench, which uses the MMMU-Pro and MathVerse datasets to evaluate ten different open-source LMMs. What’s more, they created InterFeedback-Human, a dataset of 120 cases specifically for manually testing leading models, as mentioned in the release. This comprehensive approach aims to fill a gap in existing benchmarks, which often overlook this crucial interactive aspect.
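To make the setup concrete, here is a minimal sketch of what such an automated feedback loop could look like. Everything below is illustrative: `query_model` and `make_feedback` are hypothetical placeholders, not the paper’s actual API, and the real framework’s loop may differ in its details.

```python
# A minimal sketch of the kind of feedback loop InterFeedback automates.
# All names here (query_model, make_feedback) are hypothetical placeholders;
# the paper's actual interfaces and prompts are not given in this article.

def make_feedback(wrong_answer: str) -> str:
    # Simplest form of feedback: a bare correctness signal. Human testers
    # (as in InterFeedback-Human) could supply richer hints instead.
    return f"Your answer '{wrong_answer}' is incorrect. Please try again."

def evaluate_case(query_model, question: str, ground_truth: str,
                  max_rounds: int = 3) -> dict:
    """Query the model, check its answer, and loop feedback back in
    until it answers correctly or the round budget runs out."""
    conversation = [question]
    for round_num in range(max_rounds):
        answer = query_model(conversation)
        if answer.strip() == ground_truth:
            # round_num == 0: solved without needing any feedback.
            # round_num  > 0: the model successfully used feedback.
            return {"solved": True, "rounds_used": round_num}
        conversation.append(make_feedback(answer))
    return {"solved": False, "rounds_used": max_rounds}
```

Scoring then amounts to running `evaluate_case` over every benchmark item and asking how often feedback actually turned a wrong answer into a right one.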
Why This Matters to You
This research has practical implications for anyone using or developing AI. Imagine trying to teach an AI a new skill, or refine its output for a creative project. If the AI struggles to interpret your corrections, your experience becomes frustrating. The study highlights that current LMMs, even leading ones, are not yet adept at this crucial interaction.
Think of it as trying to explain a complex recipe to a new chef. If they don’t adjust their technique after your suggestions, the meal won’t improve. This is similar to how LMMs are currently performing with human feedback.
So, what does this mean for your daily interactions with AI? It suggests that while LMMs are impressive, their ability to learn dynamically from your input is still developing. “Our evaluation results indicate that even the state-of-the-art LMM, OpenAI-o1, struggles to refine its responses based on human feedback, achieving an average score of less than 50%,” the paper states. This finding underscores the need for better feedback-integration mechanisms in AI systems. Do you find yourself repeating instructions to your AI assistant more often than you’d like?
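To see what an “average score of less than 50%” could mean in practice, consider one plausible way to aggregate the per-case results from the sketch above: of the cases a model initially gets wrong, what fraction does it fix after feedback? This metric is an illustrative assumption, not the paper’s exact scoring formula.

```python
def correction_rate(results: list[dict]) -> float:
    """Of the cases the model got wrong on its first try, what fraction
    did it fix after feedback? `results` holds evaluate_case() outputs
    from the sketch above. Illustrative metric, not the paper's exact one."""
    initially_wrong = [r for r in results
                       if not r["solved"] or r["rounds_used"] > 0]
    if not initially_wrong:
        return 1.0  # perfect first-try accuracy; nothing to correct
    fixed = sum(1 for r in initially_wrong if r["solved"])
    return fixed / len(initially_wrong)

# A model that repairs 9 of 20 initially wrong answers scores 0.45, under 50%.
```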
InterFeedback-Bench Evaluation Details
| Dataset | Purpose | Models |
| --- | --- | --- |
| MMMU-Pro | General interactive intelligence | 10 open-source LMMs |
| MathVerse | Mathematical reasoning with feedback | 10 open-source LMMs |
| InterFeedback-Human | Manual testing of leading models | OpenAI-o1, Claude-3.5-Sonnet |
The Surprising Finding
Here’s the twist: despite the impressive capabilities of today’s Large Multimodal Models, their interactive intelligence is surprisingly low. The research shows that even models like OpenAI-o1 performed poorly when tasked with refining responses based on human input. Specifically, the team revealed that OpenAI-o1 scored less than 50% on average when trying to incorporate feedback. This challenges the common assumption that AI inherently learns well from user corrections. We might expect a model capable of generating complex text and images to easily adapt to simple feedback. However, the study finds that interpreting and effectively using feedback is a distinct and underdeveloped skill for these models. This suggests a significant gap between an LMM’s generation abilities and its interactive learning capacity.
What Happens Next
This research points to a clear direction for future AI development. The team’s findings “point to the need for methods that can enhance LMMs’ capabilities to interpret and benefit from feedback,” as the paper concludes. Expect more focus on improving how Large Multimodal Models process and integrate user input over the next 12-18 months, with developers exploring new training methodologies and architectural changes to close the gap. Imagine, for example, a design tool whose AI assistant genuinely learns your aesthetic preferences after just a few corrections, rather than needing constant re-instruction. For you, this means future AI assistants should become more intuitive and less frustrating to use. In the meantime, keep giving your AI tools specific, clear feedback, even when they don’t seem to ‘get it’ immediately; that interaction data is exactly what developers need. The industry implication is a shift toward truly adaptive, user-centric AI systems, moving beyond impressive output generation to genuine interactive intelligence.
