Why You Care
Ever wish your online meetings or virtual events felt more personal and responsive? Imagine an AI assistant that truly understands what’s on your screen and converses naturally. Kaltura’s recent acquisition of eSelf for $27 million promises to make this a reality, according to the announcement. This deal could significantly change how you interact with video content and virtual agents.
What Actually Happened
Kaltura, a Nasdaq-listed company known for its cloud-based video solutions, has acquired eSelf in a $27 million deal. The acquisition, as detailed in the blog post, brings eSelf’s virtual agent system into Kaltura’s video platform. eSelf was co-founded in 2023 by CEO Alan Bekker and CTO Eylon Shoshan. Bekker previously sold his first startup, Voca, to Snap in 2020. eSelf brings deep technical expertise in speech-to-video generation and low-latency speech recognition. It also offers screen understanding, allowing avatars to interpret and respond to on-screen content. The eSelf co-founders and their team of approximately 15 AI experts will join Kaltura, the company reports.
Why This Matters to You
This acquisition means more intelligent and interactive video experiences for everyone. Kaltura plans to integrate eSelf.ai’s virtual agent system across its video offerings, as mentioned in the release. This integration aims to enable agents that can listen, speak, and interpret user screens in real time. Think of it as having a super-smart virtual assistant that truly understands your context. For example, imagine a customer service bot that not only hears your question but also sees the error message on your screen and guides you through troubleshooting visually. How might this impact your daily work or learning experiences?
Ron Yekutiel, co-founder and CEO of Kaltura, emphasized the strategic importance of this acquisition. “This acquisition was so strategic. We were actively evaluating multiple companies to find the right fit,” Yekutiel said in an interview with TechCrunch. He added, “We determined that they [eSelf] were for real-time, synchronous conversation — not just video-on-demand lip-syncing — and that they had an impressive speech-to-text and text-to-speech system stack.” This focus on real-time, synchronous conversation is crucial for creating truly natural interactions.
Here’s a breakdown of what eSelf’s system brings:
| Capability | Description |
| --- | --- |
| Speech-to-Video | Generates video content directly from spoken words. |
| Low-Latency Speech | Processes spoken language with minimal delay for real-time interaction. |
| Screen Understanding | Allows AI avatars to ‘see’ and respond to content on a user’s screen. |
| Real-time Conversation | Enables natural dialogue, not just pre-recorded responses. |
The Surprising Finding
What’s particularly interesting is the emphasis on real-time, synchronous conversation over simple lip-syncing for video-on-demand. Many might assume AI video agents are primarily about making pre-recorded content look more natural. However, the team revealed that eSelf excels in live, interactive dialogue. Yekutiel said Kaltura judged eSelf the right fit “for real-time, synchronous conversation,” not just video-on-demand lip-syncing. This challenges the common assumption that AI video is mostly about enhancing existing video files. Instead, the focus is on creating dynamic, on-the-fly interactions. This capability is far more complex than merely animating a mouth to match audio. It suggests a future where AI agents are active participants in live digital interactions.
What Happens Next
Kaltura plans to integrate eSelf’s system across its extensive video offerings. This integration will likely unfold over the next 12-18 months, with initial features possibly appearing in early 2026. Imagine your next virtual corporate town hall featuring an AI assistant that can summarize questions in real time or pull up relevant data as you speak. The company reports that all eSelf employees are joining Kaltura, ensuring a smooth transition of expertise. This move positions Kaltura to offer more AI-powered video solutions to its more than 800 enterprise customers. These clients include tech giants like Amazon, Oracle, and IBM, as well as leading universities, as the company reports. For users, this means more engaging and efficient virtual environments. Your interactions with online platforms could become significantly more intuitive and personalized in the near future.
