Tolan's Voice AI Leaps Forward with GPT-5.1 for Natural Chats

Portola unveils a voice-first AI companion, Tolan, designed for engaging, open-ended conversations powered by OpenAI's latest model.

Tolan, a new voice-first AI companion, is leveraging GPT-5.1 to create highly responsive and context-aware conversations. Developed by Portola, this AI focuses on stable personalities and low latency, moving beyond simple text prompts to natural dialogue.

By Mark Ellison

January 8, 2026

4 min read

Key Facts

  • Tolan is a voice-first AI companion built by Portola.
  • It leverages OpenAI's GPT-5.1 models for low latency, accurate context, and stable personalities.
  • Tolan is designed for open-ended, ongoing dialogue, not just quick prompts.
  • The system rebuilds its context window from scratch each turn for real-time adaptation.
  • GPT-5.1 and the Responses API cut speech initiation time by over 0.7 seconds.

Why You Care

Ever wished your AI assistant could truly understand you, even when you change topics mid-sentence? Imagine a digital companion that remembers your past conversations and maintains a consistent personality. This isn’t just science fiction anymore; it’s the reality Tolan is building. What if your voice interactions with AI felt as natural as talking to a friend?

OpenAI recently highlighted how Portola, a team with a strong track record, developed Tolan. This voice-first AI companion aims to redefine how we interact with artificial intelligence. It promises more engaging, open-ended, and personalized voice experiences for you.

What Actually Happened

Portola has launched Tolan, a voice-first AI companion, significantly enhanced by OpenAI’s GPT-5.1 models, according to the announcement. Tolan allows people to converse with a personalized, animated character that learns over time. The company reports that this app is specifically designed for ongoing, open-ended dialogue, moving beyond quick prompts and replies. Quinten Farmer, co-founder and CEO of Portola, stated, “We saw the rise of ChatGPT and knew voice was the next frontier.”

Voice AI presents unique challenges, particularly regarding latency (the delay in response time) and context management. However, it also enables richer, more exploratory interactions than text-based systems. The team focused on two essential areas: memory and character design. They built a character-driven universe with real-time context management. This system ensures personality and memory remain consistent as conversations evolve, as mentioned in the release.

Why This Matters to You

This development means your interactions with AI could get a lot smoother and more personal. Tolan’s architecture addresses the core demands of voice interactions. Voice users expect fast, natural responses, even when conversations shift unexpectedly. Tolan aims to respond quickly, track changing topics, and maintain a consistent personality without lag or changes in tone.

For example, imagine you’re discussing a recipe with Tolan, then suddenly remember you need to pick up milk. You can seamlessly transition to asking about grocery stores nearby without the AI getting confused or losing track of your previous conversation. This adaptation is crucial for a natural feel.

How often have you felt frustrated by an AI that forgets what you just said? Tolan’s approach tackles this directly. According to the announcement, introducing OpenAI GPT-5.1 and the Responses API cut speech initiation time by over 0.7 seconds. This reduction is enough to noticeably improve the conversational flow for you.

Quinten Farmer emphasized the importance of this advancement, stating, “GPT-5.1 gave us the steerability to finally express the characters we had in mind. It wasn’t just smarter—it was more faithful to the tone and personality we wanted to create.” This steerability allows for more nuanced and consistent AI characters.

Here’s how Tolan manages context for natural conversations:

  • Context Reconstruction: Each turn, Tolan rebuilds its context window from scratch.
  • Information Pulled: This includes a summary of recent messages, a persona card, vector-retrieved memories, tone guidance, and real-time app signals.
  • Adaptability: This architecture allows Tolan to adapt in real time to abrupt topic shifts.
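
As a rough illustration of the per-turn rebuild described above, the following Python sketch assembles a fresh context from the five listed inputs. It is not Portola's implementation: the function names, the section labels, the two-dimensional toy embeddings, and the cosine-similarity retrieval are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    """A stored memory with a toy embedding (hypothetical structure)."""
    text: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_memories(query: list[float], memories: list[Memory], k: int) -> list[str]:
    """Return the texts of the k memories most similar to the query embedding."""
    ranked = sorted(memories, key=lambda m: cosine(query, m.embedding), reverse=True)
    return [m.text for m in ranked[:k]]


def build_context(persona_card: str, recent_summary: str, memories: list[Memory],
                  query: list[float], tone_guidance: str,
                  app_signals: dict[str, str], k: int = 1) -> str:
    """Assemble the whole context from scratch; nothing is carried over from the last turn."""
    retrieved = retrieve_memories(query, memories, k)
    sections = [
        ("Persona", persona_card),
        ("Recent conversation", recent_summary),
        ("Relevant memories", "\n".join(f"- {text}" for text in retrieved)),
        ("Tone", tone_guidance),
        ("App signals", ", ".join(f"{key}={value}" for key, value in sorted(app_signals.items()))),
    ]
    return "\n\n".join(f"[{title}]\n{body}" for title, body in sections)


# Toy data: axis 0 stands for "cooking", axis 1 for "shopping" (assumed for illustration).
memories = [
    Memory("User is learning to bake sourdough.", [1.0, 0.1]),
    Memory("User usually shops at the corner market.", [0.1, 1.0]),
]

# Turn 1: the user is talking about a recipe.
turn_1 = build_context("Warm, curious companion.", "Discussing a bread recipe.",
                       memories, [1.0, 0.0], "Playful", {"local_time": "18:05"})

# Turn 2: the user abruptly switches to buying milk; the context is rebuilt, not appended to.
turn_2 = build_context("Warm, curious companion.", "User remembered they need milk.",
                       memories, [0.0, 1.0], "Playful", {"local_time": "18:06"})

print(turn_2)
```

Because each turn starts from an empty context, the second call retrieves the shopping memory instead of the baking one, while the persona and tone sections stay the same. A production system would use a real embedding model and a vector store in place of the toy vectors.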

The Surprising Finding

What truly stands out is Tolan’s unique approach to context management. Unlike many AI agents that simply cache (store temporarily) previous prompts across multiple turns, Tolan rebuilds its entire context window from scratch each time. This is a technically intensive process, as detailed in the blog post, yet it’s foundational to Tolan’s success in natural voice interactions.

This finding is surprising because it challenges the common assumption that simply storing more past data is always better. Instead, Tolan prioritizes a dynamic, real-time reconstruction. Farmer explained, “We realized quickly that cached prompts just didn’t cut it. Users change subjects all the time.” The system, he added, had to adapt midstream. This dynamic approach lets Tolan keep up when users abruptly change topics, which is common in natural speech.

What Happens Next

Looking ahead, more voice-first AI applications may adopt similar context management techniques. Portola’s work with Tolan indicates a clear path for future AI development. The industry implications are significant, pushing towards more human-like and less frustrating digital interactions.

For example, imagine future educational tools or customer service bots that can follow complex, winding conversations without losing their way. This approach could lead to more effective and personalized learning experiences. Users might see these voice AI features rolling out in various applications within the next 12 to 18 months, according to industry trends.

For you, this means a future where your voice AI understands you better. As new voice AI applications emerge, try them out and pay attention to how well they handle context and maintain personality. These factors will define the next generation of voice interaction.
