DESAMO: Smart Homes for Elders with Audio LLM

A new system uses embedded AI to understand elderly users' unique audio needs.

Researchers have developed DESAMO, an on-device smart home system specifically designed for elderly users. It leverages an embedded Audio LLM to process raw audio, improving interaction and detecting critical events like falls, overcoming limitations of traditional voice assistants.

By Sarah Kline

August 27, 2025



Key Facts

DESAMO is an on-device smart home system for elder-friendly use.
It uses an embedded Audio LLM to process raw audio input directly.
The system can understand user intent and detect critical events like falls or calls for help.
It addresses the trouble conventional ASR-based voice assistants have with unclear speech.
DESAMO was accepted for presentation as a UIST 2025 Poster.

Why You Care

Imagine a smart home system that truly understands your grandparents, even if their speech isn’t perfectly clear. How much peace of mind would that give you? A new system called DESAMO promises just that. It is designed to make smart homes more accessible and safer for elderly individuals. It tackles common issues older users face with traditional voice assistants, with the goal of making your loved ones’ lives easier and more secure.

What Actually Happened

Researchers Youngwon Choi, Donghyuk Jung, and Hwayeon Kim have introduced DESAMO, short for ‘Device for Elder-Friendly Smart Homes Powered by Embedded LLM with Audio Modality,’ according to the announcement. It’s an on-device smart home system. Unlike typical voice assistants, DESAMO feeds raw audio input directly to an Audio Large Language Model (LLM) rather than converting speech to text first. The system aims to provide natural and private interactions, according to the researchers. It also handles non-speech audio, which is a significant improvement over current technologies.

Traditional voice assistants often struggle with unclear speech. They rely on Automatic Speech Recognition (ASR) pipelines or ASR-LLM cascades, which can fail when elderly users speak unclearly, the paper states. DESAMO’s direct audio processing allows for a more reliable understanding of user intent. It can also detect critical events, such as falls or calls for help, the technical report explains.
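To make the contrast concrete, here is a toy sketch of why a transcript step can lose information that a direct audio model keeps. Everything in it is an assumption made for illustration: the AudioClip fields, the clarity threshold, the label strings, and the three toy_ functions are hypothetical stand-ins, not DESAMO’s code or any real model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioClip:
    """Toy stand-in for raw audio (hypothetical fields, not a real audio format)."""
    speech: Optional[str]      # words spoken, or None if the clip has no speech
    clarity: float             # 0.0 (unintelligible) to 1.0 (clear); made-up scale
    background: Optional[str]  # a notable non-speech sound such as "thud", or None


def toy_asr(clip: AudioClip) -> str:
    """Hypothetical ASR step: yields text only for clear speech, drops the rest."""
    if clip.speech is None or clip.clarity < 0.5:
        return ""
    return clip.speech


def toy_text_llm(text: str) -> str:
    """Hypothetical text-only model: it can only use what the transcript contains."""
    if "help" in text:
        return "event:call_for_help"
    if "light" in text:
        return "intent:lights_on"
    return "unknown"


def asr_cascade(clip: AudioClip) -> str:
    """ASR followed by a text model: non-speech sounds never reach the second stage."""
    return toy_text_llm(toy_asr(clip))


def toy_audio_llm(clip: AudioClip) -> str:
    """Hypothetical audio model: receives the whole clip, including non-speech sounds."""
    if clip.background == "thud":
        return "event:possible_fall"
    if clip.speech and "help" in clip.speech:
        return "event:call_for_help"
    if clip.speech and "light" in clip.speech:
        return "intent:lights_on"
    return "unknown"


fall = AudioClip(speech=None, clarity=0.0, background="thud")
mumbled = AudioClip(speech="turn on the light", clarity=0.3, background=None)

print(asr_cascade(fall), "|", toy_audio_llm(fall))
print(asr_cascade(mumbled), "|", toy_audio_llm(mumbled))
```

In this toy, the cascade returns "unknown" for both the fall and the mumbled request because the transcript is empty, while the direct path still has the original signal to work with. A real Audio LLM would of course have to earn that advantage through training; the sketch only shows where the information is lost.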

Why This Matters to You

This system has direct benefits for you and your family. If you have elderly relatives, you know the challenges of ensuring their safety and comfort. DESAMO addresses these specific needs. It offers a more reliable way for them to interact with smart home devices. Think of it as a highly attentive digital companion. It doesn’t just hear words; it understands sounds.

For example, imagine your elderly parent slips and falls. A conventional system might not register their distressed cries. DESAMO, however, is designed to pick up on such non-speech audio cues. This capability could trigger an alert to you or emergency services. This provides a crucial layer of safety. What if smart homes could truly anticipate needs, not just respond to commands?
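The alerting idea in that scenario could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the researchers’ design: the "kind:name" label format, the CRITICAL_EVENTS names, and the Home class are all hypothetical, and the sketch records what would be sent instead of contacting real devices or people.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical label names for sounds that should raise an alert.
CRITICAL_EVENTS = {"possible_fall", "call_for_help"}


@dataclass
class Home:
    """Records outgoing actions so the routing logic can be inspected safely."""
    device_commands: List[str] = field(default_factory=list)
    alerts: List[str] = field(default_factory=list)


def handle(label: str, home: Home) -> None:
    """Route one model output, written as 'kind:name', to a command or an alert."""
    kind, _, name = label.partition(":")
    if kind == "event" and name in CRITICAL_EVENTS:
        # A critical event notifies a caregiver; here it is only recorded.
        home.alerts.append(f"ALERT: {name}")
    elif kind == "intent" and name:
        # An ordinary request becomes a device command.
        home.device_commands.append(name)
    # Any other label is ignored rather than guessed at.


home = Home()
for output in ["intent:lights_on", "event:possible_fall", "unknown"]:
    handle(output, home)

print(home.device_commands)
print(home.alerts)
```

Keeping alerts and device commands on separate paths is a design choice of this sketch: it lets a safety event be handled even when no command was spoken.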

“While conventional voice assistants rely on ASR-based pipelines or ASR-LLM cascades, often struggling with the unclear speech common among elderly users and unable to handle non-speech audio, DESAMO leverages an Audio LLM to process raw audio input directly, enabling an understanding of user intent and critical events, such as falls or calls for help,” the abstract reveals.

Here’s how DESAMO improves on current systems:

Enhanced Speech Understanding: Better handles unclear or soft speech.
Non-Speech Audio Detection: Recognizes sounds like falls or calls for help.
On-Device Processing: Improves privacy by keeping data local.
Natural Interaction: Makes using smart home features simpler for elders.

The Surprising Finding

Here’s the twist: the most surprising aspect of DESAMO is that it processes raw audio input directly, bypassing the typical ASR step entirely. Most people assume voice assistants must convert speech to text first. The researchers argue that, for elder-friendly smart homes, the direct approach works better. It allows for a more reliable understanding of user intent, and it can detect critical events like falls or calls for help.

This challenges the common assumption that higher accuracy in speech recognition is the only path to better voice assistants. Instead, DESAMO suggests that drawing context and meaning from raw audio can be more beneficial, especially for users with unique vocal patterns. This direct audio processing is a significant departure from current industry norms. It opens up new possibilities for accessibility.

What Happens Next

DESAMO was accepted for presentation as a poster at UIST 2025, which indicates it is still in the research and development phase. More detailed findings, and possibly prototypes, may emerge in late 2025 or early 2026. This initial research suggests a promising future for elder-friendly smart homes.

For example, future versions might integrate with existing smart home platforms. Imagine a DESAMO unit seamlessly connecting to your smart lights, thermostats, and security cameras. This could create a truly responsive environment for your loved ones. Companies developing smart home systems should take note. They might consider adopting similar direct audio processing methods. This could vastly improve the user experience for older adults.

As the team revealed, the system is designed for natural and private interactions. This focus on privacy, combined with enhanced understanding, could set a new standard. It offers actionable advice for developers: prioritize raw audio processing for vulnerable user groups. The industry implications are clear: a shift towards more empathetic and context-aware AI for home assistance.
