
Gradium AI CEO Says Voice AI Still Far From 'Her' Level

Gradium AI chief says voice assistants are still far from the human-like conversation shown in the film Her, outlining current limits and future focus.

Alex Mercer/3 min/GB

Senior Tech Correspondent

Neil Zeghidour on Voice AI's 'Her' Moment


Source: Startuphub

Gradium AI’s Neil Zeghidour warns that voice AI has not yet reached the fluid, empathetic dialogue shown in the movie *Her*.

Context

At an AI Engineer gathering, Zeghidour framed the industry’s ambition as a “Her moment” – a reference to the film where an operating system engages in deep, human‑like conversation. He noted that while voice assistants can transcribe speech and generate spoken responses, they lack the nuanced understanding and emotional resonance of a real person.

Key Facts

- Gradium AI’s mission is to make natural voice the primary way people interact with artificial intelligence, unlocking capabilities that remain underused.
- The company builds and trains voice models for three core tasks: speech‑to‑text (converting spoken words into written form), text‑to‑speech (producing spoken output from text), and speech‑to‑speech translation (changing spoken language from one tongue to another while preserving voice characteristics).
- Zeghidour emphasized that current models still stumble on context retention, tone adaptation, and genuine empathy, all elements essential for a “Her‑level” experience.

What It Means

Gradium AI’s focus on foundational voice technology suggests the industry will see incremental improvements rather than a sudden breakthrough. By strengthening the building blocks of speech‑to‑text, text‑to‑speech, and speech‑to‑speech translation, the company aims to create voice agents that feel more fluid and less robotic. However, Zeghidour’s caution signals that developers and investors should temper expectations about near‑term, fully conversational AI.

The next milestone will be measuring how well voice models handle multi‑turn dialogue and emotional cues. Watch for announcements on advanced context‑aware architectures and partnerships that could bring voice AI closer to the human‑like interaction long promised by the “Her” vision.

