Google has launched Gemini 3.5 Live Translate, a real-time audio translation system that works across more than 70 languages. The model is designed to translate spoken language on the fly while maintaining key aspects of the original speaker's delivery.
Real-Time Translation with Tone Preservation
The model automatically detects which language is being spoken and translates continuously without waiting for a sentence to finish. Google says it also preserves the speaker's tone, pace, and pitch during translation, making the output sound more natural. That would be an advance over traditional translation systems, which often produce flat or robotic speech.
Wide Availability Across Platforms
Gemini 3.5 Live Translate is now available for developers through the Gemini Live API and Google AI Studio. Businesses can access a preview of the feature within Google Meet. For regular users, the translation capability is rolling out in the Google Translate app on both Android and iOS.
In Google Meet, language support has expanded from just five languages to more than 70, offering over 2,000 possible language combinations. This expansion makes real-time translation far more accessible in video meetings.
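The "over 2,000" figure is consistent with counting each pair of languages once, regardless of translation direction. The short check below is only a sketch: it assumes exactly 70 languages (the announcement says "more than 70"), and the counting method is an assumption, since the source does not say how combinations are tallied.

```python
from math import comb, perm

# Assumption: exactly 70 supported languages (the article says "more than 70").
NUM_LANGUAGES = 70

# Unordered pairs: English<->Spanish is counted once.
unordered_pairs = comb(NUM_LANGUAGES, 2)

# Ordered pairs: English->Spanish and Spanish->English are counted separately.
ordered_pairs = perm(NUM_LANGUAGES, 2)

print(unordered_pairs)  # 2415
print(ordered_pairs)    # 4830
```

Under these assumptions, unordered pairs give 2,415, which fits "over 2,000"; counting each direction separately would give 4,830.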
Business Testing and Early Adoption
Ride-hailing company Grab is reportedly testing the model to help translate conversations between drivers and passengers. This use case highlights the potential for real-time translation in service industries where language barriers are common. Google has not confirmed whether Grab has officially deployed the feature, but the testing indicates interest from commercial partners.
Watermarking AI-Generated Audio
All audio generated by Gemini 3.5 Live Translate carries an inaudible SynthID watermark. SynthID is a technology developed by Google DeepMind that embeds a digital watermark into AI-generated content, allowing it to be identified later. This watermark is not audible to the human ear but can be detected by software tools. The approach is part of Google's broader effort to label AI-produced material transparently.
Background on Gemini
Gemini is Google's family of multimodal AI models, capable of processing text, images, audio, and video. The 3.5 version represents a recent iteration that focuses on improving speed and efficiency for real-time applications like live translation. Google has invested heavily in translation technology for years through Google Translate, which already supports over 100 languages for text translation. The addition of real-time voice translation with tone and pitch preservation is a step toward more natural communication across language divides.
Google's announcement comes as competition among AI translation tools intensifies. Other companies, including Microsoft and Meta, have also released translation models. However, Gemini 3.5 Live Translate's combination of automatic language detection, continuous translation, and tone preservation distinguishes it from earlier efforts.
The launch of Gemini 3.5 Live Translate could have significant implications for international business, travel, and cross-cultural communication. By lowering the barrier of language in real-time conversation, Google aims to make global interaction simpler and more fluid.
For developers, the availability through the Gemini Live API means they can integrate the feature into their own apps and services. Google AI Studio provides a platform for experimenting with the model before deployment. The Google Meet preview allows enterprise users to test the feature in live meetings, with broader availability expected later.
The Google Translate app update is likely the most visible change for consumers. Users can now start a live conversation and have it translated aloud, with the system handling language detection automatically. The app already supported some voice translation, but the new model brings faster and more natural speech output.

