ReleaseBytes
Gemini Blog

Gemini 3.5 Live Translate offers real-time voice translation

aiengineer
feature announcement

Google has launched Gemini 3.5 Live Translate, an audio model for real-time speech-to-speech translation. It supports more than 70 languages, preserves speaker intonation, and translates continuously, staying just seconds behind the speaker. It is available through the Gemini Live API for developers, in Google Meet for enterprises, and in the Google Translate app for consumers.

Features (5)
  • Gemini 3.5 Live Translate for real-time speech translation

    Gemini 3.5 Live Translate is a new audio model for live speech-to-speech translation that automatically detects over 70 languages. It generates natural-sounding translated speech while preserving the speaker's intonation, pacing, and pitch, operating continuously to remain in sync with the speaker and minimize awkward pauses.

  • Multiple access points for Gemini 3.5 Live Translate

    The model is rolling out across Google products: a public preview for developers via the Gemini Live API and Google AI Studio, a private preview for enterprises in Google Meet starting this month, and availability for general users in the Google Translate app on Android and iOS.

  • Enhanced multilingual capabilities for developers

    The Gemini Live API supports streaming speech processing, handles multilingual inputs without manual configuration, and is robust in noisy environments. Developer platforms such as Agora and LiveKit are integrating the API to simplify building voice translation applications.

  • Improved Google Meet translation experience

    Future updates to Google Meet will use Gemini 3.5 Live Translate to support more than 70 languages, enable conversations across more than 2,000 language combinations, and provide instant access to speech translation, expanding on Meet's previous translation capabilities.

  • New listening mode in Google Translate app

    For Android users, a new 'listening mode' is being rolled out in the Google Translate app. This mode allows users to hear translations directly through their phone's earpiece, providing a private way to receive translations without headphones.
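
The streaming speech processing mentioned under "Enhanced multilingual capabilities for developers" generally means sending audio in small pieces rather than as one file. The sketch below shows only that client-side chunking step. It does not call the Gemini Live API, and the function name `chunk_pcm`, the 16 kHz 16-bit mono format, and the 100 ms chunk size are assumptions chosen for illustration, not values taken from the announcement.

```python
from typing import Iterator


def chunk_pcm(pcm: bytes, sample_rate: int = 16_000, sample_width: int = 2,
              chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield fixed-duration slices of raw mono PCM audio (hypothetical helper)."""
    bytes_per_chunk = sample_rate * sample_width * chunk_ms // 1000
    # Keep every chunk aligned to whole samples.
    bytes_per_chunk -= bytes_per_chunk % sample_width
    if bytes_per_chunk <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(pcm), bytes_per_chunk):
        yield pcm[start:start + bytes_per_chunk]


# One second of 16-bit silence stands in for microphone input.
one_second = bytes(16_000 * 2)
chunks = list(chunk_pcm(one_second))
print(len(chunks), len(chunks[0]))
```

In a real application each chunk would be handed to whatever streaming client is in use as soon as it is captured, which is what lets translated speech follow the speaker by seconds instead of waiting for the full utterance.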

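The "more than 2,000 language combinations" figure for Google Meet can be sanity-checked with simple arithmetic. The announcement does not say how combinations are counted. The sketch below assumes exactly 70 languages (the stated lower bound) and shows two plausible readings: unordered pairs, and directed source-to-target pairs.

```python
from math import comb

languages = 70  # lower bound; the announcement says "over 70"

# Each pair of languages counted once, regardless of direction.
unordered_pairs = comb(languages, 2)

# Source -> target counted separately from target -> source.
directed_pairs = languages * (languages - 1)

print(unordered_pairs, directed_pairs)  # prints: 2415 4830
```

Under these assumptions, the unordered-pair count (2,415) is the reading that sits just above the 2,000 mark.
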
Notes (1)
  • SynthID watermarking for AI-generated audio

    All audio generated by Gemini 3.5 Live Translate is watermarked with SynthID. This imperceptible watermark is embedded directly into the audio output to ensure AI-generated content is detectable and to help prevent misinformation.

Original announcement: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-live-3-5-translate/