Gemini 1.5 Pro, the latest version of Google's AI model, can now hear audio.
Gemini is Google's chatbot, formerly known as Bard, and Gemini 1.5 Pro is the latest version of the underlying model, made available to a limited number of developers in February of this year. The model could already process text, code and video, and it now accepts uploaded audio as well, including the audio track of a video. That lets Gemini 1.5 Pro listen to a recording, analyze it and extract information from it without needing a written transcript.
Support for audio files means Gemini 1.5 Pro can be used to gather information from earnings calls, transcribe recorded interviews, or analyze videos that include audio. The model can handle prompts containing up to 1 hour of video, 11 hours of audio, 30,000 lines of code, or more than 700,000 words in a single prompt.
Google is also offering a public preview of Gemini 1.5 Pro to those with access to Vertex AI, but there is no public beta for general users yet. For now, most people interact with Google's AI through the Gemini chatbot.