Google Gemini Live Enhances Real-Time Conversations with AI Integration

Google’s new Gemini Live feature lets users bring images, files, and videos into real-time conversations. Gemini Live is Google’s answer to ChatGPT’s Advanced Voice Mode. The feature is now available in Germany, including for users of the free Gemini version, after launching for Android users in over 40 languages, German among them, in the fall. With Gemini, users can accomplish tasks, brainstorm, communicate, and organize their thoughts digitally; on Android, several of these steps can now be combined in a single prompt.

At the Samsung Galaxy Unpacked Event, Google introduced the latest updates for Gemini and Gemini Live. The focus is on using multiple apps at once with AI assistance, which saves time. With Gemini Live, you can let your ideas flow, stay informed, or simply have a conversation, and users can now bring images, videos, and files into those conversations. Sissie Hsiao, Google’s Vice President and General Manager for the Gemini App and Speech, explains that this option lets users switch quickly from whatever they are doing on their smartphone to a conversation with Gemini Live to get explanations, background, or additional support.

Hsiao states, “Because Gemini Live was developed for Android, you can easily transition from what you’re doing on your phone to a conversation about it. And starting today, Gemini Live becomes even more versatile, letting you add images, files, and YouTube videos to your conversation.” Users can, for example, have Gemini assess the camera settings behind a photo or ask a question about a relevant YouTube video. The function is initially available on the Samsung Galaxy S24 and S25 series as well as Google Pixel 9 devices, and will roll out to more devices in the coming weeks.
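For developers who want to experiment with this kind of multimodal exchange outside the consumer app, Google’s public Gemini API accepts images alongside text prompts. Below is a minimal sketch using the google-generativeai Python package; the model name and photo path are illustrative assumptions, and the consumer Gemini Live feature itself is not exposed through this API.

```python
# Minimal sketch: asking a Gemini model about the camera settings behind a photo.
# Assumes the google-generativeai package (pip install google-generativeai pillow)
# and an API key from Google AI Studio.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder, not a real key

model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model choice

photo = Image.open("sunset.jpg")  # hypothetical example image
response = model.generate_content(
    [photo, "What camera settings likely produced this photo, and how could I improve them?"]
)
print(response.text)
```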

In the coming months, Android devices with the Gemini App as well as the Samsung Galaxy S25 will gain the option to access Project Astra. Through Project Astra, users can hold AI conversations on Android phones in multiple languages, pull in products such as Lens, Maps, and Search, have the assistant recall earlier conversations, and receive AI responses with lower latency. In a video, Google showcases how these capabilities might enhance a visit to London.

Since spring 2024, Google’s Gemini has allowed integration with other apps such as Google Maps, YouTube, or Google Drive via Extensions, enabling tasks that span services. New is the option to invoke several of these Extensions with a single prompt. Sissie Hsiao explains, “For example, if you’re looking for a list of protein-rich lunch ideas, you can ask Gemini for recipes and then easily save them in a note directly in Samsung Notes or Google Keep.” With the same prompt, you can also search for a specific location on Google Maps and send the results to a contact.

These Multi-Extension Prompts are available for all Gemini users on the web, Android, and iOS. However, the option to call up the Gemini App directly on the screen by holding down the Side Button is limited to Samsung Galaxy S25 devices.

Google introduced further updates for users at the Samsung Galaxy Unpacked Event. These include the new Now Bar for Samsung Galaxy S25 users, which displays live sports scores or Google Maps directions on the lock screen. Google also plans to roll out Deep Research in the Gemini App next week. The feature gives advanced users extensive research summaries from the AI assistant, which can save time, though adopting the information unfiltered can also cause problems.

In the USA, users can additionally connect their smartphones with Galaxy Watch7 LTE smartwatches. An update to the Circle to Search function is also expected, integrating AI Overviews so that users see AI-supported search results with information and links. One-tap actions improve usability for visual searches: phone numbers, email addresses, or URLs recognized on screen come with a matching CTA, such as placing a call or opening the URL.

Google’s various AI updates are powered by the Gemini 2.0 model, which you can already test. It is Google’s most advanced AI model to date, with multimodal capabilities such as native image and audio output. CEO Sundar Pichai stated, “Today, we are excited to introduce the next era of models built for this new agentic age: Gemini 2.0, our most capable model yet. With new advances in multimodality, such as native image and audio output, and native tool use, it will enable us to build new AI agents that bring us closer to our vision of a universal assistant.”
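Gemini 2.0 can be tried through Google’s public API as well. As a rough illustration, the sketch below streams a response chunk by chunk, loosely mirroring the low-latency, conversational behavior described above; the experimental model identifier is an assumption and may change.

```python
# Minimal sketch: streaming a response from a Gemini 2.0 model so text arrives
# incrementally rather than in one block. The identifier "gemini-2.0-flash-exp"
# is an assumption based on the experimental release and may change.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder, not a real key

model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content(
    "Summarize what makes a universal AI assistant useful.",
    stream=True,  # yields partial chunks as they are generated
)
for chunk in response:
    print(chunk.text, end="", flush=True)
```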
