Google Shows What Gemini Omni and Gemini 3.5 Can Do in New Videos
Google released nine demonstration videos of Gemini Omni and Gemini 3.5 following their presentation at Google I/O 2026. We review what they show and what it means for the industry.
Google I/O 2026 put two model names on the table: Gemini Omni and Gemini 3.5. Beyond the on-stage announcements, the Google AI Blog published last Friday, May 29, a collection of nine demonstration videos designed to showcase the models' capabilities in concrete situations, with no slide decks in sight.
The format matters: rather than publishing benchmarks or comparison tables, Google chose to demonstrate use cases captured on video. It is a deliberate communication choice, one that has been standard practice at OpenAI for some time, and Google adopts it openly here.
What the demos show
The nine videos cover a broad range of scenarios. No detailed official transcripts are available beyond the post itself, but the clips demonstrate multimodal capabilities: real-time video understanding, reasoning about images, voice interaction with natural responses, and code generation with visual context. Gemini Omni emerges as the model oriented toward handling multiple input and output modalities simultaneously, while Gemini 3.5 is presented as an update focused on reasoning and precision in complex tasks.
The emphasis on smooth transitions between modalities, moving from voice to text to image within the same conversation, is the thread running through several of the demos. This is the kind of integration that various labs have been promising on paper for months, but which in practice still shows notable friction.
Why it matters and for whom
For teams working with language model integrations, the launch of Gemini Omni is relevant for one concrete reason: if simultaneous multimodal capability works as shown, it reduces the need to chain different specialized models for voice, image, and text. That simplifies architectures and can lower operational costs in complex pipelines.
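The architectural difference can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google's API: every function below is a hypothetical stub that only records which kind of model would be called, and the names (`speech_to_text`, `describe_image`, `answer_from_text`, `multimodal_model`) are invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CallLog:
    """Records which (hypothetical) model endpoints were invoked."""
    calls: List[str] = field(default_factory=list)

    def record(self, name: str) -> None:
        self.calls.append(name)


# --- Chained design: one specialized model per modality (all stubs) ---
def speech_to_text(audio: bytes, log: CallLog) -> str:
    log.record("speech_model")
    return "<transcript>"


def describe_image(image: bytes, log: CallLog) -> str:
    log.record("vision_model")
    return "<image description>"


def answer_from_text(prompt: str, log: CallLog) -> str:
    log.record("text_model")
    return f"answer based on: {prompt}"


def chained_pipeline(audio: bytes, image: bytes, log: CallLog) -> str:
    # Each modality is converted to text first, then handed to a text model.
    transcript = speech_to_text(audio, log)
    caption = describe_image(image, log)
    return answer_from_text(f"{transcript} | {caption}", log)


# --- Unified design: a single multimodal model takes all inputs (stub) ---
def multimodal_model(audio: bytes, image: bytes, log: CallLog) -> str:
    log.record("multimodal_model")
    return "answer based on: <audio + image>"


if __name__ == "__main__":
    chained_log, unified_log = CallLog(), CallLog()
    chained_pipeline(b"audio", b"image", chained_log)
    multimodal_model(b"audio", b"image", unified_log)
    print("chained calls:", chained_log.calls)
    print("unified calls:", unified_log.calls)
```

In the chained version, each extra model adds a network hop, a separate failure mode, and a lossy conversion to text; the unified version collapses those into one call. Whether that translates into lower cost or latency in practice depends on pricing and performance figures that Google has not yet published.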
For developers already working within the Google ecosystem (Vertex AI, Google AI Studio, the Gemini APIs), these demos signal where the stack is headed. This is not neutral information: Google is signaling which types of use cases it wants you to build on its infrastructure.
For the broader ecosystem, including Claude, the move has direct implications: competition in multimodal capabilities intensifies. Anthropic has its own bets in that space, but Google has structural advantages in integration with proprietary hardware (TPUs) and distribution through its mass-market consumer products.
What the demos don't answer
Demonstration videos have one obvious limitation: they show what the product team chose to show, under the conditions it selected. There is no public information yet about real-world latency in production, pricing for access to Gemini Omni, or when these capabilities will be available outside controlled demos.
It is also unclear how substantial an improvement Gemini 3.5 represents over previous versions in mathematical reasoning or coding tasks, the areas where labs compete in the greatest technical detail. For that, we will need to wait for independent evaluations.
Editorial perspective
Nine well-produced videos are a good communication tool, but they are no substitute for technical documentation or real developer access. When those arrive, we will have a clearer picture of whether Gemini Omni changes anything in practice or remains, for now, an appealing promise.