Google I/O 2025: a $249.99 AI agent, video generators and other innovations

ForkLog

10 months ago

On May 20, at its I/O 2025 conference, Google unveiled a raft of new AI products, including an image generator, video tools, a filmmaking app, a translator in Google Meet and more.

$249.99 for Google AI Ultra

Google launched a new AI Ultra plan at $249.99 a month. It provides “the highest level of access” to the company’s AI apps and services. The subscription includes the new Google Veo 3 video generator, the Flow filmmaking app and the powerful Gemini 2.5 Pro Deep Think model (not yet launched).

Google AI Ultra also includes:

higher limits on the NotebookLM and Whisk platforms;
access to the Gemini chatbot in Chrome;
agent tools based on Project Mariner technology;
YouTube Premium;
30 TB of storage across Google Drive, Google Photos and Gmail.

One of the agent tools is Agent Mode. It can browse web pages, conduct research and integrate with Google apps to execute specific tasks. Its launch is expected “soon”.

“Ultra is a programme for those who want to be on the front line of artificial intelligence from Google,” said Josh Woodward, vice president of Google Labs and Gemini.

The AI Ultra subscription is currently available only in the US.

Google joins a growing list of firms launching pricey plans. In December 2024, OpenAI released ChatGPT Pro at $200 a month. In April 2025, AI startup Anthropic set the same price for the top tier of its Max plan.

Veo 3 — video with sound

Veo 3 is a new AI model that generates video together with accompanying audio, such as sound effects, ambient noise and dialogue. The company stressed that its output quality surpasses that of the previous model, Veo 2.

“For the first time we are coming out of the age of silence in video creation. [You can give Veo 3] a prompt for character and environment characteristics and propose dialogue with a description of how it should sound,” said Google DeepMind CEO Demis Hassabis.

cooking up something tasty for tomorrow… pic.twitter.com/wyIRMsXkFG

— Demis Hassabis (@demishassabis) May 19, 2025

The model is available in the Gemini app for subscribers to the Google AI Ultra plan.

Veo 3 likely builds on DeepMind’s earlier work in the field. In June 2024, Google’s AI division began developing AI technology for generating video soundtracks.

Google also presented improvements to Veo 2. The model can now take reference images of characters, scenes, objects and styles to improve consistency. It understands camera motion, can add or remove objects from a clip and can expand frames, for example turning vertical video into horizontal.

The new Veo 2 features will become available on the Vertex AI platform.

Imagen 4 — image generator

Google released Imagen 4, a new AI model for creating images. It can render fine details such as fabrics, water droplets and animal fur, and it handles both photorealistic and abstract styles.

Imagen 4 delivers visuals that pop with richer details, more nuanced color, and better text outputs.

Everyone can make images for free in the Gemini App today: https://t.co/awhPeHZIqm #GoogleIO pic.twitter.com/nnI8ZGIELv

— Google Gemini App (@GeminiApp) May 20, 2025

The model delivers higher-quality results than Imagen 3 and can create illustrations in different aspect ratios at resolutions up to 2K.

“We also put a lot of emphasis on improving text generation and typography, so the model is great for creating slides, invitations or any other materials where you need to combine images and text,” Woodward stressed.

The tool is available in the Gemini app, on the Google Whisk and Vertex AI platforms, and in Google Slides, Vids, Docs and other Google Workspace products.

Flow — film generator

At Google I/O 2025 the company announced Flow, a new AI tool for creating films. It combines three models:

Veo for generating video;
Imagen for generating images;
Gemini for working with text and prompts.

Introducing Flow: a new type of AI filmmaking tool that combines the best of Veo, Imagen and Gemini — built with and for creatives.

Flow helps you maintain character and visual consistency from one clip to the next.

See how emerging filmmakers are using it pic.twitter.com/H0cBv6IGs1

— Google (@Google) May 20, 2025

Flow lets you import characters or scenes, or create these elements directly inside the tool. It offers camera controls for changing angle or perspective, a scene builder and asset-management features.

In addition, the company is launching Flow TV, a feed of video clips and other content shown alongside the exact prompts used to create them. The service is meant to help users understand how creators work.

Smart glasses

Google is joining the smart-glasses race, announcing partnerships with Gentle Monster and Warby Parker to build Android XR-based glasses.

Android XR is a platform for extended-reality (XR) devices launched last year in partnership with Qualcomm and Samsung.

The company said it is deepening its partnership with Samsung to develop XR glasses. The two firms are building the software and hardware platform.

At the conference, Google showed a concept of Android XR glasses with Gemini artificial intelligence. They are equipped with a camera, microphone, speakers and a display for viewing notifications.

Google Android XR Glasses Live Demo #GoogleIO pic.twitter.com/qoGK4rs2z4

— Ben Geskin (@BenGeskin) May 20, 2025

Google plans to allocate up to $150 million to co-develop AI glasses with Warby Parker, of which $75 million has already been committed.

Gemini integration in Chrome

The company announced the launch of Gemini integration in Chrome. Users will get an AI assistant for working in the browser. It can understand page context and perform various tasks.

Gemini in Chrome is available via text input and voice command. You can start chatting with the assistant by clicking the Gemini icon in the top-right corner of the Chrome window.

For example, a user can open a banana-bread recipe page and ask Gemini to make the recipe gluten-free, or ask the assistant to help choose a bedroom plant suited to the room's lighting conditions.

In future, Gemini will be able to work with multiple tabs at once — enabling, among other things, comparison of two similar items across pages or online shops.

Translator in Google Meet

Google Meet has added real-time speech translation. The company uses a large audio language model from DeepMind to enable natural conversation with a counterpart in another language.

The translation preserves the speaker's voice, intonation and expressiveness. The new feature has many use cases. For example, English-speaking grandchildren will be able to talk to Spanish-speaking grandparents, and employees of a large company will be able to communicate across regions.

The company claims translation latency is very low, allowing conversations with several people at once.

During the conversation, the speaker's original speech remains audible, with the translation overlaid on top.

Gemini chatbot improvements

Google announced several updates to the Gemini chatbot. Among them:

broader availability of multimodal capabilities;
updated AI models;
streaming video from the phone’s camera or screen while holding voice conversations in parallel;
building routes in Google Maps, creating events in Google Calendar and making to-do lists in Google Tasks.

At the conference, Google said Gemini now has 400 million monthly active users.

The company also updated Deep Research — a tool for generating detailed research reports. Users can upload PDFs and images, and the service will match them with public information to provide more personalised answers.

In future, Deep Research will be able to connect to Drive and Gmail.

Project Mariner — an AI agent for browsing web pages

Google opened the experimental AI agent Project Mariner to US users with a Google AI Ultra subscription. The agent has also been updated and can now perform up to ten tasks simultaneously.

Examples of what Project Mariner can do include buying tickets to a baseball game or ordering groceries online. Users chat with the agent, which then visits sites and performs the required actions. They can get on with other things while the agent completes tasks in the background.