AI/TLDR

Google DeepMind · 2026-06-24 · major

Gemini 3.5 Flash gets Computer Use — native browser, mobile, and desktop agents

Google DeepMind built Computer Use directly into Gemini 3.5 Flash, so the main Flash model can now drive a browser, an Android phone, or a desktop on its own through the Gemini API and Enterprise Agent Platform.

Gemini 3.5 Flash header graphic from the Google blog announcing built-in Computer Use
Google

Gemini 3.5 Flash can now see a screen and click, type, and scroll on its own through a single built-in tool.

Key specs

EnvironmentsBrowser, Android, Desktop
Action fieldintent

Quick facts

MakerGoogle DeepMind
ModelGemini 3.5 Flash (also available on 3 Flash Preview)
CapabilityNative computer-use tool — browser, Android, desktop
AvailabilityGemini API and Gemini Enterprise Agent Platform
Demo environmentBrowserbase
SafetyTargeted adversarial training plus two optional enterprise safeguard systems
Use casesSoftware testing, knowledge work, long-horizon enterprise automation

What is it?

Computer Use is now a first-class tool on Gemini 3.5 Flash, so the same Flash model that handles chat and coding can now drive a browser, an Android phone, or a desktop. Until today the same capability lived in a separate Gemini 2.5 Computer Use Preview model that teams had to call alongside Flash.

How does it work?

The Gemini 3.5 Flash Computer Use tool takes screenshots from the target environment, then returns a structured action — click, type, scroll, swipe — plus a short intent field that explains the planned step. Developers run the action in their own sandbox or via the Browserbase demo environment and feed the next screenshot back, and the agent loops until the task is done.

Why does it matter?

Enterprise automation teams have been stuck stitching a planning model and a separate computer-use model together; baking the tool into the main Gemini 3.5 Flash model removes that hop and gives a single API call for long-horizon work like software testing, form filling, and back-office knowledge work. Google says it is the best performance it has shipped for agentic computer-use tasks.

Who is it for?

Developers and enterprises building browser, mobile, or desktop agents

Frequently asked questions

Which Gemini models support Computer Use right now?
Google recommends Gemini 3.5 Flash for Computer Use, and the same tool also works with Gemini 3 Flash Preview and the legacy Gemini 2.5 Computer Use Preview. The new bit on June 24 is that the capability is built into the main 3.5 Flash model instead of needing the separate 2.5 model.
How do developers actually access Gemini 3.5 Flash Computer Use?
Gemini 3.5 Flash Computer Use is exposed as a built-in tool in the Gemini API and on the Gemini Enterprise Agent Platform. Google also points developers at a Browserbase-hosted demo environment and a reference implementation on GitHub so teams can try the loop before wiring it into their own infrastructure.
What environments can a Gemini 3.5 Flash agent actually drive?
Gemini 3.5 Flash Computer Use can see and act in a browser, on an Android phone, and on a desktop. Each step returns an intent field that explains the planned action — click, type, scroll, swipe — before the agent runs it, which makes long-horizon trajectories easier to audit.
What safety controls did Google ship with Gemini 3.5 Flash Computer Use?
Google trained Gemini 3.5 Flash Computer Use with targeted adversarial examples and released two optional enterprise safeguards: a per-step approval flow that pauses sensitive or irreversible actions for human confirmation, and a prompt-injection guard that halts the agent when it detects hidden instructions in the screen it is reading.

Try it

https://ai.google.dev/gemini-api/docs/computer-use

Sources · 2 outlets

Tags

  • google-deepmind
  • gemini
  • gemini-3-5-flash
  • computer-use
  • agent
  • browser-automation
  • mobile-agent
  • desktop-agent
  • tool-use
  • gemini-api
  • browserbase

← All releases · Learn AI