Google launches Gemini 3.5 Flash at I/O 2026

Google's new mid-tier model claims roughly 4x output-token speed over rival frontier models while outperforming Gemini 3.1 Pro on coding and agentic benchmarks.

Google unveiled Gemini 3.5 Flash on May 20, 2026 at its annual I/O developer conference in Mountain View, positioning the model as the first entry in a new generation of its Gemini LLM (Large Language Model) family. <cite index="3-7,3-8">Google described it as the first in its latest series of models combining frontier intelligence with action.</cite>

Availability and pricing

<cite index="3-9,3-10">Gemini 3.5 Flash is generally available via Google's agent-first development platform Google Antigravity, the Gemini API in Google AI Studio, and Android Studio.</cite> The model is also rolling out in the Gemini app and as the default engine for AI Mode in Google Search. <cite index="4-1,4-2">The stable API model identifier is gemini-3.5-flash, replacing the gemini-3-flash-preview identifier used during the preview window.</cite> <cite index="7-31">Developers pay $1.50 per million input tokens and $9.00 per million output tokens through the API.</cite>

<cite index="3-33,3-34">Google said it is also working on Gemini 3.5 Pro, which is being used internally and is expected to roll out next month.</cite>

Benchmarks and performance

Google's headline claim is that the Flash-tier model surpasses its previous flagship on agentic and coding workloads. <cite index="3-12,3-13">The company says Gemini 3.5 Flash delivers intelligence that rivals large flagship models at Flash-series speeds, outperforming Gemini 3.1 Pro on benchmarks including Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%).</cite> <cite index="1-15,1-16">It scores 84.2% on CharXiv Reasoning, a benchmark that tests multimodal understanding.</cite>

<cite index="5-9">Google says 3.5 Flash delivers four times the output token generation speed of competing frontier models.</cite> <cite index="4-31,4-32">On Google's published benchmark table, 3.5 Flash leads all reported models on five separate evaluations, while trailing Gemini 3.1 Pro on long-context (MRCR v2 at 128k) and Humanity's Last Exam, where raw knowledge depth matters more than agentic capability.</cite>

Agentic tooling and Antigravity 2.0

The release is paired with an updated developer harness. <cite index="5-11,5-12">Google updated Antigravity, its agentic development platform, to version 2.0, featuring an Editor view that functions like a familiar IDE interface with an agent sidebar, and a Manager view that serves as a control center for orchestrating multiple agents working in parallel across workspaces.</cite> <cite index="4-12">Google said the internal optimization of Gemini 3.5 Flash inside Antigravity 2.0 runs at 12x the speed of comparable frontier models, compared to the 4x figure for the public API.</cite>

Enterprise deployments and scale

Google named several launch partners. <cite index="1-26,1-27,1-28,1-29,1-30,1-31,1-32,1-33,1-34">Shopify is running subagents in parallel for data analysis to power merchant growth forecasts; Macquarie Bank is piloting it for customer onboarding, reasoning over 100+ page documents; Salesforce is integrating 3.5 Flash into Agentforce to automate enterprise tasks using multiple subagents that retain context across multi-turn tool calling; and Ramp uses it for OCR on invoices.</cite>

The launch comes amid sharply higher inference volumes at Google. <cite index="5-10">CEO Sundar Pichai said Google is now processing more than 3.2 quadrillion tokens per month, a significant increase from 480 trillion tokens per month at I/O 2025.</cite>

Competitive context

The release places Gemini's mid-tier model in direct competition with rival frontier offerings at lower latencies and prices. <cite index="7-25">As of May 2026, GPT-5.5 leads many agentic workflow benchmarks, and Claude Opus 4.7 leads several coding benchmarks including SWE-bench Verified, with the lowest hallucination rate of the three.</cite> A heavier Gemini 3.5 Pro variant is expected in June 2026.