At its annual I/O developer conference on May 19, 2026, Google introduced Gemini 3.5 Flash, the first model in a new family that the company says combines frontier-level intelligence with agentic capability. <cite index="3-7,3-8,3-9,3-10">The model launched generally available the same day via Google Antigravity, the Gemini API in Google AI Studio, and Android Studio.</cite>
Benchmarks and pricing
Google's headline claim inverts the usual hierarchy between its Flash (speed-optimized) and Pro (quality-optimized) tiers. <cite index="3-12,3-13">According to Google, Gemini 3.5 Flash outperforms Gemini 3.1 Pro on challenging coding and agentic benchmarks including Terminal-Bench 2.1 (76.2%), GDPval-AA (1,656 Elo) and MCP Atlas (83.6%).</cite> <cite index="1-15,1-16">It also scored 84.2% on CharXiv Reasoning, a benchmark that tests multimodal understanding.</cite>
<cite index="3-28,3-29,3-30">Google positioned the model for long-horizon agentic tasks, stating that work that used to take a developer days or an auditor weeks can be completed in a fraction of the time, often at less than half the cost of other frontier models, with the model planning, building and iterating across applications, codebases and financial documents.</cite> <cite index="6-30">API pricing is set at $1.50 per million input tokens and $9.00 per million output tokens, with output running at roughly 289 tokens per second.</cite>
On long-context and pure-knowledge evaluations, the previous flagship still leads. <cite index="4-32">Gemini 3.5 Flash trails Gemini 3.1 Pro on MRCR v2 at 128k context and Humanity's Last Exam, where raw knowledge depth matters more than agentic capability.</cite>
Default model in Search
The most consequential deployment is in Google Search. <cite index="3-19,3-20,3-21,3-22">Google said AI Mode has surpassed more than 1 billion monthly users and is being upgraded with Gemini 3.5 Flash as the new default model globally, with AI Mode queries more than doubling every quarter since launch and overall Search queries reaching an all-time high last quarter.</cite> <cite index="3-24">The company also announced what it called the biggest upgrade to its Search box in over 25 years, redesigned around AI.</cite>
Google CEO Sundar Pichai framed the launch against a sharp rise in inference demand. <cite index="5-10">Pichai said the company is now processing more than 3.2 quadrillion tokens per month, a substantial increase from 480 trillion tokens per month at I/O 2025.</cite>
Developer platform and enterprise adoption
Google paired the model with an update to its agent development platform. <cite index="5-11,5-12">Antigravity was updated to version 2.0, featuring an Editor view that functions like an integrated development environment with an agent sidebar, and a Manager view for orchestrating multiple agents working in parallel across workspaces.</cite> <cite index="4-12">Google said an internally optimized build of Gemini 3.5 Flash inside Antigravity 2.0 runs at roughly 12 times the speed of comparable frontier models, compared with the 4x figure cited for the public API.</cite>
Several enterprise customers were named as early adopters. <cite index="1-25,1-26,1-27,1-28,1-29,1-30,1-31,1-32,1-33,1-34">Google said Shopify is running subagents in parallel for data analysis to power merchant growth forecasts, Macquarie Bank is piloting the model for customer onboarding by reasoning over 100-plus page documents, Salesforce is integrating 3.5 Flash into Agentforce to automate enterprise tasks across multiple subagents that retain context across multi-turn tool calling, and Ramp is using it for invoice optical character recognition.</cite>
A larger sibling remains in development. <cite index="3-33,3-34">Google said Gemini 3.5 Pro is being used internally and is expected to roll out next month.</cite>