At Google I/O 2026 on May 20, Google introduced Gemini 3.5 Flash, the first model in its newest Gemini family. <cite index="4-1,4-2,4-3">Google described Gemini 3.5 as its latest family of models combining frontier intelligence with action and as a major leap forward in building more capable, intelligent agents, with 3.5 Flash kicking off the series.</cite>
Availability and positioning
<cite index="4-5">3.5 Flash is available to billions of people globally via the Gemini app and AI Mode in Google Search, for developers through the Google Antigravity agent-first development platform and the Gemini API in Google AI Studio and Android Studio, and for enterprises via the Gemini Enterprise Agent Platform and Gemini Enterprise.</cite> <cite index="4-6,4-7">Google said it is also working on 3.5 Pro, which is being used internally and will roll out next month.</cite>
<cite index="1-19">AI Mode in Search has adopted Gemini 3.5 Flash as the new default model globally</cite>, and <cite index="1-18">Google said AI Mode has surpassed 1 billion monthly users.</cite>
Performance and pricing claims
Google's reported benchmarks position the new Flash tier above the prior-generation Pro model. <cite index="7-12">According to the company, Gemini 3.5 Flash outperforms Gemini 3.1 Pro on Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo), and MCP Atlas (83.6%), and leads in multimodal understanding on CharXiv at 84.2%.</cite> <cite index="4-10,4-11">On output tokens per second, Google says it is four times faster than other frontier models, landing in the top-right quadrant of the Artificial Analysis index.</cite>
On cost, <cite index="4-13">Google states the model can complete work often at less than half the cost of other frontier models.</cite> Third-party documentation of the API lists pricing at $1.50 per million input tokens and $9.00 per million output tokens for the gemini-3.5-flash identifier.
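As a rough illustration of what those per-token prices mean in practice, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate that assumes the third-party figures quoted above are accurate and that billing is linear in token count; the function name and the example workload are hypothetical and do not come from Google's documentation.

```python
# Illustrative cost estimate based on the per-million-token prices quoted
# in the article (third-party figures, not confirmed by Google here).
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 9.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one request.

    Assumes simple linear pricing with no caching discounts, tiers,
    or minimum charges (an assumption made for this sketch).
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```

Under these assumptions, output tokens cost six times as much as input tokens, so output-heavy agent workloads would dominate the bill.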
Agent-first architecture
The launch is tied tightly to Google Antigravity, the company's development platform, which has been updated to version 2.0. <cite index="1-31,1-32,1-33">Google described the new Antigravity as "unabashedly agent-first," focusing on agent conversations, agent-produced artifacts, and multi-agent orchestration. New primitives include subagents, hooks, and asynchronous task management, and Google noted that Gemini 3.5 Flash was co-optimized with the Antigravity agent harness.</cite> <cite index="8-12,8-13">In Antigravity, Google says 3.5 Flash is 12x faster through token optimization, and Antigravity 2.0 is available globally.</cite>
Google also unveiled Gemini Spark, a persistent personal agent. <cite index="8-15,8-16,8-17">Spark runs on virtual machines through Google Cloud, operates 24/7 without requiring an open laptop, is accessible via the Gemini app with options to email or message it, and uses Gemini 3.5 Flash and Antigravity to handle long-running background tasks.</cite>
Infrastructure scale
The announcements were framed against expanding compute commitments. <cite index="3-10">Google Chief Executive Sundar Pichai said the company is now processing more than 3.2 quadrillion tokens per month, a more than sixfold increase from 480 trillion tokens per month at I/O 2025.</cite> <cite index="3-23">Google has committed to between $180 billion and $190 billion in infrastructure investment this year, roughly six times its 2022 capital expenditure figure.</cite>
Safety
<cite index="4-15,4-16,4-17">Google said Gemini 3.5 was developed in accordance with its Frontier Safety Framework, with strengthened cyber and CBRN safeguards intended to reduce both harmful outputs and mistaken refusals of safe queries, using more advanced safety training and interpretability tools that check the model's inner reasoning before it responds.</cite>
The release intensifies competitive pressure in the cost-per-token tier, where rivals including Anthropic, OpenAI, and DeepSeek have concentrated their commercial offerings.