Miami-based startup Subquadratic announced on May 16, 2026, that it had closed a $29 million seed round and exited stealth with SubQ, a large language model (LLM) the company says is the first frontier-scale system built on a fully subquadratic attention architecture. <cite index="9-1">Investors include Tinder co-founder Justin Mateen, Miami-based investor Javier Villamizar, and angels previously involved with Anthropic, OpenAI, Stripe, and Brex.</cite> <cite index="6-10">The round was reported at a roughly $500 million valuation.</cite>
Architecture and claims
The standard transformer architecture introduced in 2017 uses dense attention, in which each token attends to every other token in the sequence. <cite index="5-6,5-7">That produces O(n²) complexity: doubling the input quadruples the compute.</cite> Subquadratic says its model replaces dense attention with a sparse-attention scheme the company calls Subquadratic Sparse Attention (SSA). <cite index="5-10,5-11,5-12">Rather than scoring every possible token pair, the model identifies which relationships actually matter and ignores the rest, producing what the company describes as O(n)—linear, not quadratic—complexity.</cite>
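The scaling difference can be sketched by counting query-key scores. The snippet below is illustrative only: Subquadratic has not published how SSA selects tokens, so the fixed per-query budget `k` and both function names are assumptions, not the company's method.

```python
def dense_attention_scores(n: int) -> int:
    """Query-key scores when every token attends to every token: n * n."""
    return n * n


def budgeted_sparse_scores(n: int, k: int) -> int:
    """Scores when each query attends to at most k keys.

    The fixed budget k is a hypothetical stand-in for whatever selection
    rule a sparse scheme uses; with k constant, the total grows linearly in n.
    """
    return n * min(k, n)


if __name__ == "__main__":
    # Doubling n quadruples the dense count but only doubles the sparse count.
    for n in (1_000, 2_000, 4_000):
        print(n, dense_attention_scores(n), budgeted_sparse_scores(n, k=256))
```

Any real sparse scheme also pays some cost to decide which keys to keep, which this count ignores.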
<cite index="4-11">Chief Executive Officer Justin Dangel framed the launch as an architectural shift, stating that "the fundamental scaling laws imposed by the transformer architecture and dense attention have been broken through."</cite> <cite index="4-12">Chief Technology Officer Alexander Whedon said the move from dense to sparse attention is intended to avoid the steep, quadratic cost increases associated with larger context windows.</cite>
The first model, SubQ 1M-Preview, ships with a native 12-million-token context window. <cite index="5-2">The company claims the model is 52 times faster than FlashAttention at one million tokens and operates at under 5% of the cost of Claude Opus 4.6. It also reports 81.8% on SWE-Bench Verified versus 80.8% for Opus 4.6, and a RULER 128K accuracy of 95% at $8 of compute versus $2,600 for Opus 4.6.</cite> <cite index="9-4">Subquadratic says the architecture reduces attention compute by nearly 1,000 times at the full 12-million-token context relative to standard transformers.</cite>
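As a rough sanity check on these figures, the arithmetic below assumes sparse attention costs about n·k score computations (each of n queries attending to k keys) against n² for dense attention. That cost model and the function names are assumptions, since the company has not published one. Under it, the reduction factor is n/k, so a claimed factor of about 1,000 at 12 million tokens would imply each query attends to roughly 12,000 keys.

```python
def implied_keys_per_query(context_tokens: int, claimed_reduction: float) -> float:
    """Solve n*n / (n*k) = reduction for k, under the assumed n*k cost model."""
    return context_tokens / claimed_reduction


def cost_fraction(cost_a: float, cost_b: float) -> float:
    """Cost of A expressed as a fraction of the cost of B."""
    return cost_a / cost_b


if __name__ == "__main__":
    # Vendor-reported inputs: 12M-token context, ~1,000x attention reduction.
    print(implied_keys_per_query(12_000_000, 1_000))  # 12000.0
    # Vendor-reported RULER 128K compute: $8 versus $2,600.
    print(f"{cost_fraction(8, 2_600):.2%}")  # 0.31%
```

The $8 versus $2,600 figure works out to about 0.3%, which is consistent with the broader "under 5%" claim, though both numbers are vendor-supplied.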
Products and availability
<cite index="6-12,6-13,6-14,6-15">Subquadratic is offering three products in private beta with waitlist access: a developer and enterprise Application Programming Interface (API) exposing the full context window, SubQ Code (a command-line interface coding agent), and SubQ Search (a long-context search tool). There is no public pricing, no open weights, and no full technical report or peer-reviewed paper at launch.</cite> <cite index="7-3">The company has set a 50-million-token context target for the fourth quarter.</cite>
Reception and caveats
SubQ joins a line of non-transformer or subquadratic research efforts. <cite index="3-9">Previous subquadratic attempts, including Mamba, RWKV, Hyena, and S4, have shown promise at small scales but have not consistently matched transformer quality at full production scale.</cite> Independent researchers have noted that all performance claims to date are vendor-supplied. <cite index="8-22,8-23,8-24">No third party has yet published SubQ results on benchmarks such as MRCR or RULER for the long-context tasks that matter in production work, and until that happens the headline figures should be treated as marketing.</cite>
<cite index="1-17">Subquadratic's 35-person team includes 11 PhDs from Meta, Google, Oxford, Cambridge, and Brigham Young University.</cite> The company has said it does not plan to open-source SubQ's weights and will operate as a commercial API provider.