SubQ raises $29M for non-transformer long-context model with 12M token window

Miami-based Subquadratic emerged from stealth with seed funding for a sparse-attention large language model that it says scales linearly and supports a 12-million-token context window.

Miami-based startup Subquadratic emerged from stealth this month with a $29 million seed round and a new model, SubQ, that the company says is the first large language model (LLM) to escape the quadratic attention scaling that has defined transformer architectures since 2017. <cite index="5-3,5-4">The company claims SubQ has a native 12-million-token context window on an architecture where compute grows roughly linearly with context length instead of quadratically.</cite>

Funding and investors

<cite index="10-3">Subquadratic raised $29 million in seed funding from investors including Tinder co-founder Justin Mateen, former SoftBank Vision Fund partner Javier Villamizar, and early investors in Anthropic, OpenAI, Stripe, and Brex.</cite> <cite index="10-4">The New Stack reported the round values the company at $500 million.</cite> The company is led by Chief Executive Officer Justin Dangel and Chief Technology Officer Alexander Whedon.

Architecture

The headline technical claim is a mechanism the company calls Subquadratic Sparse Attention (SSA). <cite index="8-17">The approach selectively focuses only on the token comparisons that matter rather than computing attention across every possible relationship.</cite> <cite index="7-12,7-13">Dense attention grows quadratically with input size, while SSA is designed to scale sub-quadratically, closer to O(n·k) instead of O(n²), where k is the number of tokens selected per step. When k is kept small relative to n, this is substantially more efficient than full attention.</cite>
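The scaling argument above can be made concrete with a back-of-the-envelope cost model. This is a minimal sketch, not Subquadratic's method: it counts only query-key comparisons, ignores constant factors and the cost of choosing which tokens to keep, and uses a hypothetical selection budget `k = 4096` that does not come from the company.

```python
def dense_pairs(n: int) -> int:
    """Query-key comparisons for full attention: every token against every token."""
    return n * n


def sparse_pairs(n: int, k: int) -> int:
    """Comparisons when each token attends to at most k selected tokens."""
    return n * min(k, n)


def reduction_factor(n: int, k: int) -> float:
    """How many times fewer comparisons the sparse scheme performs."""
    return dense_pairs(n) / sparse_pairs(n, k)


# k = 4096 is an illustrative assumption, not a published SubQ parameter.
for n in (128_000, 1_000_000, 12_000_000):
    print(f"n={n:>10,}  reduction={reduction_factor(n, k=4096):,.1f}x")
```

Under this simplified model the reduction is simply n / k, so the advantage over dense attention grows with context length only as long as k stays fixed or grows slowly.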

Subquadratic reports that <cite index="8-8">SubQ 1M-Preview reduces attention compute by nearly 1,000 times at 12 million tokens compared with standard transformer architectures.</cite> On the RULER 128K benchmark, <cite index="2-27">SubQ 1M-Preview scored 95.6% accuracy versus 94.8% for Claude Opus 4.6, and the company reports SubQ Sparse Attention is 52× faster than FlashAttention in its architecture-level comparison while requiring 63% less compute.</cite>
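As a rough way to read these vendor-reported figures, the sketch below converts them into comparable quantities. It assumes the simplified O(n·k) versus O(n²) comparison count, in which the reduction factor equals n / k; the function names are hypothetical, and the company has not published the selection budget or stated the conditions under which each figure was measured.

```python
def implied_k(n: int, reduction: float) -> float:
    """Selection budget k implied by a reduction factor, assuming reduction = n / k."""
    return n / reduction


def savings_to_factor(fraction_saved: float) -> float:
    """Convert 'X% less compute' into an equivalent 'N times less' factor."""
    return 1.0 / (1.0 - fraction_saved)


# Reported: nearly 1,000x less attention compute at 12 million tokens.
print(f"implied k: {implied_k(12_000_000, 1000):,.0f} tokens")

# Reported: 63% less compute in the FlashAttention comparison.
print(f"63% less compute is about {savings_to_factor(0.63):.1f}x less")
```

The two reported numbers need not agree with each other, since they may refer to different context lengths or to different quantities (attention-only compute versus end-to-end compute); the available sources do not say.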

Products

<cite index="10-2">The company is launching three products into private beta: an Application Programming Interface (API) exposing the full context window, a command-line coding agent called SubQ Code, and a search tool called SubQ Search.</cite> <cite index="2-25">SubQ Code loads entire codebases into a single context window, enabling developers to plan, execute, and review across a full repository in a single pass.</cite> The model is not being released as open weights, though Subquadratic has said it can be fine-tuned for customer-specific use cases.

Skepticism from researchers

The launch divided the research community. <cite index="10-13,10-14">Prominent AI engineer Will Depue initially noted that SubQ is "almost surely a sparse attention finetune of Kimi or DeepSeek," referring to existing open-source models. Whedon confirmed this on X, writing that the company is using weights from open-source models as a starting point, as a function of its funding and maturity.</cite> <cite index="10-15">Depue later wrote that the company's O(n) scaling claims and speedup numbers did not appear to line up.</cite>

The broader context matters: <cite index="10-21">Kimi Linear, DeepSeek Sparse Attention, Mamba, and RWKV all promised subquadratic scaling, and all faced the same problem: architectures that achieve linear complexity in theory often underperform quadratic attention on downstream benchmarks at frontier scale, or end up hybrid, mixing subquadratic layers with standard attention and losing the pure scaling benefits.</cite>

<cite index="5-13,5-14">At launch, Subquadratic has not published open weights or a full technical report, and every performance number associated with SubQ is vendor-reported and not independently reproduced.</cite> Dangel has said the company plans to release additional technical papers and products in the months ahead, and it has reportedly set a Q4 2026 target of a 50-million-token context window.
