The first modelbuilt for long‑context tasks

SubQ is a sub-quadratic LLM built for 12M-token reasoning, allowing agents to work across full repositories, long histories, and persistent state without quality loss.

Request early access →

Context

12M

token reasoning

Speed

150

tokens per second

Cost

1/5

of other leading LLMs

Use Cases

All your context. Always available.

Reason across 12M tokens in one prompt: entire repos, months of PRs, and long-running agent state, with room to spare at one-fifth the cost.

012M

Python source code

The entire 3.13 standard library

~5.1M

Six months of React PRs

~1,050 pull requests against the React codebase

~7.5M

~ Approximate token counts.

Architecture

Not just another model.An architectural breakthrough.

SubQ is the first model built on a fully sub-quadratic sparse-attention architecture. LLMs today waste compute by processing every possible relationship between words, but only a small fraction of these relationships matter.

SubQ finds and focuses only on those, ensuring compute is used where it matters most. At 12M tokens, this reduces attention compute almost 1,000×, changing the way LLMs scale.

Technical report (coming soon)

Benchmarks

A leader in long-context retrieval and coding tasks.

Benchmarks	Gemini 3.1 Pro	Opus 4.6	Opus 4.7	GPT-5.4	GPT-5.5	SubQ 1M-Preview
SWE-Bench VerifiedReal-world software engineering ability	80.6%	80.8%	87.6%	n/r	n/r	81.8%
RULER @ 128KLong-context accuracy across 13 tests	n/r	94.8%*	n/r	n/r	n/r	95.6%
MRCR v2 (8-needle, 1M)Multi-round coreference resolution in long contexts	26.3%	78.3%	32.2%	36.6%	74.0%	86.2%

n/r = result was not reported by the model provider

* = internally evaluated

SubQ results are third-party validated

Third-party validated results →Technical report (coming soon)

Products

Two ways to use SubQ.

API

For developers and teams

The full-context API for developers and enterprise teams. Process full repositories and pipeline states in a single API call at linear cost.

→ 12M token context window
→ Streaming + tool use
→ OpenAI-compatible endpoints

Request API access →

Code

For coding agents

The long-context layer for coding agents. Plug into Claude Code, Codex, and Cursor to map codebases, gather context, and answer token-heavy questions faster.

→ ~25% lower bill, 10× faster exploration
→ Auto-redirects expensive model turns
→ One-line install

Request SubQ Code access →

Research

From the lab.

PartnershipsMay 14, 2026

We're Partnering with LayerLens to Evaluate SubQ

ProductMay 5, 2026

Introducing SubQ: The First Fully Subquadratic LLM

TechnicalUpdated May 15, 2026

How SSA Makes Long Context Practical

About

We built the architecture the industry said wasn't possible.

Subquadratic is a frontier AI research and infrastructure company building a new class of LLMs. While other major labs focus on incremental improvements to Transformer models, we're pushing foundational change at the model architecture level — enabling large-context, multi-modal inference that scales efficiently where transformers can't.

Built by researchers from

Meta
Google
Oxford
Cambridge
BYU

Early Access

Is your business ready?
Build with us.

Join the private preview.