We're Partnering with LayerLens to Evaluate SubQ
May 14, 2026
Today we're announcing a partnership with LayerLens, a company building evaluation infrastructure for AI systems. They'll be evaluating SubQ on Stratix, their benchmark platform, which helps teams evaluate, compare, and monitor models across standardized benchmarks and covers 200+ models and close to 100 benchmarks.
SubQ is built on Sub-quadratic Sparse Attention (SSA), which takes a fundamentally different approach from the standard attention architecture most labs are still iterating on. The design goal is to process more context with far less compute and without performance tradeoffs, and we think that opens up a category of long-horizon reasoning tasks that current models genuinely struggle with, at far better token economics. But "we think" only gets you so far, which is why we wanted an independent evaluation layer that applies the same framework to SubQ as it does to every other model on the platform.
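For readers unfamiliar with the general idea, the sketch below shows a generic sliding-window (block-sparse) attention in NumPy: each query attends only to a fixed-size neighborhood of keys, so the work grows roughly linearly with sequence length rather than quadratically. This is only an illustration of why sparse attention can be sub-quadratic; the window size, shapes, and function are arbitrary choices of ours and say nothing about how SSA itself is implemented.

```python
import numpy as np

def sliding_window_attention(q, k, v, window=64):
    """Each query attends only to the `window` most recent keys (causal).

    Illustrative sketch only; not SubQ's SSA.
    """
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)              # local neighborhood for query i
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # softmax over the window only
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out

# Toy usage: 1,024 tokens, 32-dim heads; each query touches at most `window` keys.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(1024, 32)) for _ in range(3))
print(sliding_window_attention(q, k, v).shape)  # (1024, 32)
```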
What the benchmarks will cover
Long context is central to what SubQ is designed for, so Stratix will test retrieval accuracy at depth, positional consistency across varying context lengths, and synthesis from extended inputs. Those are the results we care most about, and they're also the ones we expect people to scrutinize most closely.
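To make "retrieval accuracy at depth" concrete, here's a minimal sketch in the spirit of a needle-in-a-haystack check: a fact is buried at different relative positions inside long filler text, and the model is asked to recall it. The function names, depths, and prompt wording are hypothetical placeholders of ours, not Stratix's actual benchmark code or API.

```python
from typing import Callable, Dict

def retrieval_at_depth(generate: Callable[[str], str],
                       filler: str,
                       needle: str,
                       answer: str,
                       depths=(0.1, 0.5, 0.9),
                       context_words=8000) -> Dict[float, bool]:
    """Bury `needle` at several relative depths in filler text and check recall."""
    base = (filler + " ") * (context_words // max(1, len(filler.split())) + 1)
    words = base.split()[:context_words]
    results = {}
    for depth in depths:
        pos = int(depth * len(words))
        doc = " ".join(words[:pos] + [needle] + words[pos:])
        prompt = f"{doc}\n\nQuestion: what is the secret code mentioned above?"
        results[depth] = answer.lower() in generate(prompt).lower()
    return results

# Stand-in "model" so the sketch runs on its own; a real run would call the model under test.
if __name__ == "__main__":
    dummy = lambda p: "The secret code is 7421." if "7421" in p else "I don't know."
    print(retrieval_at_depth(dummy,
                             "the quick brown fox jumps over the lazy dog",
                             "The secret code is 7421.",
                             "7421"))
```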
Stratix will also run SubQ through the same reasoning, coding, instruction following, and tool use evaluations it applies across the rest of its model catalog, because how a model performs across the full range of everyday tasks matters just as much to the teams that would actually deploy it. We're using Stratix Enterprise going forward, so every future SubQ release goes through the same framework. That gives us a consistent, public record of how the model evolves across versions rather than a series of one-time results that go stale the moment the next release ships.
What goes public once evaluation is complete
Results will be published at stratix.layerlens.ai and will include:

- benchmark scores across both standard and long-context suites
- prompt-level breakdowns showing how SubQ performs on individual evaluation items
- head-to-head comparisons against other models on the platform
- per-benchmark breakdowns across specific capability areas
- a full report covering methodology, findings, strengths, and limitations
We're making a real architectural bet with SubQ, and putting it through independent evaluation that publishes everything, including the parts that show limitations, is how we intend to stand behind it.
LayerLens built Stratix specifically to make that kind of evaluation more systematic and accessible as the model landscape gets more complex and harder to assess, and that mission lines up closely with what we're trying to do here.
Results coming soon at stratix.layerlens.ai.