This could solve the main issue with context windows Because this new model has a context window of...

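TL;DR · AI summary
The post claims that a new model, SubQ, offers a 12-million-token context window at 98% accuracy, runs 52 times faster, and costs only 5% as much as Opus 4.7, but it gives no technical details, evaluation methodology, or verifiable data.
Key points
- SubQ is claimed to support a 12M-token context while still maintaining 98% accuracy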
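- Compared with Opus 4.7, it is said to be 52 times faster at 5% of the cost (the quoted announcement gives the 52x figure relative to FlashAttention at 1MM tokens)
- Built on a fully sub-quadratic sparse-attention architecture (SSA), described as the first frontier model of its kind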
Outline
- SubQ: 12M-token context model
  - Headline metrics
    - 12-million-token context
    - 98% accuracy
    - 52× faster, 5% of the cost
  - Technical architecture
    - Fully sub-quadratic sparse attention (SSA)
    - First SSA frontier model
Highlights
This could solve the main issue with context windows
12M tokens (!!) but still maintains 98% accuracy
first model built on a fully sub-quadratic sparse-attention architecture (SSA)
Paul Couvert on X:
"This could solve the main issue with context windows
Because this new model has a context window of 12M tokens (!!) but still maintains 98% accuracy
And compared to Opus 4.7, it's:
- 52 times faster
- Costs 5% of the price
That's really impressive."

Quote
@alex_whedon · 15h
Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window which is:
- 52x faster than FlashAttention at 1MM tokens
-
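
For context on why a sub-quadratic design matters at this scale, here is a rough back-of-the-envelope sketch in Python. It is not SubQ's method, which the announcement does not describe. It only counts query-key pairs for dense attention (n²) against a generic fixed-budget sparse scheme (n·k), one simple example of sub-quadratic scaling. The budget `K = 4096` and both function names are assumptions made up for illustration.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full (dense) self-attention: n^2."""
    return n_tokens * n_tokens


def sparse_attention_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored when each query attends to at most a fixed number of keys."""
    return n_tokens * min(keys_per_query, n_tokens)


if __name__ == "__main__":
    # Hypothetical per-query key budget; NOT a published SubQ figure.
    K = 4096
    for n in (1_000_000, 12_000_000):
        dense = dense_attention_pairs(n)
        sparse = sparse_attention_pairs(n, K)
        print(
            f"{n:>12,} tokens: dense={dense:.2e} pairs, "
            f"sparse={sparse:.2e} pairs, ratio={dense / sparse:,.0f}x"
        )
```

Under this toy model the gap between dense and sparse grows linearly with context length, so pair counts alone say nothing about the 52x or 98% figures, which would need published benchmarks to verify.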