Fast, faster, Qwen. 🚀

Thrilled to see Qwen3.5 reaching a record-breaking 580 tps for agentic workl...

Q: Significance

This achievement pushes the boundaries of open-source LLM inference.

Qwen (@Alibaba_Qwen) · May 27, 2026

Score: 7.5

TL;DR · AI Summary

Qwen3.5 reached a record-breaking 580 tps for agentic workloads, thanks to the TokenSpeed engine and optimizations from partners.

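Key Points

Qwen3.5 achieves 580 tps on the TokenSpeed engine.
The FA4 optimization was contributed by Lightseek, NVIDIA, Mooncake, and Tri Dao.
This achievement pushes the boundaries of open-source LLM inference.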

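Outline

§ Introduction
Qwen3.5 reaches a record-breaking 580 tps.
· Performance milestone
Qwen3.5 achieves 580 tps on the TokenSpeed engine.
· Partners
The FA4 optimization was contributed by Lightseek, NVIDIA, Mooncake, and Tri Dao.
· Significance
This achievement pushes the boundaries of open-source LLM inference.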

Highlights

Qwen3.5 reaches a record-breaking 580 tps. (paragraph 1)
The FA4 optimization was contributed by Lightseek, NVIDIA, Mooncake, and Tri Dao. (paragraph 1)
This achievement pushes the boundaries of open-source LLM inference. (paragraph 1)

#Qwen #TokenSpeed #FA4 #HighPerformance #OpenSource

Original post

Qwen

@Alibaba_Qwen

Fast, faster, Qwen. 🚀 Thrilled to see Qwen3.5 reaching a record-breaking 580 tps for agentic workloads on the TokenSpeed engine! This milestone wouldn't be possible without our incredible partners. Huge thanks to @lightseekorg, @NVIDIAAI, the Mooncake team, and @tri_dao for the pioneering FA4 optimization. Together, we are pushing the boundaries of open-source LLM inference. 🤝 ✨ Dive into the full @PyTorch blog post below! 👇 pytorch.org/blog/up-to-580

#Qwen #Qwen3_5 #TokenSpeed #LLM #Inference #AI #PyTorch #OpenSource #AgenticAI #HighPerformance

Quote

PyTorch

@PyTorch

7h

The speed-of-light optimization for Qwen3.5 on the TokenSpeed inference engine is a significant milestone, achieving a record-breaking 580 tokens per second (tps) for agentic workloads on NVIDIA GPUs. In the PyTorch Foundation's latest community blog post, you can learn all

4:34 PM · May 27, 2026

236.6K Views