DeepSeek Million-Token Context Window Empirical Study Now on GitHub: 10,000-Word Analysis Report in Chinese and English, with PDF, Charts, and Code

Published by T_Wang_Lab · 2026-03-02 08:32:48

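This project is based on the "new long-context model" that DeepSeek released in February 2026 (context window extended to 1,000,000 tokens; the API still serves V3.2). We built a complete project workflow in a non-AI/IT domain and ran a full-load empirical engineering test over its entire course, closing the end-to-end loop within a single continuous context.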

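[Key Findings]

1. Interaction Token Budget

In our tests, a complete project-level conversation consumed roughly 1.2 × 10⁶ to 1.6 × 10⁶ tokens in total. This figure is not a fixed constant but a range that shifts with several variables:

Input format sensitivity: raw HTML, DOCX, and plain text differ markedly in encoding efficiency.

Opaque counting: the model's internal sparse attention mechanism, candidate generation process, and tokenizer strategy are not visible to users, so actual consumption can only be estimated approximately.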
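The format sensitivity noted above can be illustrated with a minimal sketch that compares a raw HTML snippet against its extracted plain text. Everything here is an assumption for illustration: `CHARS_PER_TOKEN` is a crude ratio, not DeepSeek's tokenizer, and the helper names (`html_to_text`, `rough_tokens`, `markup_overhead`) are hypothetical and not taken from the project's repository.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# ASSUMPTION: a crude characters-per-token ratio, used only to compare
# formats against each other. It is not the model's real tokenizer.
CHARS_PER_TOKEN = 4.0


def rough_tokens(text: str) -> int:
    """Very rough token estimate; never returns less than 1."""
    return max(1, round(len(text) / CHARS_PER_TOKEN))


def markup_overhead(html: str) -> float:
    """Estimated tokens for raw HTML divided by those for its plain text."""
    return rough_tokens(html) / rough_tokens(html_to_text(html))


sample = '<div class="note"><p>Project kickoff: scope and constraints.</p></div>'
print(html_to_text(sample))
print(f"markup overhead: {markup_overhead(sample):.2f}x")
```

A ratio well above 1 suggests that pasting raw markup spends budget on tags rather than content. Because the real counting mechanism is not visible, the ratio is only useful for comparing inputs, not for predicting the reported totals.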

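2. Long-Range Recall & Synthesis

With the million-token window fully loaded, the model showed remarkably high-fidelity recall:

Full-lifecycle retrieval: it accurately retrieved instructions and constraints from the start of the conversation and reconstructed the project's key milestones.

High-density synthesis: late in the conversation, the model could draw on the entire history to produce, on its own, a concise summary covering more than 80% of the key content, and to write a complete project report including all technical details.

Conclusion: a single continuous context is sufficient to support end-to-end recall and synthesis for a complex project, producing highly consistent output without an external vector database (RAG).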

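3. Emergence of Collaborative Cognition

This is the most significant finding of the study. Once the context is fully used, the model's role changes fundamentally:

From tool to partner: the model shifts from a mere "high-density answer engine" to a "cognitive partner".

Style assimilation: the model adopts the user's divergent, high-level reasoning style and maintains it in later exchanges.

Global perspective: it can reliably summarize the entire project history and retrieve any segment on demand, showing a global coherence absent from conventional 128k windows.

Conclusion: extending the context window is not merely an increase in capacity but a qualitative change in cognitive capability. It upgrades the LLM from an auxiliary tool to a collaborator that can work in deep symbiosis with humans.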

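[Empirical Analysis]

This test pushed the context to the 1,536,000-token limit. The system message "conversation length limit reached" confirmed that the hard boundary had been hit (see Figure 1).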
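For anyone attempting a reproduction, a running tally of estimated usage against the observed cutoff may help. The sketch below is a hypothetical helper, not part of the published code: the two constants come from this post, while `ContextLedger` and the per-turn estimates fed into it are assumptions, since the true token count is not exposed to users.

```python
from dataclasses import dataclass, field

NOMINAL_WINDOW = 1_000_000   # advertised context window, from the post
OBSERVED_LIMIT = 1_536_000   # point where the session was cut off, from the post


@dataclass
class ContextLedger:
    """Tracks estimated token usage across turns (hypothetical helper)."""

    limit: int = OBSERVED_LIMIT
    turns: list = field(default_factory=list)

    def add_turn(self, estimated_tokens: int) -> None:
        if estimated_tokens < 0:
            raise ValueError("token estimate must be non-negative")
        self.turns.append(estimated_tokens)

    @property
    def used(self) -> int:
        return sum(self.turns)

    def fraction_used(self) -> float:
        return self.used / self.limit

    def remaining(self) -> int:
        return max(0, self.limit - self.used)


ledger = ContextLedger()
ledger.add_turn(1_000_000)   # illustrative estimates, not measured data
ledger.add_turn(268_000)
print(f"used {ledger.used:,} of {ledger.limit:,} ({ledger.fraction_used():.1%})")
print(f"observed limit is {OBSERVED_LIMIT / NOMINAL_WINDOW:.3f}x the nominal window")
```

The ledger is only as good as the estimates passed to `add_turn`, so its totals should be read as a planning aid rather than as the system's own count.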

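The report includes detailed process data, visualizations, and novel multi-dimensional statistical analyses that characterize model behavior in long-context scenarios.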

[Open Resources]

All research results, data, and code from this project are open source and hosted on my personal academic homepage:

🔗 https://tpwang-lab.github.io

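The resources include:

🌐 Project page: the complete report as an English web page.

📄 PDF reports: formal reports in English and Chinese (with high-resolution figures).

💻 Source code: complete scripts for data cleaning, analysis, and visualization.

📊 Dataset: anonymized records of the key test data.

Corrections and discussion from peers are welcome.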

[Empirical Study] DeepSeek's New 1M Context Model: Full-Window Stress Test & Cognitive Emergence

This post shares an empirical study on DeepSeek's new long-context model (released Feb 2026, web/mobile version), which extends the context window to 1,000,000 tokens.

We conducted a full-window stress test, pushing the context to 1,536,000 tokens (~1.54M), and analyzed the model's behavior across three key dimensions:

Key Findings:

Interaction Token Budget: A complete project lifecycle consumes roughly 1.2M to 1.6M tokens. The total varies with input format (raw HTML, DOCX, plain text) and can only be estimated, because the sparse attention mechanism, candidate generation, and tokenizer are not visible to users.
Long-Range Recall & Synthesis: The model demonstrates high-fidelity recall across the entire context, retrieving initial instructions and synthesizing comprehensive reports without external RAG.
Emergence of Collaborative Cognition: Once the context is fully used, the model shifts from a "Q&A Engine" to a "Cognitive Partner", adopting the user's reasoning style and maintaining global coherence, a capability absent in standard 128k windows.

Resources

Full reports (EN/CN PDFs), source code, and detailed data analysis are open-sourced at:

🔗 Project Page: https://tpwang-lab.github.io

🔗 GitHub Repo: https://github.com/tpwang-lab/deepseek-million-token

Feedback and reproduction attempts from the community are welcome!

Tags: #DeepSeek #LLM #LongContext #EmpiricalStudy #AI