Using Claude Code: The Unreasonable Effectiveness of HTML

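The summary of "The Future of AI Interaction: The Evolution from Text to Visualization" notes that AI interaction is shifting from plain-text output toward visual output. The author argues that although Markdown remains the mainstream format, its presentational power is limited, and HTML, with richer visual elements and interactivity, is becoming the new trend. The brain's natural advantage in processing visual information suggests that AI output will move toward dynamic visualization, including interactive video and neural simulation. The summary also stresses that input needs to improve by combining audio, gestures, and other modalities. The article proposes that AI interaction will pass through "text-Markdow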

RR1335 · Published 2026-05-12 11:48:42

https://x.com/karpathy/status/2053872850101285137?s=20

This works really well btw, at the end of your query ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.

More generally, imo audio is the human-preferred input to AIs but vision (images/animations/video) is the preferred output from them. Around a ~third of our brains are a massively parallel processor dedicated to vision, it is the 10-lane superhighway of information into brain. As AI improves, I think we'll see a progression that takes advantage:

1) raw text (hard/effortful to read)

2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default

3) HTML (still procedural with underlying code, but a lot more flexibility on the graphics, layout, even interactivity) <-- early but forming new good default

...4,5,6,...

n) interactive neural videos/simulations

Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. Many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral https://x.com/zan2434/status/2046982383430496444

There are also improvements necessary and pending at the input. Audio nor text nor video alone are not enough, e.g. I feel a need to point/gesture to things on the screen, similar to all the things you would do with a person physically next to you and your computer screen.

TLDR The input/output mind meld between humans and AIs is ongoing and there is a lot of work to do and significant progress to be made, way before jumping all the way into neuralink-esque BCIs and all that. For what's worth exploring at the current stage, hot tip try ask for HTML.
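The tip at the start of the post (append the HTML request to your query, save the reply, open it in a browser) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `with_html_request`, `save_html`, and `open_in_browser` are hypothetical, the file name `response.html` is arbitrary, and the call to an actual LLM is deliberately left out because the post does not name a specific API.

```python
import tempfile
import webbrowser
from pathlib import Path
from typing import Optional

# The instruction quoted in the post, appended to the end of a query.
HTML_SUFFIX = "Structure your response as HTML."


def with_html_request(query: str) -> str:
    """Return the query with the HTML instruction appended at the end."""
    return f"{query.rstrip()}\n\n{HTML_SUFFIX}"


def save_html(response: str, directory: Optional[Path] = None) -> Path:
    """Write the model's reply to response.html and return the file path.

    If no directory is given, a fresh temporary directory is created.
    """
    target_dir = directory if directory is not None else Path(tempfile.mkdtemp())
    path = target_dir / "response.html"
    path.write_text(response, encoding="utf-8")
    return path


def open_in_browser(path: Path) -> bool:
    """Open the saved file in the default browser (standard-library call)."""
    return webbrowser.open(path.resolve().as_uri())
```

A typical flow would be to send `with_html_request("Compare quicksort and mergesort.")` to whichever model you use, pass the reply text to `save_html`, and then call `open_in_browser` on the returned path.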


Using Claude Code: The Unreasonable Effectiveness of HTML


Markdown has become the dominant file format used by agents to communicate with us. It’s simple, portable, has some rich text capability and is easy for you to edit. Claude has even gotten surprisingly good at using ASCII to make diagrams inside of markdown files.

But as agents have become more and more powerful, I have felt that markdown has become a restricting format. I find it difficult to read a markdown file of more than a hundred lines. I want richer visualizations, color, and diagrams, and I want to be able to share them easily.

I'm also increasingly not editing these files myself, but using them as specs, reference files, brainstorming outputs, etc. When I do make edits, I’m usually prompting Claude to edit them, which removes one of markdown’s largest benefits.

I’ve started preferring HTML to Markdown as an output format, and I increasingly see others on the Claude Code team doing the same. This is why.


(If you want to start with some examples, you can see a bunch at https://thariqs.github.io/html-effectiveness but be sure to come back and read more about why.)

---