python convert_checkpoint.py \
    --model_dir /models/llama \
    --output_dir /models/trt_engine

👉 The core requirement:

/models/llama must already contain a complete, usable copy of the original model.


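🧠 1. Do you need to download the model in advance?

👉 Yes, and it must be a complete local copy of the model.

Reason:

  • The TensorRT-LLM conversion step does not download anything from the network for you
  • It only reads a local checkpoint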

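📂 2. The directory is a convention, not a requirement

/models/ is just a common layout, for example:

/models/
├── llama/
├── qwen/
└── trt_engine/

👉 It only makes things easier to manage.

You can just as well use:

/home/ubuntu/llama
/data/models/llama2

Just change the argument:

--model_dir /your/actual/path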
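If the conversion is launched from a script instead of being typed by hand, the path can be passed in as a variable. This is a minimal sketch: only `--model_dir` and `--output_dir` come from the command at the top; `convert_command` and `run_conversion` are illustrative names, and any other options your convert script needs are left out.

```python
import subprocess
import sys


def convert_command(model_dir, output_dir, script="convert_checkpoint.py"):
    """Build the argument list for the conversion command with custom paths."""
    return [
        sys.executable, script,
        "--model_dir", str(model_dir),
        "--output_dir", str(output_dir),
    ]


def run_conversion(model_dir, output_dir):
    """Run the conversion; raises CalledProcessError if the script fails."""
    subprocess.run(convert_command(model_dir, output_dir), check=True)
```

For example, `convert_command("/home/ubuntu/llama", "/home/ubuntu/trt_engine")` returns the same command as before with the two paths swapped in.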

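📦 3. What must the model directory contain?

Taking a LLaMA model in HuggingFace format as an example, it needs at least:

config.json
tokenizer.json / tokenizer.model
pytorch_model.bin or model.safetensors (larger models split the weights into several numbered shard files plus an index .json)

👉 If any of these are missing:

❗ the conversion fails with an error right away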
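A quick pre-flight check can catch a missing file before the conversion is started. The sketch below only tests for the files listed above; `missing_model_files` is an illustrative helper, not part of TensorRT-LLM, and matching any `*.safetensors` or `*.bin` file is an assumption made so that sharded checkpoints also pass.

```python
from pathlib import Path


def missing_model_files(model_dir):
    """Return descriptions of the required pieces that are absent from model_dir."""
    root = Path(model_dir)
    if not root.is_dir():
        return ["model directory " + str(root)]

    missing = []
    if not (root / "config.json").is_file():
        missing.append("config.json")

    # Either tokenizer file is accepted.
    tokenizer_names = ("tokenizer.json", "tokenizer.model")
    if not any((root / name).is_file() for name in tokenizer_names):
        missing.append("tokenizer.json or tokenizer.model")

    # Assumption: any .safetensors or .bin file counts as weights,
    # which also covers checkpoints split into shards.
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("*.bin"))
    if not has_weights:
        missing.append("weight files (*.safetensors or *.bin)")
    return missing
```

An empty list means the three kinds of files are present; it does not prove that the files themselves are intact.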


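🚀 4. Recommended ways to download

✅ Option 1: huggingface-cli

# pip3 install huggingface_hub

Start the download and wait for it to finish. The warnings at the top of the output say that newer huggingface_hub releases ignore --local-dir-use-symlinks and prefer hf download over huggingface-cli download:
# huggingface-cli download deepseek-ai/DeepSeek-V3.2 --local-dir /models/deepseek-v3.2 --local-dir-use-symlinks False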
/usr/cluster/anaconda3/lib/python3.12/site-packages/huggingface_hub/commands/download.py:141: FutureWarning: Ignoring --local-dir-use-symlinks. Downloading to a local directory does not use symlinks anymore.
  warnings.warn(
⚠️  Warning: 'huggingface-cli download' is deprecated. Use 'hf download' instead.
Fetching 192 files:   0%|          | 0/192 [00:00<?, ?it/s]
Downloading 'assets/olympiad_cases/ioi_submissions_final.json' to '/models/deepseek-v3.2/.cache/huggingface/download/assets/olympiad_cases/mlaggjckAJkYj3taDNHC05KgNeU=.302473cbd67e461976e3921f647630ee1b0b099e.incomplete'
Downloading '.gitattributes' to '/models/deepseek-v3.2/.cache/huggingface/download/wPaCkH-WbT7GsmxMKKrNZTV4nSM=.04b8d06e312e6a640ec17e0cf7991322c8fd3bf6.incomplete'
Downloading 'assets/olympiad_cases/CMO2025.jsonl' to '/models/deepseek-v3.2/.cache/huggingface/download/assets/olympiad_cases/3R-1xfKTgxxgXCCBfMQx0Xt4iz4=.6a4fef7d41be21b6b9dd5c865f82787a14008601.incomplete'
Downloading 'assets/olympiad_cases/wf_submissions.json' to '/models/deepseek-v3.2/.cache/huggingface/download/assets/olympiad_cases/-0E-GW85VNIc4vR8MK4KiZxIc3U=.50e032833e2801cc1b025238ddf62921d2780a0e.incomplete'
Downloading 'README.md' to '/models/deepseek-v3.2/.cache/huggingface/download/Xn7B-BWUGOee2Y6hCZtEhtFu4BE=.f763002f7c1e0048d30fd66ce616d4136655fd28.incomplete'
.gitattributes: 1.60kB [00:00, 8.40MB/s]
Download complete. Moving file to /models/deepseek-v3.2/.gitattributes
README.md: 7.36kB [00:00, 38.1MB/s]
Download complete. Moving file to /models/deepseek-v3.2/README.md
CMO2025.jsonl: 79.8kB [00:00, 248kB/s]
Download complete. Moving file to /models/deepseek-v3.2/assets/olympiad_cases/CMO2025.jsonl
Downloading 'assets/paper.pdf' to '/models/deepseek-v3.2/.cache/huggingface/download/assets/nleIIzOp7oS1exmi5ym6jExY63Q=.f6fda5753db7b106baa5eeb1286877f17e6a111354762f8aa53c7e6556498df7.incomplete'
wf_submissions.json: 0.00B [00:00, ?B/s]
Downloading 'config.json' to '/models/deepseek-v3.2/.cache/huggingface/download/8_PA_wEVGiVa2goH2H4KQOQpvVY=.d44a9480898d1e6a0ef8229dbb5227394f5c06c4.incomplete'
Downloading 'encoding/encoding_dsv32.py' to '/models/deepseek-v3.2/.cache/huggingface/download/encoding/bt-AVtV0qWLBkXPB_JIvZ9DSQEs=.1eb6319e96dd009e6d73ef06c4078de4b090bfb7.incomplete'
config.json: 1.55kB [00:00, 261kB/s]
Download complete. Moving file to /models/deepseek-v3.2/config.json
encoding_dsv32.py: 14.3kB [00:00, 41.0kB/s]
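The same download can also be started from Python through the `huggingface_hub` package installed above. This is a minimal sketch rather than a replacement for the CLI: `download_model` is an illustrative wrapper around `snapshot_download`, and the repo id and target directory in the comment are simply the ones used in the command above.

```python
def download_model(repo_id, local_dir):
    """Download every file of a Hub repository into local_dir and return that path."""
    # Imported here so the module still loads where huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example call (downloads the full repository, so expect it to take a while):
# download_model("deepseek-ai/DeepSeek-V3.2", "/models/deepseek-v3.2")
```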

✅ Option 2: git-lfs (works for some models)

git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-hf /models/llama

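⚠️ 5. Common pitfalls

❌ 1. Path mounting (very common with Docker)

If you run inside a container:

👉 /models/llama must be a path inside the container

You usually need something like: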

docker run -v /host/models:/models ...
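To make the host-to-container mapping concrete, the sketch below translates a host path into the path the container sees for a given `-v host:container` mount. `container_path` is an illustrative helper, and the example paths are the ones from the `docker run` line above.

```python
from pathlib import PurePosixPath


def container_path(host_path, host_mount, container_mount):
    """Translate a host path under host_mount into its path inside the container."""
    # relative_to raises ValueError when host_path is not under the mounted directory,
    # which is exactly the case where the container cannot see the file.
    relative = PurePosixPath(host_path).relative_to(host_mount)
    return str(PurePosixPath(container_mount) / relative)
```

With `-v /host/models:/models`, `container_path("/host/models/llama", "/host/models", "/models")` gives `/models/llama`, which is the value to pass as `--model_dir` inside the container.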

❌ 2. Incomplete model download

For example, only these were downloaded:

  • config.json
  • tokenizer

👉 but no weight files (.bin / .safetensors)

👉 ❗ the conversion is guaranteed to fail
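One cheap sign of an interrupted download is a leftover temporary file. In the download log earlier, each file is first written to a `.incomplete` file under `.cache/huggingface/download` inside the target directory and only moved into place when it finishes. The sketch below lists such leftovers; `unfinished_downloads` is an illustrative helper, and it assumes the same cache layout as that log.

```python
from pathlib import Path


def unfinished_downloads(model_dir):
    """List leftover *.incomplete files from an interrupted download, relative to the cache."""
    cache = Path(model_dir) / ".cache" / "huggingface" / "download"
    if not cache.is_dir():
        return []
    return sorted(
        str(path.relative_to(cache)) for path in cache.rglob("*.incomplete")
    )
```

An empty result is not proof of a complete model, since files that never started downloading leave no trace, so it is worth also checking that the weight files are actually there.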


❌ 3. Permissions

Permission denied

👉 The container user has no permission to read the directory
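Running a small check as the same user that will run the conversion shows which entries are the problem. This is a sketch under that assumption; `unreadable_entries` is an illustrative name, and it relies only on `os.access` and `os.walk` from the standard library.

```python
import os


def unreadable_entries(model_dir):
    """Return paths under model_dir that the current user cannot read."""
    # A directory needs both read and execute (search) permission to be traversed.
    if not os.access(model_dir, os.R_OK | os.X_OK):
        return [model_dir]

    problems = []
    for dirpath, dirnames, filenames in os.walk(model_dir):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK | os.X_OK):
                problems.append(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                problems.append(path)
    return sorted(problems)
```

Run it inside the container, since the user and the mounted path there can differ from those on the host.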


🧠 One-sentence summary

👉 /models/ is not required
👉 but the model must be downloaded in advance into the path you pass as --model_dir


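💡 A more professional suggestion (for the stage you are at now)

If you are deploying with something like TensorRT-LLM + vLLM:

👉 use one consistent layout:

/data/
├── models/          # original models
└── trt_engines/     # converted engines

That way:

  • the original models stay untouched
  • the engines can be rebuilt at any time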
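With that layout, the output directory for an engine can be derived from the model directory instead of being typed separately. The sketch below assumes one sub-folder per model under `trt_engines/`, which is a convention added here for illustration; `engine_dir_for` is a hypothetical helper.

```python
from pathlib import Path


def engine_dir_for(model_dir, data_root="/data"):
    """Return the engine directory that pairs with a model directory."""
    # Assumption: each model gets a same-named folder under <data_root>/trt_engines.
    model_name = Path(model_dir).name
    return Path(data_root) / "trt_engines" / model_name
```

For example, `engine_dir_for("/data/models/llama")` is `/data/trt_engines/llama`. Deleting that folder and converting again leaves `/data/models/llama` untouched.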
