A Step-by-Step Guide to Deploying DeepSeek Distilled Models on an Edge Development Board



Modelers Community (魔乐社区) · Published 2025-03-11 10:25:21

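AI is booming, and demand for compute grows by the day. Edge AI development boards offer local deployment, low power consumption, and high flexibility, which lets them carry the "last mile" of AI adoption: from smart homes to industrial quality inspection, from smart cities to medical diagnosis, edge devices are bringing AI truly within reach.

Combining MindIE, DeepSeek, and Orange Pi, distilled models such as DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, and DeepSeek-R1-Distill-Llama-8B have been deployed successfully on the OrangePi AIpro (20T/24GDDR) AI development board, which shows how much potential edge devices have for AI.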

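MindIE (Mind Inference Engine, the Ascend inference engine) is Huawei Ascend's inference acceleration suite for AI workloads across all scenarios. By opening up AI capabilities layer by layer, it supports users' diverse AI workloads, enables a wide range of models, and unlocks the compute of Ascend hardware. Upward, it supports multiple mainstream AI frameworks; downward, it interfaces with different types of Ascend AI processors. It also provides multi-level programming interfaces that help users quickly build inference services on the Ascend platform.

Orange Pi's OrangePi AIpro development board follows the Ascend AI technology roadmap and is strong in design, performance, and technical support. It is offered in 20 TOPS and 8 TOPS compute configurations, covers the mainstream application scenarios of ecosystem developers, and comes with matching software and hardware so that users can try out a variety of new scenarios.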

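Deployment resources at a glance: everything in one place on the Modelers community

Want to deploy DeepSeek models on an Orange Pi quickly? Together with Huawei Ascend and Orange Pi, the Modelers community has prepared a "one-click resource pack", the AI PC zone, to help developers get started with on-device model inference:

Adapted DeepSeek distilled models: DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Llama-8B, and other distilled models, along with their quantized versions;
Quick start and case tutorials: end-to-end guidance from environment setup to model inference;
Supporting tools and code samples: a set of tools and code samples that make it easier to test and optimize models;
Developer discussion group: help with technical problems encountered during deployment.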

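Running the DeepSeek distilled models on the Orange Pi

01 Environment preparation

Hardware: one OrangePi AIpro (20T/24GDDR) development board, a TF card, a TF card reader, a display cable, a monitor, the board's power supply, and so on.

First, download the board's Ubuntu 22.04 image and the related materials from the official website (http://www.orangepi.cn/html/hardWare/computerAndMicrocontrollers/service-and-support/Orange-Pi-AIpro(20T).html).

Next, insert the TF card into the card reader and flash the image with balenaEtcher. When flashing finishes, the tool displays Successful.

Insert the flashed TF card into the board's card slot, connect a keyboard, mouse, and display, and power on the board.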

02 Install Python dependencies

Install Python 3.10:

wget https://www.python.org/ftp/python/3.10.2/Python-3.10.2.tgz
tar -xvf Python-3.10.2.tgz -C /usr/local/
sudo apt update
sudo apt install -y build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libsqlite3-dev wget libbz2-dev
cd /usr/local/Python-3.10.2
./configure --prefix=/usr/local/python3.10
make
sudo make install
# Verify that Python 3.10 was installed successfully.
# With the prefix above, the binary is /usr/local/python3.10/bin/python3.10 if python3.10 is not on PATH.
python3.10 --version

Install torch_npu, the plugin that adapts the PyTorch framework to Ascend NPUs. Download link:

pip install torch==2.1.0
pip install ./torch_npu-2.1.0.post10-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
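After the two pip commands it can help to confirm that both packages are visible to the interpreter you intend to use. The following is a minimal sketch, not part of the original guide: the helper name `package_status` is hypothetical, and it only checks that the packages can be located, not that the NPU itself works.

```python
import importlib.metadata
import importlib.util


def package_status(names):
    """Map each top-level package name to its installed version, or None if not found."""
    status = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            status[name] = None  # package cannot be located by this interpreter
            continue
        try:
            status[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            status[name] = "unknown"  # importable, but no distribution metadata
    return status


if __name__ == "__main__":
    for name, version in package_status(["torch", "torch_npu"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it with the same python3.10 that pip installed into; a mismatch between interpreters is a common reason for a package to show as not installed.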

03 Install CANN

Upgrade the development tools for CANN, Ascend's heterogeneous computing architecture. Download the cann-toolkit (https://mindie.obs.cn-north-4.myhuaweicloud.com/xiangchengpai_20250211/Ascend-cann-toolkit_8.1.RC1_linux-aarch64.run), cann-kernels (https://mindie.obs.cn-north-4.myhuaweicloud.com/xiangchengpai_20250211/Ascend-cann-kernels-310b_8.1.RC1_linux-aarch64.run), and cann-nnal (https://mindie.obs.cn-north-4.myhuaweicloud.com/xiangchengpai_20250211/Ascend-cann-nnal_8.1.RC1_linux-aarch64.run) installers, then run:

chmod +x Ascend-cann-toolkit_8.1.RC1_linux-aarch64.run
chmod +x Ascend-cann-kernels-310b_8.1.RC1_linux-aarch64.run
chmod +x Ascend-cann-nnal_8.1.RC1_linux-aarch64.run
./Ascend-cann-toolkit_8.1.RC1_linux-aarch64.run --install --force
./Ascend-cann-kernels-310b_8.1.RC1_linux-aarch64.run --install
./Ascend-cann-nnal_8.1.RC1_linux-aarch64.run --install
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/nnal/atb/set_env.sh
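If either `source` command fails, the usual cause is that an installer did not place its files where expected. As a rough sanity check, the sketch below tests whether the two environment scripts named in this step exist. The helper name `missing_files` is hypothetical, and the paths assume the default install location used in the commands above.

```python
from pathlib import Path

# Paths taken from the two `source` commands in this step
# (assumes the default /usr/local/Ascend install location).
ENV_SCRIPTS = (
    "/usr/local/Ascend/ascend-toolkit/set_env.sh",
    "/usr/local/Ascend/nnal/atb/set_env.sh",
)


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    missing = missing_files(ENV_SCRIPTS)
    if missing:
        print("Missing environment scripts:", ", ".join(missing))
    else:
        print("Both CANN environment scripts are present.")
```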

04 Install MindIE

Download MindIE-LLM, the large language model inference component of Huawei's MindIE inference solution. Download link: (https://mindie.obs.cn-north-4.myhuaweicloud.com/xiangchengpai_20250211/Ascend-mindie-atb-models_2.0.RC1_linux-aarch64_py310_torch2.1.0-abi0.tar.gz)

mkdir MindIE-LLM
cd MindIE-LLM
tar -zxf ../Ascend-mindie-atb-models_2.0.RC1_linux-aarch64_py310_torch2.1.0-abi0.tar.gz
pip install atb_llm-0.0.1-py3-none-any.whl
source set_env.sh

05 Download and deploy the model

Download the model code:

# Download DeepSeek-R1-Distill-Qwen-1.5B
git clone https://modelers.cn/MindIE/DeepSeek-R1-Distill-Qwen-1.5B-OrangePi.git
# Download DeepSeek-R1-Distill-Qwen-7B
git clone https://modelers.cn/MindIE/DeepSeek-R1-Distill-Qwen-7B-OrangePi.git
# Download DeepSeek-R1-Distill-Llama-8B
git clone https://modelers.cn/MindIE/DeepSeek-R1-Distill-Llama-8B-OrangePi.git

Install dependencies:

cd DeepSeek-R1-Distill-{model}-OrangePi
pip install -r ./requirements.txt
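The `{model}` placeholder in the directory name stands for one of the three variants cloned in the previous step. The short sketch below spells out that substitution. The helper names `repo_dir` and `repo_url` are hypothetical and exist only for illustration; the variant names and URL pattern are copied from the git clone commands in this guide.

```python
# Model variants taken from the git clone commands in this guide.
MODEL_VARIANTS = ("Qwen-1.5B", "Qwen-7B", "Llama-8B")


def repo_dir(model):
    """Return the local directory name that git clone creates for a variant."""
    if model not in MODEL_VARIANTS:
        raise ValueError(f"unknown variant {model!r}; expected one of {MODEL_VARIANTS}")
    return f"DeepSeek-R1-Distill-{model}-OrangePi"


def repo_url(model):
    """Return the clone URL used in this guide for a variant."""
    return f"https://modelers.cn/MindIE/{repo_dir(model)}.git"


if __name__ == "__main__":
    for variant in MODEL_VARIANTS:
        print(f"cd {repo_dir(variant)}   # cloned from {repo_url(variant)}")
```

For example, choosing the 7B Qwen variant means replacing `{model}` with `Qwen-7B` in the `cd` command.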

Download the weights:

DeepSeek-R1-Distill-Qwen-1.5B (INT8)
(https://modelers.cn/models/MindIE/DeepSeek-R1-Distill-Qwen-1.5B-OrangePi)

DeepSeek-R1-Distill-Qwen-1.5B (FP16)
(https://modelers.cn/models/State_Cloud/DeepSeek-R1-Distill-Qwen-1.5B)

DeepSeek-R1-Distill-Qwen-7B (FP16)
(https://modelers.cn/models/State_Cloud/DeepSeek-R1-Distill-Qwen-7B)
To generate INT8 quantized weights for DeepSeek-R1-Distill-Qwen-7B, see the "本地部署w8a8量化" (local deployment with w8a8 quantization) section of the README (https://modelers.cn/models/MindIE/DeepSeek-R1-Distill-Qwen-7B-OrangePi).

DeepSeek-R1-Distill-Llama-8B (FP16)
(https://modelers.cn/models/State_Cloud/DeepSeek-R1-Distill-Llama-8B)
To generate INT8 quantized weights for DeepSeek-R1-Distill-Llama-8B, see the "本地部署w8a8量化" (local deployment with w8a8 quantization) section of the README (https://modelers.cn/models/MindIE/DeepSeek-R1-Distill-Llama-8B-OrangePi).

Edit the config.json file in the weights directory: set the torch_dtype field to float16 and the max_position_embeddings field to 4096.
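The edit can be made by hand in any text editor. If you prefer to script it, the following is a minimal sketch, assuming config.json sits directly in the weights directory. It uses the key `max_position_embeddings` (with a trailing "s"), which is the spelling used in Hugging Face-style config files for these models. The function name `patch_config` is hypothetical.

```python
import json
from pathlib import Path


def patch_config(weight_dir):
    """Set torch_dtype and max_position_embeddings in <weight_dir>/config.json."""
    config_path = Path(weight_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))

    # The two values requested in this step; all other fields are left untouched.
    config["torch_dtype"] = "float16"
    config["max_position_embeddings"] = 4096

    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return config
```

Calling `patch_config("/path/to/weights")` rewrites the file in place, so keep a copy of the original config.json if you may need to restore it.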

06 Run inference

Once the steps above are complete, you can enter a question in the terminal to test the model:

# Replace ${WEIGHT_PATH} with the path to the weights directory.
cd $MindIE_LLM_PATH
python -m examples.run_fa_edge \
    --model_path ${WEIGHT_PATH} \
    --input_text 'What is deep learning?' \
    --max_output_length 128 \
    --is_chat_model
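To try several prompts without retyping the command, the same invocation can be assembled from Python. This is a sketch under stated assumptions: it uses only the flags shown in the command above, the function names `build_inference_command` and `run_inference` are hypothetical, and `run_inference` works only on the board, in a shell where the environment scripts have already been sourced and `MindIE_LLM_PATH` is set.

```python
import os
import shlex
import subprocess


def build_inference_command(model_path, input_text, max_output_length=128, is_chat_model=True):
    """Build the argument list for the examples.run_fa_edge command shown in this guide."""
    cmd = [
        "python", "-m", "examples.run_fa_edge",
        "--model_path", str(model_path),
        "--input_text", input_text,
        "--max_output_length", str(max_output_length),
    ]
    if is_chat_model:
        cmd.append("--is_chat_model")
    return cmd


def run_inference(model_path, input_text, **kwargs):
    """Run the example from the MindIE-LLM directory (requires the board environment)."""
    workdir = os.environ["MindIE_LLM_PATH"]  # same variable the shell command uses
    cmd = build_inference_command(model_path, input_text, **kwargs)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, cwd=workdir, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a prompt contains quotes or spaces.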