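Deploying Qwen3-32B on Ascend 910C with MindIE
This article walks through the complete process of deploying the Qwen3-32B large model on Huawei Ascend 910X AI processors. It covers: 1) hardware requirements (Ascend 910X processors, 256 GB of memory, 500 GB of storage) and software preparation (openEuler, Docker, the Ascend driver); 2) downloading the model with the modelscope tool; 3) two deployment methods, a temporary command-line launch and a persistent docker-compose deployment, with details on container configuration, environment variables, and service startup; 4) methods for verifying the service.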
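Domestic chip + domestic framework + domestic model
I. Environment Preparation
1.1 Hardware requirements
- Huawei Ascend 910X AI processors with the matching driver
- At least 256 GB of available memory
- More than 500 GB of storage space
1.2 Software requirements
- Docker installed, in a version compatible with the host OS
- Ascend driver installed
- A Linux operating system (typically openEuler)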
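The requirements above can be checked before going further. The following is a minimal preflight sketch, not an official tool: the function names, the `MODEL_ROOT` variable, and the choice to measure free space on the filesystem that will hold the weights are all assumptions made for illustration. The thresholds are the 256 GB and 500 GB figures listed above.

```shell
#!/bin/bash
# Preflight sketch; thresholds follow the requirements listed above.
MIN_MEM_GB=256                       # minimum available memory
MIN_DISK_GB=500                      # minimum free storage
MODEL_ROOT="${MODEL_ROOT:-/model}"   # assumption: weights will live under /model

# Return 0 when $1 (actual) is at least $2 (required).
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Available memory in GB, read from /proc/meminfo (values there are in kB).
available_mem_gb() {
    awk '/^MemAvailable:/ {print int($2 / 1024 / 1024)}' /proc/meminfo
}

# Free space in GB on the filesystem holding the path given as $1.
free_disk_gb() {
    df -Pk "$1" | awk 'NR == 2 {print int($4 / 1024 / 1024)}'
}

# Run all checks, report every failure, and return non-zero if any failed.
preflight() {
    local rc=0 target="$MODEL_ROOT"
    [ -d "$target" ] || target=/     # fall back to / if the directory does not exist yet
    command -v docker  >/dev/null 2>&1 || { echo "docker not found"; rc=1; }
    command -v npu-smi >/dev/null 2>&1 || { echo "npu-smi not found (is the Ascend driver installed?)"; rc=1; }
    meets_minimum "$(available_mem_gb)" "$MIN_MEM_GB" \
        || { echo "less than ${MIN_MEM_GB} GB of memory available"; rc=1; }
    meets_minimum "$(free_disk_gb "$target")" "$MIN_DISK_GB" \
        || { echo "less than ${MIN_DISK_GB} GB free on $target"; rc=1; }
    return "$rc"
}

# Usage: preflight && echo "environment looks ready"
```

The checks only report problems; they do not install or change anything on the host.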
1.3 Install the deployment tool
# Install the ModelScope CLI, which is used to download the model
pip install modelscope
II. Deployment Steps
1. Download the model
# --local_dir downloads the files into the specified local directory
modelscope download --model 'Qwen/Qwen3-32B' --local_dir '/model/Qwen3-32B'
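A 32B checkpoint is large, so an interrupted download is easy to miss. The sketch below checks that the target directory contains a `config.json` and at least one `.safetensors` shard. It assumes the repository uses the usual Hugging Face-style layout; the function name `check_model_dir` is hypothetical, and the check does not validate file integrity.

```shell
#!/bin/bash
# Return 0 if the directory given as $1 contains config.json and at least
# one *.safetensors weight shard (assumed layout; no checksum validation).
check_model_dir() {
    local dir="$1" shards
    [ -f "$dir/config.json" ] || { echo "missing $dir/config.json"; return 1; }
    # Count the weight shards at the top level of the directory.
    shards=$(find "$dir" -maxdepth 1 -name '*.safetensors' | wc -l)
    [ "$shards" -gt 0 ] || { echo "no .safetensors files in $dir"; return 1; }
    echo "found config.json and $shards weight shard(s) in $dir"
}

# Usage: check_model_dir /model/Qwen3-32B
```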
2. Start the container service
2.1 Command-line launch (for temporary verification; the container is not kept running)
docker run -it -e ASCEND_VISIBLE_DEVICES=8 \
--network host --privileged \
--shm-size 96g --ulimit memlock=-1 --ulimit stack=67108864 \
-e MASTER_IP="127.0.0.1" \
-e MASTER_PORT="9999" \
--device /dev/davinci0 --device /dev/davinci1 --device /dev/davinci2 --device /dev/davinci3 --device /dev/davinci4 --device /dev/davinci5 --device /dev/davinci6 --device /dev/davinci7 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /model/Qwen3-32B:/model:ro \
-v /usr/local/bin/npu-smi:/usr/bin/npu-smi \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/firmware/:/usr/local/Ascend/firmware:ro \
swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-800I-A3-py311-openeuler24.03-lts /bin/bash
## Once the container is running, carry out the following configuration inside it
# 1. Edit the MindIE configuration file (-i applies the changes in place)
sed -i -e 's/"httpsEnabled" : true/"httpsEnabled" : false/' \
-e 's/"modelName" : "llama_65b"/"modelName" : "Qwen32B"/' \
-e 's|"modelWeightPath" : "/data/atb_testdata/weights/llama1-65b-safetensors"|"modelWeightPath" : "/model"|' /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
# 2. Install the service dependencies and set environment variables
pip3 install -r /usr/local/Ascend/atb-models/requirements/models/requirements_qwen3.txt
export MINDIE_LOG_TO_STDOUT="true"
export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/torch/lib/:$LD_LIBRARY_PATH
# 3. Start the service (absolute path, so it works from any working directory)
nohup /usr/local/Ascend/mindie/latest/mindie-service/bin/mindieservice_daemon > output.log 2>&1 &
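`sed` exits successfully even when a pattern matches nothing, so if the default values in `config.json` differ from the ones the substitutions expect, the file is left unchanged without any warning. The sketch below confirms that the three settings hold the intended values. The helper names are hypothetical, and the pattern assumes the `"key" : value` layout shown in the commands above.

```shell
#!/bin/bash
# Return 0 if the file given as $3 contains "key" : value, with any
# whitespace around the colon. The value is used as a regex fragment,
# so it should not contain regex metacharacters.
config_has() {
    local key="$1" value="$2" file="$3"
    grep -Eq "\"${key}\"[[:space:]]*:[[:space:]]*${value}" "$file"
}

# Check the three settings changed in step 1 and report each mismatch.
verify_mindie_config() {
    local conf="$1" rc=0
    config_has httpsEnabled    'false'     "$conf" || { echo "httpsEnabled is not false"; rc=1; }
    config_has modelName       '"Qwen32B"' "$conf" || { echo "modelName is not Qwen32B"; rc=1; }
    config_has modelWeightPath '"/model"'  "$conf" || { echo "modelWeightPath is not /model"; rc=1; }
    return "$rc"
}

# Usage:
#   verify_mindie_config /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
```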
2.2 Launch with docker-compose
2.2.1 Write the startup script that will be mounted into the container
vi start.sh
#!/bin/bash
# 1. Edit the MindIE configuration file (-i applies the changes in place)
sed -i -e 's/"httpsEnabled" : true/"httpsEnabled" : false/' \
-e 's/"modelName" : "llama_65b"/"modelName" : "Qwen32B"/' \
-e 's|"modelWeightPath" : "/data/atb_testdata/weights/llama1-65b-safetensors"|"modelWeightPath" : "/model"|' /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
# 2. Install the service dependencies and set environment variables
pip3 install -r /usr/local/Ascend/atb-models/requirements/models/requirements_qwen3.txt
export MINDIE_LOG_TO_STDOUT="true"
export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/torch/lib/:$LD_LIBRARY_PATH
# 3. Start the service (absolute path, so it works from any working directory)
nohup /usr/local/Ascend/mindie/latest/mindie-service/bin/mindieservice_daemon > output.log 2>&1 &
2.2.2 Write the docker-compose file
vi docker-compose.yaml
version: '3.8'
services:
  mindie:
    image: swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-800I-A3-py311-openeuler24.03-lts
    container_name: mindie_qwen32b
    stdin_open: true
    tty: true
    network_mode: "host"
    privileged: true
    shm_size: "96g"
    restart: unless-stopped
    ulimits:
      memlock:
        soft: -1
        hard: -1
      stack:
        soft: 67108864
        hard: 67108864
    environment:
      - ASCEND_VISIBLE_DEVICES=8
      - MASTER_IP=127.0.0.1
      - MASTER_PORT=9999
    devices:
      - "/dev/davinci0:/dev/davinci0"
      - "/dev/davinci1:/dev/davinci1"
      - "/dev/davinci2:/dev/davinci2"
      - "/dev/davinci3:/dev/davinci3"
      - "/dev/davinci4:/dev/davinci4"
      - "/dev/davinci5:/dev/davinci5"
      - "/dev/davinci6:/dev/davinci6"
      - "/dev/davinci7:/dev/davinci7"
      - "/dev/davinci_manager:/dev/davinci_manager"
      - "/dev/devmm_svm:/dev/devmm_svm"
      - "/dev/hisi_hdc:/dev/hisi_hdc"
    volumes:
      - "/model/Qwen3-32B:/model:ro"
      - "/usr/local/bin/npu-smi:/usr/bin/npu-smi"
      - "/usr/local/dcmi:/usr/local/dcmi"
      - "/usr/local/Ascend/driver:/usr/local/Ascend/driver"
      - "/usr/local/Ascend/firmware/:/usr/local/Ascend/firmware:ro"
      - "/model/Qwen-deploy/start.sh:/start.sh"  # mount the startup script
    # ports:
    #   - "23707:1025"
    # Run the script through bash so it does not need the executable bit
    command: ["bash", "-c", "bash /start.sh && tail -f /dev/null"]
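With both files in place, the service still has to be brought up. Depending on how Docker was installed, Compose is available either as the `docker compose` plugin or as the standalone `docker-compose` binary. The wrapper below is a small sketch that uses whichever one is present; the function name `compose` is an assumption made for illustration.

```shell
#!/bin/bash
# Forward all arguments to whichever Compose front end is installed:
# the "docker compose" plugin or the standalone "docker-compose" binary.
compose() {
    if docker compose version >/dev/null 2>&1; then
        docker compose "$@"
    elif command -v docker-compose >/dev/null 2>&1; then
        docker-compose "$@"
    else
        echo "no Docker Compose installation found" >&2
        return 127
    fi
}

# Typical usage, from the directory that contains docker-compose.yaml:
#   compose up -d    # create and start the container in the background
#   compose ps       # confirm the container is running
#   compose down     # stop and remove the container
```

Because the file sets `restart: unless-stopped`, the container comes back after a host reboot unless it was stopped explicitly.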
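2.3 Verify that the service has started
# 1. Check the service port (the port can be changed)
Confirm that the port set by "port" : 1025 in /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
is listening.
# 2. Check that the log contains the following line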
Daemon start success!
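Loading a 32B model can take several minutes, so instead of checking the log by hand it can be polled. The sketch below waits until the success line shown above appears in the log file or a timeout expires. The function name, the 5-second polling interval, and the default 600-second timeout are assumptions, not values from MindIE.

```shell
#!/bin/bash
# Poll the log file given as $1 until "Daemon start success!" appears,
# or until the timeout in seconds given as $2 (default 600) has passed.
wait_for_daemon() {
    local log="$1" timeout="${2:-600}" waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if grep -q "Daemon start success!" "$log" 2>/dev/null; then
            echo "MindIE daemon is up after ${waited}s"
            return 0
        fi
        sleep 5
        waited=$((waited + 5))
    done
    echo "timed out after ${timeout}s waiting for the daemon" >&2
    return 1
}

# Usage, inside the container:
#   wait_for_daemon output.log 900
#   ss -ltn | grep ':1025'    # then confirm the port is listening, if ss is available
```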
2.4 Verify the API
# The "model" field must match the modelName set in the configuration file
curl -i --location --request POST 'http://127.0.0.1:1025/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data-raw '{
"model": "Qwen32B",
"messages": [{"role": "user", "content": "你好"}],
"max_tokens": 2048,
"temperature": 0.6,
"top_p": 0.95,
"top_k": 40,
"n": 1,
"stream": false,
"frequency_penalty": 1.00
}'
curl -X POST 127.0.0.1:1025/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "system", "content": "你是一个有用的AI助手"},
{"role": "user", "content": "请用中文介绍一下华为昇腾910B处理器的特点"}
],
"max_tokens": 2048,
"stream": false,
"do_sample": true,
"temperature": 0.7,
"top_p": 0.9,
"model": "Qwen32B"
}'
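A request whose "model" value differs from the configured modelName is a common reason for a failed call. One way to avoid the mismatch is to read the name from `config.json` rather than typing it. The following is a sketch under the assumption that the file uses the `"modelName" : "..."` layout shown earlier; `get_model_name` and `chat_once` are hypothetical helper names, and the short test prompt is illustrative.

```shell
#!/bin/bash
# Print the first "modelName" value found in the config file given as $1.
get_model_name() {
    sed -n 's/.*"modelName"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Send one short chat request whose "model" field is taken from the config.
# $1: path to config.json, $2: service port (default 1025).
chat_once() {
    local conf="$1" port="${2:-1025}" model
    model=$(get_model_name "$conf")
    [ -n "$model" ] || { echo "modelName not found in $conf" >&2; return 1; }
    curl -s -X POST "http://127.0.0.1:${port}/v1/chat/completions" \
        -H "Content-Type: application/json" \
        -d "{\"model\": \"${model}\", \"messages\": [{\"role\": \"user\", \"content\": \"hello\"}], \"max_tokens\": 64, \"stream\": false}"
}

# Usage:
#   chat_once /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
```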
III. References (Ascend community documentation)
1. Configuration file
https://www.hiascend.com/document/detail/zh/mindie/22RC1/envdeployment/instg/mindie_instg_0026.html
2. Container image download
https://www.hiascend.com/developer/ascendhub/detail/af85b724a7e5469ebd7ea13c3439d48f
3. List of supported models
https://www.hiascend.com/software/mindie/modellist
4. ModelScope official site (model weight downloads)
https://www.modelscope.cn/models