Deploying Qwen3-32B on Ascend 910 Series with MindIE

Domestic chip + domestic framework + domestic model

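I. Environment preparation

1.1 Hardware requirements

  • Huawei Ascend 910-series (910X) AI processors with the matching driver
  • At least 256 GB of available memory
  • More than 500 GB of storage space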

1.2 Software requirements

  • Docker installed, in a version that matches the host OS
  • Ascend driver installed
  • A Linux host (typically openEuler)
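
The requirements above can be checked before going further. The following is a minimal sketch, not an official tool: the helper names (have_cmd, avail_mem_gib, free_disk_gib, preflight) are made up for this guide, and the 256 GB / 500 GB figures are simply the ones listed above.

```shell
#!/bin/bash
# Pre-flight sketch: report whether the tools and resources listed above are present.
# All function names here are illustrative, not part of any Ascend tooling.

# Return 0 if a command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print available memory in GiB (MemAvailable field of /proc/meminfo).
avail_mem_gib() {
    awk '/^MemAvailable:/ {printf "%d\n", $2 / 1024 / 1024}' /proc/meminfo
}

# Print free space in GiB on the filesystem that holds the given path.
free_disk_gib() {
    df -Pk "$1" | awk 'NR == 2 {printf "%d\n", $4 / 1024 / 1024}'
}

# Print warnings and measured values; the argument is where the weights will live
# (assumption: defaults to / when omitted).
preflight() {
    model_root="${1:-/}"
    have_cmd docker  || echo "WARN: docker not found"
    have_cmd npu-smi || echo "WARN: npu-smi not found (is the Ascend driver installed?)"
    echo "available memory: $(avail_mem_gib) GiB (guide asks for 256)"
    echo "free disk on ${model_root}: $(free_disk_gib "$model_root") GiB (guide asks for 500)"
}
```

Running `preflight /model` on the host prints the measured values next to the targets; it only reports and never aborts, so the numbers can be judged case by case.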

1.3 Install the deployment tool

# Install the ModelScope CLI, which is used to download the model

pip install modelscope

II. Deployment steps

1. Download the model weights

# --local_dir  download the files into the given local directory
modelscope download --model 'Qwen/Qwen3-32B' --local_dir '/model/Qwen3-32B'
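
A 32B checkpoint is a large multi-file download, so it can help to confirm the directory looks complete before mounting it. The sketch below assumes the usual layout of a config.json plus one or more *.safetensors shards directly under the target directory; check_model_dir is a hypothetical helper name.

```shell
# Return 0 if the directory holds a config.json and at least one *.safetensors shard.
# Assumption: the weights sit directly in the directory, not in a subfolder.
check_model_dir() {
    dir="$1"
    [ -f "$dir/config.json" ] || { echo "missing $dir/config.json"; return 1; }
    count=$(find "$dir" -maxdepth 1 -name '*.safetensors' | wc -l | tr -d ' ')
    [ "$count" -gt 0 ] || { echo "no .safetensors files in $dir"; return 1; }
    echo "found $count weight shard(s) in $dir"
}
```

For the path used in this guide the call would be `check_model_dir /model/Qwen3-32B`. This only checks that files exist; it does not verify checksums.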

2. Start the container service

2.1 Start from the command line (for a quick test only; the container is not kept running)
docker run -it -e ASCEND_VISIBLE_DEVICES=8 \
--network host   --privileged   \
--shm-size 96g   --ulimit memlock=-1   --ulimit stack=67108864  \
-e MASTER_IP="127.0.0.1" \
-e MASTER_PORT="9999" \
--device /dev/davinci0 --device /dev/davinci1 --device /dev/davinci2 --device /dev/davinci3 --device /dev/davinci4 --device /dev/davinci5 --device /dev/davinci6 --device /dev/davinci7 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /model/Qwen3-32B:/model:ro  \
-v /usr/local/bin/npu-smi:/usr/bin/npu-smi \
-v /usr/local/dcmi:/usr/local/dcmi  \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/firmware/:/usr/local/Ascend/firmware:ro \
swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-800I-A3-py311-openeuler24.03-lts  /bin/bash

## After the container starts, run the following configuration steps inside it
# 1. Edit the MindIE configuration file (-i writes the changes back to the file)
sed -i -e 's/"httpsEnabled" : true/"httpsEnabled" : false/' \
       -e 's/"modelName" : "llama_65b"/"modelName" : "Qwen32B"/' \
       -e 's|"modelWeightPath" : "/data/atb_testdata/weights/llama1-65b-safetensors"|"modelWeightPath" : "/model"|' /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json

# 2. Install the service dependencies and set environment variables
pip3 install -r /usr/local/Ascend/atb-models/requirements/models/requirements_qwen3.txt
export MINDIE_LOG_TO_STDOUT="true"
export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/torch/lib/:$LD_LIBRARY_PATH

# 3. Start the service in the background, logging to output.log
nohup /usr/local/Ascend/mindie/latest/mindie-service/bin/mindieservice_daemon > output.log 2>&1 &
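
The sed expressions above only match if config.json still contains the default values with exactly that spacing, and sed reports nothing when a pattern does not match. A small check such as the following sketch can confirm the three settings before the daemon is started. verify_config is a hypothetical helper, and the expected values are the ones used in this guide.

```shell
# Check that config.json contains the three values this guide sets.
# The patterns tolerate any amount of whitespace around the colon.
verify_config() {
    cfg="$1"
    name="$2"      # expected modelName, e.g. Qwen32B
    weights="$3"   # expected modelWeightPath, e.g. /model
    grep -Eq '"httpsEnabled"[[:space:]]*:[[:space:]]*false' "$cfg" \
        || { echo "httpsEnabled is not false"; return 1; }
    grep -Eq "\"modelName\"[[:space:]]*:[[:space:]]*\"$name\"" "$cfg" \
        || { echo "modelName is not $name"; return 1; }
    grep -Eq "\"modelWeightPath\"[[:space:]]*:[[:space:]]*\"$weights\"" "$cfg" \
        || { echo "modelWeightPath is not $weights"; return 1; }
    echo "config OK"
}
```

Inside the container this would be called as `verify_config /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json Qwen32B /model`.
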
2.2 Start with docker-compose
2.2.1 Write the startup script that will be mounted into the container
vi start.sh

#!/bin/bash

# 1. Edit the MindIE configuration file (-i writes the changes back to the file)
sed -i -e 's/"httpsEnabled" : true/"httpsEnabled" : false/' \
       -e 's/"modelName" : "llama_65b"/"modelName" : "Qwen32B"/' \
       -e 's|"modelWeightPath" : "/data/atb_testdata/weights/llama1-65b-safetensors"|"modelWeightPath" : "/model"|' /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json

# 2. Install the service dependencies and set environment variables
pip3 install -r /usr/local/Ascend/atb-models/requirements/models/requirements_qwen3.txt
export MINDIE_LOG_TO_STDOUT="true"
export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/torch/lib/:$LD_LIBRARY_PATH

# 3. Start the service in the background, logging to output.log
nohup /usr/local/Ascend/mindie/latest/mindie-service/bin/mindieservice_daemon > output.log 2>&1 &
2.2.2 Write the docker-compose file
vi docker-compose.yaml

version: '3.8'
services:
  mindie:
    image: swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-800I-A3-py311-openeuler24.03-lts
    container_name: mindie_qwen32b
    stdin_open: true
    tty: true
    network_mode: "host"
    privileged: true
    shm_size: "96g"
    restart: unless-stopped
    ulimits:
      memlock:
        soft: -1
        hard: -1
      stack:
        soft: 67108864
        hard: 67108864
    environment:
      - ASCEND_VISIBLE_DEVICES=8
      - MASTER_IP=127.0.0.1
      - MASTER_PORT=9999
    devices:
      - "/dev/davinci0:/dev/davinci0"
      - "/dev/davinci1:/dev/davinci1"
      - "/dev/davinci2:/dev/davinci2"
      - "/dev/davinci3:/dev/davinci3"
      - "/dev/davinci4:/dev/davinci4"
      - "/dev/davinci5:/dev/davinci5"
      - "/dev/davinci6:/dev/davinci6"
      - "/dev/davinci7:/dev/davinci7"
      - "/dev/davinci_manager:/dev/davinci_manager"
      - "/dev/devmm_svm:/dev/devmm_svm"
      - "/dev/hisi_hdc:/dev/hisi_hdc"
    volumes:
      - "/model/Qwen3-32B:/model:ro"
      - "/usr/local/bin/npu-smi:/usr/bin/npu-smi"
      - "/usr/local/dcmi:/usr/local/dcmi"
      - "/usr/local/Ascend/driver:/usr/local/Ascend/driver"
      - "/usr/local/Ascend/firmware/:/usr/local/Ascend/firmware:ro"
      - "/model/Qwen-deploy/start.sh:/start.sh" # mount the startup script
    # Port mappings are not needed with network_mode: host
    # ports:
    #   - "23707:1025"
    # Run the script through bash so it works even if start.sh is not marked executable
    command: ["bash", "-c", "bash /start.sh && tail -f /dev/null"]
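
With both files written, the stack still has to be brought up. The sketch below assumes it is run from the directory that contains docker-compose.yaml and that the service is named mindie as in the file above. compose_cmd and start_stack are hypothetical helper names; the first one only decides whether the Compose v2 plugin or the standalone docker-compose binary is available.

```shell
# Print the Compose command that is installed: the v2 plugin ("docker compose")
# or the standalone binary ("docker-compose"). Returns 1 if neither is found.
compose_cmd() {
    if docker compose version >/dev/null 2>&1; then
        echo "docker compose"
    elif command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"
    else
        echo "no Docker Compose installation found" >&2
        return 1
    fi
}

# Start the stack in the background, then show the most recent container logs.
start_stack() {
    dc=$(compose_cmd) || return 1
    # $dc is left unquoted on purpose so "docker compose" splits into two words.
    $dc up -d && $dc logs --tail=50 mindie
}
```

After `start_stack`, the container keeps running because of `restart: unless-stopped` and the trailing `tail -f /dev/null`; it can be stopped again with the same Compose command followed by `down`.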

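2.3 Verify that the service started

# 1. Check the service port (the port can be changed)
The port is set by "port" : 1025 in /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json.
Confirm that this port is listening.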
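
One way to script this check is sketched below. get_port and port_listening are hypothetical helper names, and the second one assumes the ss utility (from iproute2) is available where it is run; with host networking it can be run on the host as well as in the container.

```shell
# Print the value of the "port" key from a MindIE config.json.
# Matching the quoted key keeps keys such as "managementPort" out of the result.
get_port() {
    grep -Eo '"port"[[:space:]]*:[[:space:]]*[0-9]+' "$1" | head -n 1 | grep -Eo '[0-9]+$'
}

# Return 0 if some process is listening on the given TCP port.
port_listening() {
    ss -ltn | awk '{print $4}' | grep -Eq "[:.]$1\$"
}
```

A typical use is `port_listening "$(get_port /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json)" && echo up`. Loading a 32B model takes a while, so the port may only appear some time after the daemon is launched.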

# 2. Check that the log contains the following line
Daemon start success!
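
Rather than re-reading the log by hand, the wait can be scripted. This is a sketch under two assumptions: the log is the output.log file written by the nohup command in this guide (its location depends on the directory the daemon was started from), and a 5-second polling interval is acceptable. wait_for_log is a hypothetical helper name.

```shell
# Poll a log file until it contains the given text, or give up after a timeout.
# Arguments: log file, text to look for, timeout in seconds (default 600).
wait_for_log() {
    log="$1"; text="$2"; timeout="${3:-600}"
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ -f "$log" ] && grep -Fq "$text" "$log"; then
            echo "found after ${waited}s"
            return 0
        fi
        sleep 5
        waited=$((waited + 5))
    done
    echo "timed out after ${timeout}s" >&2
    return 1
}
```

For example: `wait_for_log output.log "Daemon start success!" 900`.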

2.4 Verify the API

# The "model" field must match the modelName set in the configuration file

curl -i --location --request POST 'http://127.0.0.1:1025/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data-raw '{
  "model": "Qwen32B",
  "messages": [{"role": "user", "content": "Hello"}],
  "max_tokens": 2048,
  "temperature": 0.6,
  "top_p": 0.95,
  "top_k": 40,
  "n": 1,
  "stream": false,
  "frequency_penalty": 1.00
}'


curl -X POST 127.0.0.1:1025/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
    {"role": "system", "content": "You are a helpful AI assistant"},
    {"role": "user", "content": "Please describe, in Chinese, the key features of the Huawei Ascend 910B processor"}
],
"max_tokens": 2048,
"stream": false,
"do_sample": true,
"temperature": 0.7,
"top_p": 0.9,
"model": "Qwen32B"
}'
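
For repeated testing, the two requests above can be folded into a small helper. This is only a sketch: build_payload and chat are hypothetical names, MINDIE_PORT is an environment variable invented here (defaulting to 1025), and no JSON escaping is done, so the model name and prompt must not contain double quotes or backslashes.

```shell
# Build the JSON body for /v1/chat/completions.
# Arguments: model name, user prompt, max_tokens (default 512).
# Assumption: neither string contains double quotes or backslashes.
build_payload() {
    printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": %s, "stream": false}' \
        "$1" "$2" "${3:-512}"
}

# Send one prompt to the local service and print the raw JSON response.
chat() {
    build_payload "$1" "$2" "$3" | curl -sS -X POST \
        "http://127.0.0.1:${MINDIE_PORT:-1025}/v1/chat/completions" \
        -H 'Content-Type: application/json' --data-binary @-
}
```

For example, `chat Qwen32B "Hello" 64` sends a single short request; the first argument has to be the same modelName that was written into config.json.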

III. References (Ascend community documentation)

1. Configuration file

https://www.hiascend.com/document/detail/zh/mindie/22RC1/envdeployment/instg/mindie_instg_0026.html

2. Container image download

https://www.hiascend.com/developer/ascendhub/detail/af85b724a7e5469ebd7ea13c3439d48f

3. List of supported models

https://www.hiascend.com/software/mindie/modellist

4. ModelScope official site (model weight downloads)

https://www.modelscope.cn/models
