
Error when trying to deploy another model alongside a vLLM model #3041

Open
1 of 3 tasks
yanglaiyi opened this issue Mar 11, 2025 · 1 comment
@yanglaiyi

System Info / 系統信息

xinference, version 1.2.2
I have already installed the qwen language model. Now, when I try to deploy Belle-distilwhisper-large-v2-zh, I get the following error:

```
{
  "detail": "[address=0.0.0.0:22635, pid=141] GPU index 0 has been occupied with a vLLM model: qwen2.5-instruct-0, therefore cannot allocate GPU memory for a new model."
}
```

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

  • docker
  • pip install
  • installation from source

Version info / 版本信息

xinference, version 1.2.2

The command used to start Xinference / 用以启动 xinference 的命令

I started Xinference directly with the command from the official website, and deployed the models following the official documentation as well.

Reproduction / 复现过程

  1. First deploy qwen following the official tutorial.
  2. Deploy the Belle-distilwhisper-large-v2-zh model.
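A minimal sketch of the two deployment steps as CLI commands. This assumes a machine with at least two GPUs and that `xinference launch` accepts a `--gpu-idx` flag to pin a model to a specific device; the exact model names, sizes, and flags below are illustrative, not taken from the issue, and should be checked against the Xinference documentation for your version.

```shell
# Start the Xinference server (default host/port, per the official docs).
xinference-local --host 0.0.0.0 --port 9997

# Step 1: deploy the qwen LLM with the vLLM engine.
# vLLM pre-allocates most of the memory on its assigned GPU,
# which is why GPU index 0 is then reported as fully occupied.
xinference launch \
  --model-name qwen2.5-instruct \
  --model-engine vllm \
  --gpu-idx 0

# Step 2 (hypothetical workaround): place the speech model on a
# different GPU so it does not contend with the vLLM allocation.
xinference launch \
  --model-name Belle-distilwhisper-large-v2-zh \
  --model-type audio \
  --gpu-idx 1
```

On a single-GPU machine this workaround is not available, which matches the error in the report: once vLLM claims GPU 0, no memory remains there for a second model.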

Expected behavior / 期待表现

To be able to deploy the speech model while the qwen model is also running.

@XprobeBot XprobeBot added the gpu label Mar 11, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Mar 11, 2025
@qinxuye
Contributor

qinxuye commented Mar 12, 2025

The LLM is already deployed; there is no GPU card left on which to deploy the speech model.
