
Error when trying to deploy another model alongside a vLLM model #3041

Open
1 of 3 tasks
yanglaiyi opened this issue Mar 11, 2025 · 1 comment
@yanglaiyi

System Info / 系統信息

xinference, version 1.2.2
I have already installed the qwen language model. Now, when I try to deploy Belle-distilwhisper-large-v2-zh, I get the following error:

```
{
  "detail": "[address=0.0.0.0:22635, pid=141] GPU index 0 has been occupied with a vLLM model: qwen2.5-instruct-0, therefore cannot allocate GPU memory for a new model."
}
```

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

  • docker
  • pip install
  • installation from source

Version info / 版本信息

xinference, version 1.2.2

The command used to start Xinference / 用以启动 xinference 的命令

I started Xinference directly with the command from the official website, and deployed the models following the official documentation as well.

Reproduction / 复现过程

  1. First deploy qwen following the official tutorial.
  2. Deploy the Belle-distilwhisper-large-v2-zh model.
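A minimal sketch of the two deployment steps as CLI commands. This assumes a machine with at least two GPUs and that `xinference launch` accepts a `--gpu-idx` flag to pin a model to a specific device; the exact model names, sizes, and flags below are illustrative, not taken from the issue, and should be checked against the Xinference documentation for your version.

```shell
# Start the Xinference server (default host/port, per the official docs).
xinference-local --host 0.0.0.0 --port 9997

# Step 1: deploy the qwen LLM with the vLLM engine.
# vLLM pre-allocates most of the memory on its assigned GPU,
# which is why GPU index 0 is then reported as fully occupied.
xinference launch \
  --model-name qwen2.5-instruct \
  --model-engine vllm \
  --gpu-idx 0

# Step 2 (hypothetical workaround): place the speech model on a
# different GPU so it does not contend with the vLLM allocation.
xinference launch \
  --model-name Belle-distilwhisper-large-v2-zh \
  --model-type audio \
  --gpu-idx 1
```

On a single-GPU machine this workaround is not available, which matches the error in the report: once vLLM claims GPU 0, no memory remains there for a second model.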

Expected behavior / 期待表现

To be able to deploy the speech model while the qwen model is also running.

@XprobeBot XprobeBot added the gpu label Mar 11, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Mar 11, 2025
@qinxuye
Contributor

qinxuye commented Mar 12, 2025

The LLM is already deployed; there is no GPU card left on which to deploy the speech model.
