[core] add sleep and wake up endpoint and v1 support #12987

youkaichao · 2025-02-09T13:33:34Z

Expose the /sleep and /wake_up endpoints, and also support them in V1.
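A minimal client-side sketch of how the two endpoints might be called. The PR text only names the paths /sleep and /wake_up and shows the server converting a `level` value to an integer. The POST method, the `level` query parameter, the base URL, and the helper names below are assumptions for illustration, not confirmed API.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sleep_request(base_url: str, level: int = 1) -> Request:
    """Build (but do not send) a request for the /sleep endpoint."""
    # Assumption: the level is passed as a query parameter named "level".
    query = urlencode({"level": level})
    return Request(f"{base_url.rstrip('/')}/sleep?{query}", method="POST")


def build_wake_up_request(base_url: str) -> Request:
    """Build (but do not send) a request for the /wake_up endpoint."""
    return Request(f"{base_url.rstrip('/')}/wake_up", method="POST")


if __name__ == "__main__":
    # Hypothetical local server address.
    for req in (build_sleep_request("http://localhost:8000", level=1),
                build_wake_up_request("http://localhost:8000")):
        print(req.get_method(), req.full_url)
```

Either request could then be sent with `urllib.request.urlopen(req)`, checking the response status code before relying on the engine state.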

Signed-off-by: youkaichao <youkaichao@gmail.com>

github-actions · 2025-02-09T13:33:44Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

njhill · 2025-02-09T16:39:48Z

@youkaichao I wonder if we could merge #12918 first which simplifies the v1 request handling a bit? (it should be ready now)

mergify · 2025-02-10T03:36:40Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @youkaichao.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

youkaichao · 2025-02-20T02:21:14Z

vllm/entrypoints/openai/serving_transcription.py

@@ -295,6 +295,7 @@ async def create_transcription(
        # TODO(rob): figure out a way to pipe streaming in.
        # Non-streaming response.
        try:
+            assert result_generator is not None


This is to pass the linter.
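A small sketch of why such an assert satisfies a linter, assuming the check in question is a static type checker such as mypy and that `result_generator` has an optional type (neither is stated in the comment). The function and variable names here are hypothetical.

```python
import asyncio
from typing import AsyncGenerator, Optional


async def numbers() -> AsyncGenerator[int, None]:
    for i in range(3):
        yield i


async def collect(gen: Optional[AsyncGenerator[int, None]]) -> list[int]:
    # Without this assert, a type checker would report that `gen`
    # may be None and so cannot be iterated with `async for`.
    assert gen is not None
    return [item async for item in gen]


if __name__ == "__main__":
    print(asyncio.run(collect(numbers())))
```

The assert narrows the type from Optional to the concrete generator type for the rest of the function; at runtime it only fails if the value really is None.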

njhill

Thanks @youkaichao LGTM

njhill · 2025-02-20T03:37:42Z

vllm/entrypoints/openai/api_server.py

+        await engine_client(raw_request).sleep(int(level))
+        # when we return a response, the sleep command
+        # is sent but does not finish yet.
+        # TODO: we should wait for the sleep to finish and then return


Not sure if it's worth updating this comment since it does now wait in V1?

Updated in e8416d9 now.
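The distinction discussed in this thread, returning once the sleep command is sent versus returning after it has finished, can be illustrated with a toy model. `ToyEngine` and both handler functions are hypothetical stand-ins, not vLLM's implementation.

```python
import asyncio


class ToyEngine:
    """Hypothetical stand-in for an engine; not vLLM's implementation."""

    def __init__(self) -> None:
        self.sleeping = False

    async def sleep(self, level: int = 1) -> None:
        # Simulate work that takes a moment to complete.
        await asyncio.sleep(0.01)
        self.sleeping = True


async def sleep_without_waiting(engine: ToyEngine) -> bool:
    # The command is scheduled but not awaited, so the state observed
    # at "return time" is still the state from before the sleep.
    task = asyncio.create_task(engine.sleep(1))
    state_at_return = engine.sleeping
    await task  # clean up so the task does not leak in this demo
    return state_at_return


async def sleep_and_wait(engine: ToyEngine) -> bool:
    # The handler returns only after the engine has finished sleeping.
    await engine.sleep(1)
    return engine.sleeping


if __name__ == "__main__":
    print(asyncio.run(sleep_without_waiting(ToyEngine())))
    print(asyncio.run(sleep_and_wait(ToyEngine())))
```

With the awaited form, a caller that receives the response can assume the engine is already asleep, which is what makes a follow-up request such as wake-up safe to issue immediately.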

youkaichao added 3 commits February 9, 2025 20:59

add sleep and wake up endpoint

3a7ecb4

Signed-off-by: youkaichao <youkaichao@gmail.com>

add tests

4fbb39e

Signed-off-by: youkaichao <youkaichao@gmail.com>

add ack

e74dd9a

Signed-off-by: youkaichao <youkaichao@gmail.com>

mergify bot added frontend v1 labels Feb 9, 2025

youkaichao added 5 commits February 9, 2025 21:34

add status code check

79ec652

Signed-off-by: youkaichao <youkaichao@gmail.com>

fix socket

13f4bdf

Signed-off-by: youkaichao <youkaichao@gmail.com>

use send_request_and_get_response

fa86994

Signed-off-by: youkaichao <youkaichao@gmail.com>

fix

ef145d9

Signed-off-by: youkaichao <youkaichao@gmail.com>

fix output

042c959

Signed-off-by: youkaichao <youkaichao@gmail.com>

youkaichao commented on vllm/engine/multiprocessing/client.py (outdated, resolved) Feb 9, 2025

youkaichao commented on tests/entrypoints/openai/test_sleep.py (resolved) Feb 9, 2025

njhill reviewed vllm/engine/multiprocessing/client.py (outdated, resolved) Feb 9, 2025

mergify bot added the needs-rebase label Feb 10, 2025

njhill mentioned this pull request Feb 11, 2025

WIP: [Core] Expose sleep and wake_up api to the Client for Model State Management #13016

Closed

Merge branch 'main' of https://github.com/vllm-project/vllm into slee…

dbf1756

…p-youkaichao Signed-off-by: cennn <2523403608@qq.com>

mergify bot removed the needs-rebase label Feb 12, 2025

cennn and others added 6 commits February 12, 2025 14:54

support VLLM_USE_V1 && fix typo WAKE_UP

4010671

Signed-off-by: cennn <2523403608@qq.com>

Merge branch 'main' into sleep_apiserver

1f808d5

add comments

283507d

Signed-off-by: youkaichao <youkaichao@gmail.com>

minimize diff

e6d618d

Signed-off-by: youkaichao <youkaichao@gmail.com>

minimize diff

7511795

Signed-off-by: youkaichao <youkaichao@gmail.com>

add v1 test

aa23022

Signed-off-by: youkaichao <youkaichao@gmail.com>

youkaichao changed the title ~~[core] add sleep and wake up endpoint~~ [core] add sleep and wake up endpoint and v1 support Feb 13, 2025

youkaichao added 3 commits February 13, 2025 12:01

add timing

f7bcffa

Signed-off-by: youkaichao <youkaichao@gmail.com>

add v1 test

aaba2f4

Signed-off-by: youkaichao <youkaichao@gmail.com>

add v1 test

4b8bd93

Signed-off-by: youkaichao <youkaichao@gmail.com>

add v1 test

b48f686

Signed-off-by: youkaichao <youkaichao@gmail.com>

youkaichao mentioned this pull request Feb 15, 2025

[V1][Core] Generic mechanism for handling engine utility methods #13060

Merged

cennn and others added 4 commits February 19, 2025 23:41

Merge branch 'main' of https://github.com/vllm-project/vllm into slee…

e23031f

…p-youkaichao Signed-off-by: cennn <2523403608@qq.com>

rm time.sleep(30)

83813d8

Signed-off-by: cennn <2523403608@qq.com>

improve tests

c795115

Signed-off-by: youkaichao <youkaichao@gmail.com>

improve tests

1657de0

Signed-off-by: youkaichao <youkaichao@gmail.com>

youkaichao commented Feb 20, 2025

youkaichao marked this pull request as ready for review February 20, 2025 02:21

youkaichao requested review from DarkLight1337, robertgshaw2-redhat, simon-mo, WoosukKwon, ywang96, comaniac and alexm-redhat as code owners February 20, 2025 02:21

youkaichao requested a review from njhill February 20, 2025 02:27

comaniac approved these changes Feb 20, 2025

njhill approved these changes Feb 20, 2025

update comments

e8416d9

Signed-off-by: youkaichao <youkaichao@gmail.com>

youkaichao merged commit ba81163 into vllm-project:main Feb 20, 2025
7 of 9 checks passed

youkaichao deleted the sleep_apiserver branch February 20, 2025 04:41

sivanantha321 mentioned this pull request Mar 4, 2025

Upgrade vLLM version to 0.7.3 kserve/kserve#4281

Merged

9 tasks
