[serve] Remove RAY_SERVE_ENABLE_QUEUE_LENGTH_CACHE flag #51649

Merged

@akyang-anyscale akyang-anyscale commented Mar 24, 2025

Why are these changes needed?

This PR removes a feature flag that controls whether the proxy should use cached replica queue length values for routing. The FF was introduced over a year ago as a way for users to quickly switch back to the previous implementation. It has been enabled by default for over a year now and works as expected, so let's remove it. Consequently, this PR also removes RAY_SERVE_ENABLE_STRICT_MAX_ONGOING_REQUESTS, as it is always enabled if RAY_SERVE_ENABLE_QUEUE_LENGTH_CACHE is enabled.
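
For anyone whose deployment environment still sets either variable, a small check along these lines may help spot leftovers. This is only a sketch: it assumes that after this change Ray Serve no longer reads either variable, so a value left in the environment has no effect. The helper name `find_removed_serve_flags` is hypothetical and is not part of Ray.

```python
import os

# The two environment variables this PR removes.
REMOVED_FLAGS = (
    "RAY_SERVE_ENABLE_QUEUE_LENGTH_CACHE",
    "RAY_SERVE_ENABLE_STRICT_MAX_ONGOING_REQUESTS",
)


def find_removed_serve_flags(environ=None):
    """Return the removed flags that are still set, mapped to their values.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    return {name: env[name] for name in REMOVED_FLAGS if name in env}


if __name__ == "__main__":
    # Assumption: a leftover value is ignored, so we only report it.
    for name, value in find_removed_serve_flags().items():
        print(f"{name}={value!r} is set but the flag was removed; consider deleting it")
```

Running it in the same environment as the Serve application lists any stale settings that can be cleaned up.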

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

…T_MAX_ONGOING_REQUESTS ff

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale akyang-anyscale added the "go" label (add ONLY when ready to merge, run all tests) Mar 24, 2025
@GeneDer GeneDer left a comment

:shipit:

@akyang-anyscale akyang-anyscale changed the title from "[serve] Remove flag to enable/disable RAY_SERVE_ENABLE_QUEUE_LENGTH_CACHE" to "[serve] Remove RAY_SERVE_ENABLE_QUEUE_LENGTH_CACHE flag" Mar 24, 2025
@omatthew98 omatthew98 self-requested a review March 24, 2025 21:09
@zcin zcin merged commit 925b25c into ray-project:master Mar 24, 2025
4 checks passed
@akyang-anyscale akyang-anyscale deleted the alexyang/remove-queuelength-ff branch March 24, 2025 23:56
dentiny pushed a commit to dentiny/ray that referenced this pull request Mar 25, 2025
…51649)

## Why are these changes needed?

This PR removes a feature flag that controls whether the proxy should
use cached replica queue length values for routing. The FF was
[introduced](ray-project#42943) over a year
ago as a way for users to quickly switch back to the previous
implementation. It has been enabled by default for [over a
year](ray-project#43169) now and works as
expected, so let's remove it. Consequently, this PR also removes
`RAY_SERVE_ENABLE_STRICT_MAX_ONGOING_REQUESTS`, as it is always enabled
if `RAY_SERVE_ENABLE_QUEUE_LENGTH_CACHE` is enabled.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
dhakshin32 pushed a commit to dhakshin32/ray that referenced this pull request Mar 27, 2025
…51649)

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: Dhakshin Suriakannu <d_suriakannu@apple.com>
d-miketa pushed a commit to d-miketa/ray that referenced this pull request Mar 28, 2025
…51649)

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
srinathk10 pushed a commit that referenced this pull request Mar 28, 2025
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
Labels
go (add ONLY when ready to merge, run all tests)
5 participants