add optimze of dsv3 #970
Conversation
- Please add a PR description
- Run `bash format.sh` locally to fix the lint failures
Signed-off-by: wangxiaoxin (A) <w00664509@china.huawei.com>
@@ -225,8 +225,7 @@ def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            enable_force_load_balance = False
        num_tokens, hidden_dim = hidden_states.shape

        if self.n_shared_experts is not None:
            shared_output = self.shared_experts(hidden_states)
        old_hidden_states = hidden_states.detach()
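For context on the snippet above (a generic sketch, not the actual vLLM implementation; class and layer names here are hypothetical): in DeepSeek-style MoE layers, the shared experts run on every token alongside the routed experts, so the shared-expert forward has no data dependency on the routing path.

```python
import torch
import torch.nn as nn

class MoEWithSharedExperts(nn.Module):
    """Generic sketch of an MoE block with shared experts: the shared
    experts process every token, independently of the routed-expert path."""

    def __init__(self, hidden_dim: int = 16):
        super().__init__()
        # Stand-ins for the real expert networks.
        self.shared_experts = nn.Linear(hidden_dim, hidden_dim)
        self.routed_experts = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Shared-expert output depends only on the input hidden states,
        # so it can be computed before (or interleaved with) routing.
        shared_output = self.shared_experts(hidden_states)
        # Placeholder for the top-k routed-expert computation.
        routed_output = self.routed_experts(hidden_states)
        return shared_output + routed_output

x = torch.randn(3, 16)
out = MoEWithSharedExperts()(x)
assert out.shape == x.shape
```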
Can you upload performance profiling for this part? To my knowledge, detaching this variable from the PyTorch graph doesn't actually trigger parallel execution.
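As a hedged illustration of the reviewer's point: `Tensor.detach()` only removes the result from the autograd graph and returns a view sharing the same storage; it does not copy data or dispatch any asynchronous or parallel work.

```python
import torch

x = torch.randn(4, 8, requires_grad=True)
y = x.detach()

# detach() returns a view on the same storage with autograd tracking
# removed; no new computation is launched.
assert y.data_ptr() == x.data_ptr()
assert y.requires_grad is False

# Mutating the detached view is visible through the original tensor,
# confirming the two share memory rather than being separate results.
y[0, 0] = 42.0
assert x[0, 0].item() == 42.0
```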
_info "====> Start simple_test"
simple_test
_info "====> Start quickstart_offline_test"
quickstart_offline_test
_info "====> Start quickstart_online_test"
quickstart_online_test
_info "====> Start quickstart_offline_test_topk"
We should add the offline test in Python by setting os.environ there, rather than here:
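A minimal sketch of what the reviewer suggests (the helper function and flag value are assumptions): set the environment variable inside the Python test itself, before the code under test reads it, instead of exporting it from the shell script.

```python
import os

# Set the feature flag before importing/initializing the code under test,
# since env-gated options are typically read once at import or init time.
os.environ["VLLM_ENABLE_TOPK_OPTIMZE"] = "1"

def topk_optimize_enabled() -> bool:
    # Hypothetical helper mirroring how such a flag is usually parsed.
    return os.environ.get("VLLM_ENABLE_TOPK_OPTIMZE", "0") == "1"

def test_topk_optimize_flag():
    assert topk_optimize_enabled()

test_topk_optimize_flag()
```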
What this PR does / why we need it?
Optimizes the performance of the calculation logic in the sampler and DeepseekV2.
Does this PR introduce any user-facing change?
Added the VLLM_ENABLE_TOPK_OPTIMZE config option to the sampler.
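The PR description does not detail the optimization itself; as a generic sketch (not this PR's implementation), a top-k path in a sampler typically restricts the softmax and multinomial draw to the k highest-logit candidates:

```python
import torch

def sample_topk(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Generic top-k sampling sketch: draw only from the k highest-logit
    tokens. The actual optimization gated by VLLM_ENABLE_TOPK_OPTIMZE in
    this PR is not described in the description and may differ."""
    topk_vals, topk_idx = torch.topk(logits, k, dim=-1)
    # Softmax over only the k candidates, then sample among them.
    probs = torch.softmax(topk_vals, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)
    # Map the sampled position back to the original vocabulary index.
    return topk_idx.gather(-1, choice).squeeze(-1)

logits = torch.tensor([[0.1, 5.0, 0.2, 4.9]])
token = sample_topk(logits, k=2)
assert token.item() in (1, 3)  # only the top-2 tokens can be drawn
```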
How was this patch tested?
pytest test_sampler.py