feat: Update logits bitmask kernel to v3 #3009

Merged
merged 1 commit into from
Mar 26, 2025

Conversation

syuoni
Collaborator

@syuoni syuoni commented Mar 24, 2025

The XGrammar team provided important insights into the kernel workload: in most cases, the bitmask tensor is either almost full (nearly all bit values are 1) or almost empty (nearly all bit values are 0).

Compared with the kernel version on main (v2), this PR introduces the kernel developed in mlc-ai/xgrammar#186 (v3):

  • Kernel v3 shows ~1.3x and ~2.0x speedups at large batch sizes for the almost-full and almost-empty scenarios, respectively.
  • Kernel v3 is slightly slower than v2 in the half-full scenario.

See https://github.com/mlc-ai/xgrammar/tree/main/examples/benchmark#benchmark-apply-token-bitmask-inplace-kernels for more perf numbers. Please see mlc-ai/xgrammar#186 for more background.
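
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of what applying a token bitmask to logits in place means. It is not the CUDA kernel from this PR or from mlc-ai/xgrammar#186. It assumes the mask is packed into 32-bit words and that a 1 bit keeps the token while a 0 bit masks it out; the function name and the skip counter are hypothetical and exist only for illustration. The early exit for all-ones words is likewise an assumption, included to show one reason the fill ratio of the bitmask can affect the amount of work.

```python
import math

BITS_PER_WORD = 32  # assumption: the mask is packed into 32-bit words


def apply_token_bitmask_inplace(logits, bitmask):
    """Set logits[i] to -inf wherever bit i of the packed bitmask is 0.

    logits:  list of floats for one sequence (length = vocab size)
    bitmask: list of ints; bit (i % 32) of word (i // 32) covers token i
    Returns the number of words skipped because every bit was 1
    (a hypothetical counter, kept only to make the skipping visible).
    """
    full_word = (1 << BITS_PER_WORD) - 1
    skipped = 0
    for w, word in enumerate(bitmask):
        word &= full_word  # treat a signed int32 such as -1 as 0xFFFFFFFF
        if word == full_word:
            # Every token in this word is allowed, so nothing is written.
            skipped += 1
            continue
        start = w * BITS_PER_WORD
        end = min(start + BITS_PER_WORD, len(logits))
        for i in range(start, end):
            if not (word >> (i - start)) & 1:
                logits[i] = -math.inf
    return skipped


# Example: 40 tokens, first word fully allowed, second word allows tokens 32 and 34.
logits = [0.5] * 40
bitmask = [0xFFFFFFFF, 0b00000101]
skipped = apply_token_bitmask_inplace(logits, bitmask)
```

In this sketch an almost-full mask causes few writes, while an almost-empty mask causes a write for nearly every token, so the two scenarios named above stress different parts of the work.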

@syuoni syuoni requested review from byshiue and Funatiq March 24, 2025 06:40
@syuoni
Collaborator Author

syuoni commented Mar 24, 2025

/bot run

@syuoni syuoni requested a review from wm2012011492 March 24, 2025 06:41
@niukuo
Collaborator

niukuo commented Mar 24, 2025

PR_Github #253 [ run ] triggered by Bot

@syuoni
Collaborator Author

syuoni commented Mar 24, 2025

/bot run

@niukuo
Collaborator

niukuo commented Mar 24, 2025

PR_Github #292 [ run ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 24, 2025

PR_Github #253 [ run ] completed with state ABORTED
/LLM/main/L0_MergeRequest_PR pipeline #248 completed with status: 'FAILURE'

@niukuo
Collaborator

niukuo commented Mar 24, 2025

PR_Github #292 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #281 completed with status: 'FAILURE'

@syuoni
Collaborator Author

syuoni commented Mar 24, 2025

/bot run

@niukuo
Collaborator

niukuo commented Mar 24, 2025

PR_Github #310 [ run ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 24, 2025

PR_Github #310 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #294 completed with status: 'FAILURE'

@syuoni
Collaborator Author

syuoni commented Mar 25, 2025

/bot run

@niukuo
Collaborator

niukuo commented Mar 25, 2025

PR_Github #347 [ run ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 25, 2025

PR_Github #347 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #319 completed with status: 'FAILURE'

@syuoni
Collaborator Author

syuoni commented Mar 25, 2025

/bot run

@niukuo
Collaborator

niukuo commented Mar 25, 2025

PR_Github #433 [ run ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 25, 2025

PR_Github #433 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #371 completed with status: 'FAILURE'

@syuoni
Collaborator Author

syuoni commented Mar 25, 2025

/bot run

@niukuo
Collaborator

niukuo commented Mar 25, 2025

PR_Github #442 [ run ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 25, 2025

PR_Github #442 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #378 completed with status: 'FAILURE'

@syuoni
Collaborator Author

syuoni commented Mar 26, 2025

/bot run

@niukuo
Collaborator

niukuo commented Mar 26, 2025

PR_Github #491 [ run ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 26, 2025

PR_Github #491 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #423 completed with status: 'SUCCESS'

@byshiue
Collaborator

byshiue commented Mar 26, 2025

/bot reuse-pipeline

@niukuo
Collaborator

niukuo commented Mar 26, 2025

PR_Github #527 [ reuse-pipeline ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 26, 2025

PR_Github #527 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #491 for commit 60fd55d

@byshiue byshiue enabled auto-merge (squash) March 26, 2025 07:00
Collaborator

@byshiue byshiue left a comment

LGTM

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
@byshiue
Collaborator

byshiue commented Mar 26, 2025

/bot reuse-pipeline

@niukuo
Collaborator

niukuo commented Mar 26, 2025

PR_Github #535 [ reuse-pipeline ] triggered by Bot

@niukuo
Collaborator

niukuo commented Mar 26, 2025

PR_Github #535 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #491 for commit ff297de

@byshiue byshiue merged commit f70b439 into NVIDIA:main Mar 26, 2025
2 checks passed
@syuoni syuoni deleted the bitmask-v3 branch March 26, 2025 14:09
wu1du2 pushed a commit to wu1du2/TensorRT-LLM that referenced this pull request May 11, 2025
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>