sgl-project / sglang Public


Issues: sgl-project/sglang

Development Roadmap (2025 H1)

#4042 opened Mar 4, 2025 by zhyncs

Open 8

DeepSeek-R1 Optimization Option Ablations

#3956 opened Feb 28, 2025 by m0g1cian

Open 31

Labels 32 Milestones 0


386 Open 1,066 Closed


Issues list

[Track] VLM accuracy in MMMU benchmark

Labels: good first issue, visIon-LM

#4456 opened Mar 15, 2025 by yizhang2077

[Feature] Please add more operation examples to documentation of frontend tutorials

#4443 opened Mar 15, 2025 by Nutingnon

2 tasks done

[Bug] logprobs can only be set to False in the /v1/chat/completions endpoint

#4440 opened Mar 14, 2025 by yuhsaun-t

5 tasks done

[Feature] enable SGLang custom all reduce by default

Labels: good first issue, help wanted, high priority, performance

#4436 opened Mar 14, 2025 by zhyncs

2 tasks

[Accuracy] [Online Quantization] Llama 1B FP16/FP8/W8A8_FP8 accuracy

Labels: bug, high priority, quant

#4434 opened Mar 14, 2025 by hebiao064

[Bug] Not able to load model with bitsandbytes and lora adapters together

#4431 opened Mar 14, 2025 by abpani

5 tasks done

[Bug] does not support InternVL2_5-78B-MPO-AWQ

#4430 opened Mar 14, 2025 by bltcn

5 tasks done

[Feature] NVIDIA GeForce RTX 50 Series with CUDA capability sm_120

#4429 opened Mar 14, 2025 by shahizat

2 tasks done

[Bug] Docker run lmsysorg/sglang:v0.4.4.post1-rocm630 Error: no TensileLibrary_lazy_gfx90a.dat file.

#4421 opened Mar 14, 2025 by luciaganlulu

5 tasks done

[Feature] add support for separated VIT embeddings for VLM models

#4420 opened Mar 14, 2025 by qibaoyuan

2 tasks done

[Feature] Can you support the VLA series models? For example, openVLA.

#4414 opened Mar 14, 2025 by psong123

2 tasks done

[Usage] What's the best practice of deploying DeepSeekV3 using sglang?

Labels: deepseek

#4409 opened Mar 14, 2025 by CUHKSZzxy

[Bug] Error reported during benchmark testing: requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

#4407 opened Mar 14, 2025 by winni0

1 of 5 tasks

[Bug] Behavior difference in use_ragged between versions 0.4.2.post3 and 0.4.4 for Gemma2-9b-it model

Labels: flashinfer

#4406 opened Mar 14, 2025 by lcw99

5 tasks done

[Bug] Can't benchmark deepseek_v2 with dummy weights

#4405 opened Mar 14, 2025 by knukiban

5 tasks done

[Bug] When starting with dp, forward_batch.global_num_tokens_gpu is None.

#4404 opened Mar 14, 2025 by zzk2021

2 of 5 tasks

[Feature] Add QWQ’s Benchmark Code for Inference Performance Evaluation

Labels: documentation, good first issue, help wanted

#4394 opened Mar 13, 2025 by richardodliu

2 tasks done

SGlang blocked！！！！

#4389 opened Mar 13, 2025 by Dada-Cloudzxy

[Feature] integrate flash-attention

Labels: high priority

#4385 opened Mar 13, 2025 by zhyncs

2 tasks

[Feature] integrate FlashMLA

Labels: high priority

#4384 opened Mar 13, 2025 by zhyncs

2 tasks

[Bug] sglang Error Failed to initialize the TMA descriptor

#4382 opened Mar 13, 2025 by JackMeiLong

3 of 5 tasks

When doing inference with multiple gpus, does sglang auto detect and use nccl lib to increase QTS?

#4380 opened Mar 13, 2025 by chuangzhidan

[Feature] Support tool calls for DeepSeek.

#4379 opened Mar 13, 2025 by tingjun-cs

2 tasks

[Bug] please fix, some dp scheduler can't receive message from tokenizer_manager

#4378 opened Mar 13, 2025 by fmantianxing

[Bug] Use torch.inference_mode instead of torch.no_grad

#4366 opened Mar 13, 2025 by Alcanderian

5 tasks done
