- 👋 Hi, I’m @yiakwy-xpu-ml-framework-team
- 👀 I’m interested in accelerating the world through algorithms, chips and intelligence (compilers/transpilers, C++ op development and optimization for performance-critical paths, and Python bindings for HPC applications).
- 🌱 I’m currently working on core framework infrastructure and AI compiler technologies.
- 📫 Please drop me a message through yiak.wy@gmail.com
- independent contributor @ HPC Users Alliance
- United States
- (UTC -12:00)
- https://yiakwy.github.io/
- in/lei-wang-1722a28a
- @yiakwy2023
- https://mp.weixin.qq.com/s/AVujFosiC15ZmSRvByYcRQ
- https://mp.weixin.qq.com/s/13NKhY3GccjU9Emz-cRSHQ
Popular repositories
- Tooklkit-remote-pdb-for-pytorch-distributed (Public)
  Debugging torch distributed program
  Python 7
- NV_grouped_gemm (Public, forked from fanshiqing/grouped_gemm)
  PyTorch bindings for CUTLASS grouped GEMM for MoE.
  Cuda 4
- GC-OXFORD-CVPR2021-gbp-poplar (Public, forked from joeaortiz/gbp-poplar)
  Poplar implementation of "Bundle Adjustment on a Graph Processor" (CVPR 2020)
  C++ 2
- NV-DOCA-code-examples (Public, forked from zybzyb1/NVIDIA-DOCA-App-Code-Sharing)
  DOCA Application code sharing Contest