ParCIS Lab, BUPT
Popular repositories
- FlashSparse Public
FlashSparse significantly reduces the computation redundancy for unstructured sparsity (for SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse is accepted by…
- DNN-cpp-proxies Public
C++/MPI proxies for distributed training of deep neural networks.
C++ 1
Repositories
- MatmulSwigluCustom Public
- FlashSparse Public
FlashSparse significantly reduces computation redundancy for unstructured sparsity (in SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse has been accepted to PPoPP 2025.
- Chimera Public
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
- Ok-Topk Public
Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (communication volume below 6k, which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is proved theoretically and demonstrated empirically.
- Magicube Public
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) in deep learning on Tensor Cores.
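The FlashSparse and Magicube descriptions both name two kernels, SpMM and SDDMM. As a reference for what those operations compute, here is a minimal pure-Python sketch. It is not code from either repository and says nothing about their Tensor Core mappings; the function names (`spmm_csr`, `sddmm_csr`) and the CSR layout (`indptr`, `indices`, `data`) are assumptions chosen for illustration. This SDDMM variant samples the dense product at the positions of the sparsity pattern only; some definitions additionally scale each sampled value by the sparse matrix entry.

```python
def spmm_csr(indptr, indices, data, dense):
    """SpMM: multiply a sparse matrix A (CSR form) by a dense matrix B."""
    n_rows = len(indptr) - 1
    n_cols = len(dense[0]) if dense else 0
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Visit only the stored nonzeros of row i.
        for p in range(indptr[i], indptr[i + 1]):
            k, a = indices[p], data[p]
            for j in range(n_cols):
                out[i][j] += a * dense[k][j]
    return out


def sddmm_csr(indptr, indices, lhs, rhs):
    """SDDMM: evaluate (lhs @ rhs) only at positions in the sparsity pattern.

    Returns one value per stored position, in CSR order.
    """
    inner = len(rhs)
    values = []
    for i in range(len(indptr) - 1):
        for p in range(indptr[i], indptr[i + 1]):
            j = indices[p]
            values.append(sum(lhs[i][k] * rhs[k][j] for k in range(inner)))
    return values
```

Both loops touch only the stored nonzeros, which is the property that makes these kernels cheaper than their dense counterparts when the matrix is sparse.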