Shijie Cao

Cited by

	All	Since 2019
Citations	441	440
h-index	6	6
i10-index	5	5

Citations per year: 2019: 22, 2020: 65, 2021: 104, 2022: 108, 2023: 113, 2024: 26

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Chen Zhang - Shanghai Jiao Tong University - Verified email at sjtu.edu.cn
Wencong Xiao - Alibaba Group - Verified email at alibaba-inc.com
Lintao Zhang - Microsoft Research Asia - Verified email at microsoft.com
Lingxiao Ma - Senior Researcher, Microsoft Research - Verified email at pku.edu.cn
Zhuliang Yao - Tsinghua University - Verified email at mails.tsinghua.edu.cn
Fan Yang - Microsoft Research - Verified email at microsoft.com
Derek Chiou - Professor, ECE, UT Austin and Partner Architect, Microsoft Azure - Verified email at ece.utexas.edu
Xu Ningyi - Microsoft Research

Shijie Cao

Microsoft Research Asia

Verified email at microsoft.com

Efficient Deep Learning, Deep Learning System, Computer Architecture


Title	Cited by	Year
Efficient and effective sparse LSTM on FPGA with bank-balanced sparsity - S Cao, C Zhang, Z Yao, W Xiao, L Nie, D Zhan, Y Liu, M Wu, L Zhang - Proceedings of the 2019 ACM/SIGDA International Symposium on Field …, 2019	189	2019
Balanced sparsity for efficient DNN inference on GPU - Z Yao, S Cao, W Xiao, C Zhang, L Nie - Proceedings of the AAAI conference on artificial intelligence 33 (01), 5676-5683, 2019	118	2019
SeerNet: Predicting convolutional neural network feature-map sparsity through low-bit quantization - S Cao, L Ma, W Xiao, C Zhang, Y Liu, L Zhang, L Nie, Z Yang - Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019	80	2019
Dense-to-sparse gate for mixture-of-experts - X Nie, S Cao, X Miao, L Ma, J Xue, Y Miao, Z Yang, Z Yang, CUI Bin	22	2021
EvoMoE: An evolutional mixture-of-experts training framework via dense-to-sparse gate - X Nie, X Miao, S Cao, L Ma, Q Liu, J Xue, Y Miao, Y Liu, Z Yang, B Cui - arXiv preprint arXiv:2112.14397, 2021	14	2021
Integer or floating point? New outlooks for low-bit quantization on large language models - Y Zhang, L Zhao, S Cao, W Wang, T Cao, F Yang, M Yang, S Zhang, N Xu - arXiv preprint arXiv:2305.12356, 2023	9	2023
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference - R Hwang, J Wei, S Cao, C Hwang, X Tang, T Cao, M Yang, M Rhu - arXiv preprint arXiv:2308.12066, 2023	3	2023
Efficient GPU kernels for N:M-sparse weights in deep learning - B Lin, N Zheng, L Wang, S Cao, L Ma, Q Zhang, Y Zhu, T Cao, J Xue, ... - Proceedings of Machine Learning and Systems 5, 2023	2	2023
AFPQ: Asymmetric Floating Point Quantization for LLMs - Y Zhang, S Zhang, S Cao, D Du, J Wei, T Cao, N Xu - arXiv preprint arXiv:2311.01792, 2023	1	2023
NN-Stretch: Automatic Neural Network Branching for Parallel Inference on Heterogeneous Multi-Processors - J Wei, T Cao, S Cao, S Jiang, S Fu, M Yang, Y Zhang, Y Liu - Proceedings of the 21st Annual International Conference on Mobile Systems …, 2023	1	2023
Adam accumulation to reduce memory footprints of both activations and gradients for large-scale DNN training - Y Zhang, Y Han, S Cao, G Dai, Y Miao, T Cao, F Yang, N Xu - arXiv preprint arXiv:2305.19982, 2023	1	2023
Accurate and structured pruning for efficient automatic speech recognition - H Jiang, LL Zhang, Y Li, Y Wu, S Cao, T Cao, Y Yang, J Li, M Yang, L Qiu - arXiv preprint arXiv:2305.19549, 2023	1	2023
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation - D Du, Y Zhang, S Cao, J Guo, T Cao, X Chu, N Xu - arXiv preprint arXiv:2402.10631, 2024		2024
FlexSaaS: A Reconfigurable Accelerator for Web Search Selection - S Cao, L Nie, D Zhan, W Wang, N Xu, R Das, M Wu, L Zhang, D Chiou - ACM Transactions on Reconfigurable Technology and Systems (TRETS) 12 (1), 1-20, 2019		2019
The Case for Learning Machine Language - G Liu, CJM Liang, S Cao, S Lu, L van Doorn
