Jianyu Huang
Meta Platforms, Inc.
Verified email at meta.com
Title
Cited by
Year
Deep Learning Recommendation Model for Personalization and Recommendation Systems
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
Cited by 663, 2019
A Study of BFLOAT16 for Deep Learning Training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by 309, 2019
Strassen's algorithm reloaded
J Huang, TM Smith, GM Henry, RA van de Geijn
High Performance Computing, Networking, Storage and Analysis, SC16 …, 2016
Cited by 83, 2016
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by 77, 2022
Performance optimization for the k-nearest neighbors kernel on x86 architectures
CD Yu, J Huang, W Austin, B Xiao, G Biros
Proceedings of the International Conference for High Performance Computing …, 2015
Cited by 43, 2015
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference
D Khudia, J Huang, P Basu, S Deng, H Liu, J Park, M Smelyanskiy
arXiv preprint arXiv:2101.05615, 2021
Cited by 40, 2021
Mahmoud Khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, and Vijay Rao. 2021. Software-Hardware Co-design …
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
arXiv preprint arXiv:2104.05158, 2022
Cited by 39*, 2022
Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR abs/1906.00091 (2019)
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
Cited by 35*, 2019
Generating families of practical fast matrix multiplication algorithms
J Huang, L Rice, DA Matthews, RA van de Geijn
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017
Cited by 35, 2017
High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
arXiv preprint arXiv:2104.05158, 2021
Cited by 31, 2021
Mixed-Precision Embedding Using a Cache
JA Yang, J Huang, J Park, PTP Tang, A Tulloch
arXiv preprint arXiv:2010.11305, 2020
Cited by 22, 2020
Implementing Strassen's Algorithm with CUTLASS on NVIDIA Volta GPUs
J Huang, CD Yu, RA van de Geijn
arXiv preprint arXiv:1808.07984, 2018
Cited by 22, 2018
Strassen's Algorithm for Tensor Contraction
J Huang, DA Matthews, RA van de Geijn
SIAM Journal on Scientific Computing 40 (3), C305-C326, 2018
Cited by 21, 2018
Strassen’s Algorithm Reloaded on GPUs
J Huang, CD Yu, RA van de Geijn
ACM Transactions on Mathematical Software (TOMS) 46 (1), 1-22, 2020
Cited by 19, 2020
BLISlab: A Sandbox for Optimizing GEMM
J Huang, RA van de Geijn
arXiv preprint arXiv:1609.00076, 2016
Cited by 15, 2016
Efficient soft-error detection for low-precision deep learning recommendation models
S Li, J Huang, PTP Tang, D Khudia, J Park, HD Dixit, Z Chen
2022 IEEE International Conference on Big Data (Big Data), 1556-1563, 2022
Cited by 13, 2022
A study of BFLOAT16 for deep learning training (2019)
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by 12, 2019
Low-precision hardware architectures meet recommendation model inference at scale
Z Deng, J Park, PTP Tang, H Liu, J Yang, H Yuen, J Huang, D Khudia, ...
IEEE Micro 41 (5), 93-100, 2021
Cited by 10, 2021
Implementing Strassen’s Algorithm with BLIS
J Huang, TM Smith, GM Henry, RA van de Geijn
arXiv preprint arXiv:1605.01078, 2016
Cited by 10*, 2016
Practical fast matrix multiplication algorithms
J Huang
The University of Texas at Austin, 2018
Cited by 7, 2018