Liang Luo
Title
Cited by
Year
Incbricks: Toward in-network computation with an in-network cache
M Liu, L Luo, J Nelson, L Ceze, A Krishnamurthy, K Atreya
Proceedings of the Twenty-Second International Conference on Architectural …, 2017
Cited by 169, 2017
High-performance, distributed training of large-scale deep learning recommendation models
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
arXiv preprint arXiv:2104.05158, 2021
Cited by 142*, 2021
Parameter hub: a rack-scale parameter server for distributed deep neural network training
L Luo, J Nelson, L Ceze, A Phanishayee, A Krishnamurthy
Proceedings of the ACM Symposium on Cloud Computing, 41-54, 2018
Cited by 135, 2018
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel. CoRR abs/2304.11277 (2023)
Y Zhao, A Gu, R Varma, L Luo, CC Huang, M Xu, L Wright, H Shojanazeri, ...
Cited by 90*, 2023
PLink: Discovering and Exploiting Locality for Accelerated Distributed Training on the public Cloud.
L Luo, P West, J Nelson, A Krishnamurthy, L Ceze
Proceedings of the 3rd MLSys Conference, 2020
Cited by 67*, 2020
Laser: Light, accurate sharing detection and repair
L Luo, A Sriraman, B Fugate, S Hu, G Pokam, CJ Newburn, J Devietti
2016 IEEE International Symposium on High Performance Computer Architecture …, 2016
Cited by 39, 2016
Troubleshooting Transiently-Recurring Errors in Production Systems with Blame-Proportional Logging
L Luo, S Nath, LR Sivalingam, M Musuvathi, L Ceze
2018 USENIX Annual Technical Conference (USENIX ATC 18), 321-334, 2018
Cited by 21, 2018
Motivating in-network aggregation for distributed deep neural network training
L Luo, M Liu, J Nelson, L Ceze, A Phanishayee, A Krishnamurthy
Workshop on Approximate Computing Across the Stack, 2017
Cited by 17, 2017
Parameter box: High performance parameter servers for efficient distributed deep neural network training
L Luo, J Nelson, L Ceze, A Phanishayee, A Krishnamurthy
MLSys 2018, 2018
Cited by 14, 2018
DHEN: A deep and hierarchical ensemble network for large-scale click-through rate prediction
B Zhang, L Luo, X Liu, J Li, Z Chen, W Zhang, X Wei, Y Hao, M Tsang, ...
arXiv preprint arXiv:2203.11014, 2022
Cited by 11, 2022
NetHint: White-Box networking for Multi-Tenant data centers
J Chen, H Zhang, W Zhang, L Luo, J Chase, I Stoica, D Zhuo
19th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2022
Cited by 6, 2022
Pre-train and search: Efficient embedding table sharding with pre-trained neural cost models
D Zha, L Feng, L Luo, B Bhushanam, Z Liu, Y Hu, J Nie, Y Huang, Y Tian, ...
Proceedings of Machine Learning and Systems 5, 2023
Cited by 5, 2023
Srifty: Swift and thrifty distributed neural network training on the cloud
L Luo, P West, P Patel, A Krishnamurthy, L Ceze
Proceedings of Machine Learning and Systems 4, 833-847, 2022
Cited by 4, 2022
Accelerating SpMM kernel with cache-first edge sampling for graph neural networks
CY Lin, L Luo, L Ceze
arXiv preprint arXiv:2104.10716, 2021
Cited by 3, 2021
Cloud collectives: Towards cloud-aware collectives for ML workloads with rank reordering
L Luo, J Nelson, A Krishnamurthy, L Ceze
arXiv preprint arXiv:2105.14088, 2021
Cited by 2, 2021
Wukong: Towards a Scaling Law for Large-Scale Recommendation
B Zhang, L Luo, Y Chen, J Nie, X Liu, D Guo, Y Zhao, S Li, Y Hao, Y Yao, ...
arXiv preprint arXiv:2403.02545, 2024
2024
Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large-Scale Recommendation
L Luo, B Zhang, M Tsang, Y Ma, CH Chu, Y Chen, S Li, Y Hao, Y Zhao, ...
arXiv preprint arXiv:2403.00877, 2024
2024
P4SGD: Programmable Switch Enhanced Model-Parallel Training on Generalized Linear Models on Distributed FPGAs
H Huang, Y Li, J Sun, X Zhu, J Zhang, L Luo, J Li, Z Wang
IEEE Transactions on Parallel and Distributed Systems, 2023
2023
Characterizing and Taming Resolution in Convolutional Neural Networks
E Yan, L Luo, L Ceze
2021 IEEE International Symposium on Workload Characterization (IISWC), 189-200, 2021
2021
Towards More Efficient Communication for Distributed Learning Systems
L Luo
University of Washington, 2020
2020