Naveen Mellempudi
Verified email at amd.com
Title
Cited by
Year
A study of BFLOAT16 for deep learning training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 331 | Year: 2019
Mixed precision training of convolutional neural networks using integer operations
D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ...
arXiv preprint arXiv:1802.00930, 2018
Cited by: 191 | Year: 2018
Ternary neural networks with fine-grained quantization
N Mellempudi, A Kundu, D Mudigere, D Das, B Kaul, P Dubey
arXiv preprint arXiv:1705.01462, 2017
Cited by: 137 | Year: 2017
Performing power management in a multicore processor
VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...
US Patent 10,234,930, 2019
Cited by: 129 | Year: 2019
FP8 formats for deep learning
P Micikevicius, D Stosic, N Burgess, M Cornea, P Dubey, R Grisenthwaite, ...
arXiv preprint arXiv:2209.05433, 2022
Cited by: 103 | Year: 2022
Mixed precision training with 8-bit floating point
N Mellempudi, S Srinivasan, D Das, B Kaul
arXiv preprint arXiv:1905.12334, 2019
Cited by: 74 | Year: 2019
Dynamic precision management for integer deep learning primitives
N Mellempudi, D Mudigere, D Das, S Sridharan
US Patent 10,643,297, 2020
Cited by: 48 | Year: 2020
Optimized compute hardware for machine learning operations
D Das, R Gramunt, M Smelyanskiy, J Corbal, D Mudigere, NK Mellempudi, ...
US Patent 10,776,699, 2020
Cited by: 47 | Year: 2020
Scaling half-precision floating point tensors for training deep neural networks
N Mellempudi, D Das
US Patent 11,501,139, 2022
Cited by: 45 | Year: 2022
On scale-out deep learning training for cloud and HPC
S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ...
arXiv preprint arXiv:1801.08030, 2018
Cited by: 35 | Year: 2018
Mixed low-precision deep learning inference using dynamic fixed point
N Mellempudi, A Kundu, D Das, D Mudigere, B Kaul
arXiv preprint arXiv:1701.08978, 2017
Cited by: 28 | Year: 2017
Performing power management in a multicore processor
VW Lee, D Kim, Y Bai, S Ji, S Li, DD Kalamkar, NK Mellempudi
US Patent 9,910,481, 2018
Cited by: 24 | Year: 2018
Incremental precision networks using residual inference and fine-grain quantization
A Kundu, N Mellempudi, D Mudigere, D Das
US Patent 11,556,772, 2023
Cited by: 20 | Year: 2023
Ternary residual networks
A Kundu, K Banerjee, N Mellempudi, D Mudigere, D Das, B Kaul, ...
arXiv preprint arXiv:1707.04679, 2017
Cited by: 15 | Year: 2017
Conversion hardware mechanism
N Mellempudi, D Das, MEI Chunhui, K Wong, DD Kalamkar, HH Jiang, ...
US Patent 11,494,163, 2022
Cited by: 14 | Year: 2022
Dynamic precision management for integer deep learning primitives
N Mellempudi, D Mudigere, D Das, S Sridharan
US Patent 11,321,805, 2022
Cited by: 9 | Year: 2022
Technologies for scaling deep learning training
NK Mellempudi, S Sridharan, D Mudigere, D Das
US Patent 11,068,780, 2021
Cited by: 7 | Year: 2021
High performance scalable FPGA accelerator for deep neural networks
S Srinivasan, P Janedula, S Dhoble, S Avancha, D Das, N Mellempudi, ...
arXiv preprint arXiv:1908.11809, 2019
Cited by: 5 | Year: 2019
Efficient post-training quantization with FP8 formats
H Shen, N Mellempudi, X He, Q Gao, C Wang, M Wang
Proceedings of Machine Learning and Systems 6, 483-498, 2024
Cited by: 4 | Year: 2024
Performing power management in a multicore processor
VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...
US Patent 10,775,873, 2020
Cited by: 4 | Year: 2020
Articles 1–20