ByeongWook Kim
NAVER CLOVA
Verified email at navercorp.com
Title
Cited by
Year
LUT-GEMM: Quantized matrix multiplication based on LUTs for efficient inference in large-scale generative language models
G Park, B Park, M Kim, S Lee, J Kim, B Kwon, SJ Kwon, B Kim, Y Lee, ...
arXiv preprint arXiv:2206.09557, 2022
68 · 2022
Structured compression by weight encryption for unstructured pruning and quantization
SJ Kwon, D Lee, B Kim, P Kapoor, B Park, GY Wei
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020
37 · 2020
BiQGEMM: Matrix multiplication with lookup table for binary-coding-based quantized DNNs
Y Jeon, B Park, SJ Kwon, B Kim, J Yun, D Lee
SC20: International Conference for High Performance Computing, Networking …, 2020
28 · 2020
Extremely low bit transformer quantization for on-device neural machine translation
I Chung, B Kim, Y Choi, SJ Kwon, Y Jeon, B Park, S Kim, D Lee
arXiv preprint arXiv:2009.07453, 2020
27 · 2020
AlphaTuning: Quantization-aware parameter-efficient adaptation of large-scale pre-trained language models
SJ Kwon, J Kim, J Bae, KM Yoo, JH Kim, B Park, B Kim, JW Ha, N Sung, ...
arXiv preprint arXiv:2210.03858, 2022
22 · 2022
DeepTwist: Learning model compression via occasional weight distortion
D Lee, P Kapoor, B Kim
arXiv preprint arXiv:1810.12823, 2018
22 · 2018
Learning low-rank approximation for CNNs
D Lee, SJ Kwon, B Kim, GY Wei
arXiv preprint arXiv:1905.10145, 2019
19 · 2019
FleXOR: Trainable fractional quantization
D Lee, SJ Kwon, B Kim, Y Jeon, B Park, J Yun
Advances in Neural Information Processing Systems 33, 1311-1321, 2020
14 · 2020
Retraining-based iterative weight quantization for deep neural networks
D Lee, B Kim
arXiv preprint arXiv:1805.11233, 2018
12 · 2018
Network pruning for low-rank binary indexing
D Lee, SJ Kwon, B Kim, P Kapoor, GY Wei
arXiv preprint arXiv:1905.05686, 2019
6 · 2019
Computation-efficient quantization method for deep neural networks
P Kapoor, D Lee, B Kim, S Lee
5 · 2018
Rethinking channel dimensions to isolate outliers for low-bit weight quantization of large language models
JH Heo, J Kim, B Kwon, B Kim, SJ Kwon, D Lee
arXiv preprint arXiv:2309.15531, 2023
3 · 2023
Winning both the accuracy of floating point activation and the simplicity of integer arithmetic
Y Kim, J Jang, J Lee, J Park, J Kim, B Kim, SJ Kwon, D Lee
The Eleventh International Conference on Learning Representations, 2022
3 · 2022
Encoding weights of irregular sparsity for fixed-to-fixed model compression
B Park, SJ Kwon, D Oh, B Kim, D Lee
arXiv preprint arXiv:2105.01869, 2021
3 · 2021
Q-Rater: Non-convex optimization for post-training uniform quantization
B Kim, D Lee, Y Ro, Y Jeon, SJ Kwon, B Park, D Oh
arXiv preprint arXiv:2105.01868, 2021
2 · 2021
No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization
JY Yang, B Kim, J Bae, B Kwon, G Park, E Yang, SJ Kwon, D Lee
arXiv preprint arXiv:2402.18096, 2024
1 · 2024
Post-training weighted quantization of neural networks for language models
SJ Kwon, D Lee, Y Jeon, B Kim, BS Park, Y Ro
1 · 2020
HyperCLOVA X Technical Report
KM Yoo, J Han, S In, H Jeon, J Jeong, J Kang, H Kim, KM Kim, M Kim, ...
arXiv preprint arXiv:2404.01954, 2024
2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
S Woo, B Park, B Kim, M Jo, S Kwon, D Jeon, D Lee
arXiv preprint arXiv:2402.17812, 2024
2024
Modulating Regularization Frequency for Efficient Compression-Aware Model Training
D Lee, SJ Kwon, B Kim, J Yun, B Park, Y Jeon
arXiv preprint arXiv:2105.01875, 2021
2021