Analytical modeling is enough for high-performance BLIS TM Low, FD Igual, TM Smith, ES Quintana-Orti ACM Transactions on Mathematical Software (TOMS) 43 (2), 1-18, 2016 | 181 | 2016 |
Anatomy of high-performance many-threaded matrix multiplication TM Smith, R Van De Geijn, M Smelyanskiy, JR Hammond, FG Van Zee 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 170 | 2014 |
The BLIS framework: Experiments in portability FG Van Zee, TM Smith, B Marker, TM Low, RAVD Geijn, FD Igual, ... ACM Transactions on Mathematical Software (TOMS) 42 (2), 1-19, 2016 | 127 | 2016 |
Strassen's algorithm reloaded J Huang, TM Smith, GM Henry, RA Van De Geijn SC'16: Proceedings of the International Conference for High Performance …, 2016 | 88 | 2016 |
Implementing high-performance complex matrix multiplication via the 3m and 4m methods FG Van Zee, TM Smith ACM Transactions on Mathematical Software (TOMS) 44 (1), 1-36, 2017 | 42 | 2017 |
A Tight I/O Lower Bound for Matrix Multiplication TM Smith, B Lowery, J Langou, RA van de Geijn arXiv preprint arXiv:1702.02017, 2019 | 20 | 2019 |
Compressive sensing using iterative hard thresholding with low precision data representation: Theory and applications NM Gürel, K Kara, A Stojanov, T Smith, T Lemmin, D Alistarh, M Püschel, ... IEEE Transactions on Signal Processing 68, 4268-4282, 2020 | 13 | 2020 |
The MOMMS family of matrix multiplication algorithms TM Smith, RA van de Geijn arXiv preprint arXiv:1904.05717, 2019 | 11 | 2019 |
Implementing Strassen's algorithm with BLIS J Huang, TM Smith, GM Henry, RA van de Geijn arXiv preprint arXiv:1605.01078, 2016 | 10 | 2016 |
Pushing the bounds for matrix-matrix multiplication TM Smith, RA van de Geijn CoRR abs/1702.02017, 2017 | 9 | 2017 |
Fast quantized arithmetic on x86: Trading compute for data movement A Stojanov, TM Smith, D Alistarh, M Püschel 2018 IEEE International Workshop on Signal Processing Systems (SiPS), 349-354, 2018 | 8 | 2018 |
Theory and practice of classical matrix-matrix multiplication for hierarchical memory architectures TM Smith | 6 | 2018 |
Toward ABFT for BLIS GEMM TM Smith, RA van de Geijn, M Smelyanskiy, ES Quintana-Orti Tech. Rep. TR-15-05. The University of Texas at Austin, 2015 | 6 | 2015 |
Automating the last-mile for high performance dense linear algebra RM Veras, TM Low, TM Smith, R van de Geijn, F Franchetti arXiv preprint arXiv:1611.08035, 2016 | 5 | 2016 |
Analytical models for the BLIS framework TM Low, FD Igual, TM Smith, ES Quintana-Ortí ACM Transactions on Mathematical Software, 2015 | 5 | 2015 |
Implementing level-3 BLAS with BLIS: Early experience FG Van Zee, T Smith, FD Igual, M Smelyanskiy, X Zhang, M Kistler, ... The University of Texas at Austin, Department of Computer Science, FLAME …, 2013 | 5 | 2013 |
Opportunities for Parallelism in Matrix Multiplication TM Smith, RA van de Geijn, M Smelyanskiy, J Hammond, FG Van Zee Univ. Texas Technical Report, 2013 | 4* | 2013 |
Lowering barriers into HPC through open education RA van de Geijn, J Huang, ME Myers, DN Parikh, TM Smith EduHPC-17: Workshop on Education for High-Performance Computing, 2017 | 3 | 2017 |
Code generation to aid parallel code development B Marker, T Smith, D Batory, F Van Zee, R Van de Geijn Technical report TR-14-08, The University of Texas at Austin, Department of …, 2014 | 2 | 2014 |
Inducing complex matrix multiplication via the 3m and 4m methods FG Van Zee, TM Smith FLAME Working Note #81 | 1 | 2016 |