The potential of the Cell processor for scientific computing
S Williams, J Shalf, L Oliker, S Kamil, P Husbands, K Yelick
Proceedings of the 3rd Conference on Computing Frontiers, 9-20, 2006
OpenTuner: An extensible framework for program autotuning
J Ansel, S Kamil, K Veeramachaneni, J Ragan-Kelley, J Bosboom, ...
Proceedings of the 23rd international conference on Parallel architectures …, 2014
Optimization and performance modeling of stencil computations on modern microprocessors
K Datta, S Kamil, S Williams, L Oliker, J Shalf, K Yelick
SIAM Review 51 (1), 129-159, 2009
An auto-tuning framework for parallel multicore stencil computations
S Kamil, C Chan, L Oliker, J Shalf, S Williams
2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010
Implicit and explicit optimizations for stencil computations
S Kamil, K Datta, S Williams, L Oliker, J Shalf, K Yelick
Proceedings of the 2006 workshop on Memory system performance and …, 2006
Performance optimizations and bounds for sparse matrix-vector multiply
R Vuduc, JW Demmel, KA Yelick, S Kamil, R Nishtala, B Lee
SC'02: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, 26-26, 2002
The tensor algebra compiler
F Kjolstad, S Kamil, S Chou, D Lugato, S Amarasinghe
Proceedings of the ACM on Programming Languages 1 (OOPSLA), 1-29, 2017
Scientific computing kernels on the Cell processor
S Williams, J Shalf, L Oliker, S Kamil, P Husbands, K Yelick
International Journal of Parallel Programming 35 (3), 263-298, 2007
SEJITS: Getting productivity and performance with selective embedded JIT specialization
B Catanzaro, S Kamil, Y Lee, K Asanovic, J Demmel, K Keutzer, J Shalf, ...
Programming Models for Emerging Architectures 1 (1), 1-9, 2009
Impact of modern memory subsystems on cache optimizations for stencil computations
S Kamil, P Husbands, L Oliker, J Shalf, K Yelick
Proceedings of the 2005 Workshop on Memory System Performance, 36-43, 2005
Power efficiency in high performance computing
S Kamil, J Shalf, E Strohmaier
2008 IEEE International Symposium on Parallel and Distributed Processing, 1-8, 2008
Communication-optimal parallel recursive rectangular matrix multiplication
J Demmel, D Eliahu, A Fox, S Kamil, B Lipshitz, O Schwartz, O Spillinger
2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013
Tiramisu: A polyhedral compiler for expressing fast and portable code
R Baghdadi, J Ray, MB Romdhane, E Del Sozzo, A Akkas, Y Zhang, ...
2019 IEEE/ACM International Symposium on Code Generation and Optimization …, 2019
Analyzing ultra-scale application communication requirements for a reconfigurable hybrid interconnect
J Shalf, S Kamil, L Oliker, D Skinner
SC'05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, 17-17, 2005
Analysis of photonic networks for a chip multiprocessor using scientific applications
G Hendry, S Kamil, A Biberman, J Chan, BG Lee, M Mohiyuddin, A Jain, ...
2009 3rd ACM/IEEE International Symposium on Networks-on-Chip, 104-113, 2009
Communication requirements and interconnect optimization for high-end scientific applications
S Kamil, L Oliker, A Pinar, J Shalf
IEEE Transactions on Parallel and Distributed Systems 21 (2), 188-202, 2009
Energy-efficient computing for extreme-scale science
D Donofrio, L Oliker, J Shalf, MF Wehner, C Rowen, J Krueger, S Kamil, ...
Computer 42 (11), 62-71, 2009
Reconfigurable hybrid interconnection for static and dynamic scientific applications
S Kamil, A Pinar, D Gunter, M Lijewski, L Oliker, J Shalf
Proceedings of the 4th International Conference on Computing Frontiers, 183-194, 2007
GraphIt: A high-performance graph DSL
Y Zhang, M Yang, R Baghdadi, S Kamil, J Shun, S Amarasinghe
Proceedings of the ACM on Programming Languages 2 (OOPSLA), 1-30, 2018
Silicon nanophotonic network-on-chip using TDM arbitration
G Hendry, J Chan, S Kamil, L Oliker, J Shalf, LP Carloni, K Bergman
2010 18th IEEE Symposium on High Performance Interconnects, 88-95, 2010
