FTI: High performance fault tolerance interface for hybrid systems L Bautista-Gomez, S Tsuboi, D Komatitsch, F Cappello, N Maruyama, ... Proceedings of 2011 international conference for high performance computing …, 2011 | 438 | 2011 |
Lightweight silent data corruption detection based on runtime data analysis for HPC applications E Berrocal, L Bautista-Gomez, S Di, Z Lan, F Cappello Proceedings of the 24th International Symposium on High-Performance Parallel …, 2015 | 119* | 2015 |
Optimization of multi-level checkpoint model for large scale HPC applications S Di, MS Bouguerra, L Bautista-Gomez, F Cappello 2014 IEEE 28th international parallel and distributed processing symposium …, 2014 | 112 | 2014 |
Unprotected computing: A large-scale study of DRAM raw error rate on a supercomputer L Bautista-Gomez, F Zyulkyarov, O Unsal, S McIntosh-Smith SC'16: Proceedings of the International Conference for High Performance …, 2016 | 98 | 2016 |
GPGPUs: How to combine high computational power with high reliability LB Gomez, F Cappello, L Carro, N DeBardeleben, B Fang, S Gurumurthi, ... 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-9, 2014 | 83 | 2014 |
Improving the computing efficiency of HPC systems using a combination of proactive and preventive checkpointing MS Bouguerra, A Gainaru, LB Gomez, F Cappello, S Matsuoka, ... 2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013 | 77 | 2013 |
Distributed diskless checkpoint for large scale systems LAB Gomez, N Maruyama, F Cappello, S Matsuoka 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid …, 2010 | 69 | 2010 |
Detecting and correcting data corruption in stencil applications through multivariate interpolation L Bautista-Gomez, F Cappello 2015 IEEE International Conference on Cluster Computing, 595-602, 2015 | 50 | 2015 |
Reducing waste in extreme scale systems through introspective analysis L Bautista-Gomez, A Gainaru, S Perarnau, D Tiwari, S Gupta, ... 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2016 | 48 | 2016 |
Low-overhead diskless checkpoint for hybrid computing systems LB Gomez, A Nukada, N Maruyama, F Cappello, S Matsuoka 2010 International Conference on High Performance Computing, 1-10, 2010 | 43* | 2010 |
Improving floating point compression through binary masks LAB Gomez, F Cappello 2013 IEEE international conference on big data, 326-331, 2013 | 37 | 2013 |
Spatial support vector regression to detect silent errors in the exascale era O Subasi, S Di, L Bautista-Gomez, P Balaprakash, O Unsal, J Labarta, ... 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2016 | 31 | 2016 |
Optimization of a multilevel checkpoint model with uncertain execution scales S Di, L Bautista-Gomez, F Cappello SC'14: Proceedings of the International Conference for High Performance …, 2014 | 31 | 2014 |
Resource analysis of Ethereum 2.0 clients M Cortes-Goicoechea, L Franceschini, L Bautista-Gomez 2021 3rd Conference on Blockchain Research & Applications for Innovative …, 2021 | 25 | 2021 |
A study of checkpointing in large scale training of deep neural networks E Rojas, AN Kahira, E Meneses, LB Gomez, RM Badia arXiv preprint arXiv:2012.00825, 2020 | 25 | 2020 |
Exploiting spatial smoothness in HPC applications to detect silent data corruption L Bautista-Gomez, F Cappello 2015 IEEE 17th International Conference on High Performance Computing and …, 2015 | 25 | 2015 |
Adaptive performance-constrained in situ visualization of atmospheric simulations M Dorier, R Sisneros, LB Gomez, T Peterka, L Orf, L Rahmani, G Antoniu, ... 2016 IEEE International Conference on Cluster Computing (CLUSTER), 269-278, 2016 | 24 | 2016 |
Analysis of the tradeoffs between energy and run time for multilevel checkpointing P Balaprakash, LAB Gomez, MS Bouguerra, SM Wild, F Cappello, ... High Performance Computing Systems. Performance Modeling, Benchmarking, and …, 2015 | 19 | 2015 |
Detecting silent data corruption for extreme-scale MPI applications L Bautista-Gomez, F Cappello Proceedings of the 22nd European MPI Users' Group Meeting, 1-10, 2015 | 18 | 2015 |
Scalable Reed-Solomon-based reliable local storage for HPC applications on IaaS clouds LB Gomez, B Nicolae, N Maruyama, F Cappello, S Matsuoka Euro-Par 2012 Parallel Processing: 18th International Conference, Euro-Par …, 2012 | 18 | 2012 |