DyNet: The dynamic neural network toolkit G Neubig, C Dyer, Y Goldberg, A Matthews, W Ammar, A Anastasopoulos, ... arXiv preprint arXiv:1701.03980, 2017 | 278 | 2017 |
Understanding objects in detail with fine-grained attributes A Vedaldi, S Mahendran, S Tsogkas, S Maji, R Girshick, J Kannala, ... Proceedings of the IEEE conference on computer vision and pattern …, 2014 | 133 | 2014 |
A taxonomy and review of generalization research in NLP D Hupkes, M Giulianelli, V Dankers, M Artetxe, Y Elazar, T Pimentel, ... Nature Machine Intelligence 5 (10), 1161-1174, 2023 | 121* | 2023 |
Understanding learning dynamics of language models with SVCCA N Saphra, A Lopez arXiv preprint arXiv:1811.00225, 2018 | 116* | 2018 |
Understanding privacy-related questions on stack overflow M Tahaei, K Vaniea, N Saphra Proceedings of the 2020 CHI conference on human factors in computing systems …, 2020 | 101 | 2020 |
The MultiBERTs: BERT reproductions for robustness analysis T Sellam, S Yadlowsky, J Wei, N Saphra, A D'Amour, T Linzen, J Bastings, ... arXiv preprint arXiv:2106.16163, 2021 | 93 | 2021 |
An Algerian Arabic-French code-switched corpus R Cotterell, A Renduchintala, N Saphra, C Callison-Burch Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools …, 2014 | 76 | 2014 |
Pareto probing: Trading off accuracy for complexity T Pimentel, N Saphra, A Williams, R Cotterell arXiv preprint arXiv:2010.02180, 2020 | 63 | 2020 |
Linear connectivity reveals generalization strategies J Juneja, R Bansal, K Cho, J Sedoc, N Saphra arXiv preprint arXiv:2205.12411, 2022 | 50 | 2022 |
Sudden drops in the loss: Syntax acquisition, phase transitions, and simplicity bias in MLMs A Chen, R Shwartz-Ziv, K Cho, ML Leavitt, N Saphra arXiv preprint arXiv:2309.07311, 2023 | 38 | 2023 |
A non-linear structural probe JC White, T Pimentel, N Saphra, R Cotterell arXiv preprint arXiv:2105.10185, 2021 | 31 | 2021 |
A framework for (under) specifying dependency syntax without overloading annotators N Schneider, B O'Connor, N Saphra, D Bamman, M Faruqui, NA Smith, ... arXiv preprint arXiv:1306.2091, 2013 | 31 | 2013 |
LSTMs compose (and learn) bottom-up N Saphra, A Lopez arXiv preprint arXiv:2010.04650, 2020 | 19* | 2020 |
Benchmarking compositionality with formal languages J Valvoda, N Saphra, J Rawski, A Williams, R Cotterell arXiv preprint arXiv:2208.08195, 2022 | 16 | 2022 |
First tragedy, then parse: History repeats itself in the new era of large language models N Saphra, E Fleisig, K Cho, A Lopez arXiv preprint arXiv:2311.05020, 2023 | 15 | 2023 |
AMRICA: an AMR inspector for cross-language alignments N Saphra, A Lopez Proceedings of the 2015 Conference of the North American Chapter of the …, 2015 | 13 | 2015 |
Transcendence: Generative Models Can Outperform The Experts That Train Them E Zhang, V Zhu, N Saphra, A Kleiman, BL Edelman, M Tambe, ... arXiv preprint arXiv:2406.11741, 2024 | 8 | 2024 |
Latent state models of training dynamics MY Hu, A Chen, N Saphra, K Cho arXiv preprint arXiv:2308.09543, 2023 | 5 | 2023 |
Benchmarks as microscopes: A call for model metrology M Saxon, A Holtzman, P West, WY Wang, N Saphra arXiv preprint arXiv:2407.16711, 2024 | 4 | 2024 |
TRAM: Bridging Trust Regions and Sharpness Aware Minimization T Sherborne, N Saphra, P Dasigi, H Peng arXiv preprint arXiv:2310.03646, 2023 | 4 | 2023 |