Self-training and pre-training are complementary for speech recognition Q Xu, A Baevski, T Likhomanenko, P Tomasello, A Conneau, R Collobert, ... ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 168 | 2021 |
Massively multilingual ASR: 50 languages, 1 model, 1 billion parameters V Pratap, A Sriram, P Tomasello, A Hannun, V Liptchinsky, G Synnaeve, ... arXiv preprint arXiv:2007.03001, 2020 | 136 | 2020 |
Scaling speech technology to 1,000+ languages V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ... Journal of Machine Learning Research 25 (97), 1-52, 2024 | 115 | 2024 |
Rethinking evaluation in ASR: Are our models robust enough? T Likhomanenko, Q Xu, V Pratap, P Tomasello, J Kahn, G Avidov, ... arXiv preprint arXiv:2010.11745, 2020 | 91 | 2020 |
Generative spoken dialogue language modeling TA Nguyen, E Kharitonov, J Copet, Y Adi, WN Hsu, A Elkahky, ... Transactions of the Association for Computational Linguistics 11, 250-266, 2023 | 56 | 2023 |
SeamlessM4T-Massively Multilingual & Multimodal Machine Translation L Barrault, YA Chung, MC Meglioli, D Dale, N Dong, PA Duquenne, ... arXiv preprint arXiv:2308.11596, 2023 | 36 | 2023 |
STOP: A dataset for spoken task oriented semantic parsing P Tomasello, A Shrivastava, D Lazar, PC Hsu, D Le, A Sagar, A Elkahky, ... 2022 IEEE Spoken Language Technology Workshop (SLT), 991-998, 2023 | 26 | 2023 |
Flashlight: Enabling innovation in tools for machine learning JD Kahn, V Pratap, T Likhomanenko, Q Xu, A Hannun, J Cai, P Tomasello, ... International Conference on Machine Learning, 10557-10574, 2022 | 22 | 2022 |
Seamless: Multilingual Expressive and Streaming Speech Translation L Barrault, YA Chung, MC Meglioli, D Dale, N Dong, M Duppenthaler, ... arXiv preprint arXiv:2312.05187, 2023 | 13 | 2023 |
Deliberation model for on-device spoken language understanding D Le, A Shrivastava, P Tomasello, S Kim, A Livshits, O Kalinli, ML Seltzer arXiv preprint arXiv:2204.01893, 2022 | 11 | 2022 |
textless-lib: A library for textless spoken language processing E Kharitonov, J Copet, K Lakhotia, TA Nguyen, P Tomasello, A Lee, ... arXiv preprint arXiv:2202.07359, 2022 | 10 | 2022 |
Speech-to-speech translation for a real-world unwritten language PJ Chen, K Tran, Y Yang, J Du, J Kao, YA Chung, P Tomasello, ... arXiv preprint arXiv:2211.06474, 2022 | 9 | 2022 |
DSCnet: Replicating lidar point clouds with deep sensor cloning P Tomasello, S Sidhu, A Shen, MW Moskewicz, N Redmon, G Joshi, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 8 | 2019 |
Efficient speech representation learning with low-bit quantization CF Yeh, WN Hsu, P Tomasello, A Mohamed arXiv preprint arXiv:2301.00652, 2022 | 6 | 2022 |
Hybrid transducer and attention based encoder-decoder modeling for speech-to-text tasks Y Tang, AY Sun, H Inaguma, X Chen, N Dong, X Ma, PD Tomasello, ... arXiv preprint arXiv:2305.03101, 2023 | 5 | 2023 |
Continual learning for on-device speech recognition using disentangled conformers A Diwan, CF Yeh, WN Hsu, P Tomasello, E Choi, D Harwath, A Mohamed ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 4 | 2023 |
Do Coarser Units Benefit Cluster Prediction-Based Speech Pre-Training? A Elkahky, WN Hsu, P Tomasello, TA Nguyen, R Algayres, Y Adi, J Copet, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 2 | 2023 |
Efficient monotonic multihead attention X Ma, A Sun, S Ouyang, H Inaguma, P Tomasello arXiv preprint arXiv:2312.04515, 2023 | 1 | 2023 |
Generative Spoken Dialogue Language Modeling: preprint version TA Nguyen, E Kharitonov, J Copet, Y Adi, WN Hsu, A Elkahky, ... | 1 | 2022 |
[TACL] Generative Spoken Dialogue Language Modeling TA Nguyen, E Kharitonov, J Copet, Y Adi, WN Hsu, A Elkahky, ... The 61st Annual Meeting of the Association for Computational Linguistics, 2023 | | 2023 |