Zili Huang
Title
Cited by
Year
SUPERB: Speech processing Universal PERformance Benchmark
S Yang, PH Chi, YS Chuang, CIJ Lai, K Lakhotia, YY Lin, AT Liu, J Shi, ...
arXiv preprint arXiv:2105.01051, 2021
Cited by: 747; Year: 2021
Angular Softmax for Short-Duration Text-independent Speaker Verification.
Z Huang, S Wang, K Yu
Interspeech, 3623-3627, 2018
Cited by: 116; Year: 2018
Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis
D Raj, P Denisov, Z Chen, H Erdogan, Z Huang, M He, S Watanabe, J Du, ...
2021 IEEE Spoken Language Technology Workshop (SLT), 897-904, 2021
Cited by: 86; Year: 2021
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
HS Tsai, HJ Chang, WC Huang, Z Huang, K Lakhotia, S Yang, S Dong, ...
arXiv preprint arXiv:2203.06849, 2022
Cited by: 83; Year: 2022
DOVER-Lap: A method for combining overlap-aware diarization outputs
D Raj, LP Garcia-Perera, Z Huang, S Watanabe, D Povey, A Stolcke, ...
2021 IEEE Spoken Language Technology Workshop (SLT), 881-888, 2021
Cited by: 72; Year: 2021
Speaker diarization with region proposal network
Z Huang, S Watanabe, Y Fujita, P García, Y Shao, D Povey, S Khudanpur
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 71; Year: 2020
Investigating self-supervised learning for speech enhancement and separation
Z Huang, S Watanabe, S Yang, P García, S Khudanpur
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 55; Year: 2022
The Hitachi-JHU DIHARD III system: Competitive end-to-end neural diarization and x-vector clustering systems combined by DOVER-Lap
S Horiguchi, N Yalta, P Garcia, Y Takashima, Y Xue, D Raj, Z Huang, ...
arXiv preprint arXiv:2102.01363, 2021
Cited by: 39; Year: 2021
Recover missing sensor data with iterative imputing network
J Zhou, Z Huang
Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence, 2018
Cited by: 39; Year: 2018
Discriminative neural embedding learning for short-duration text-independent speaker verification
S Wang, Z Huang, Y Qian, K Yu
IEEE/ACM Transactions on Audio, Speech, and Language Processing 27 (11 …, 2019
Cited by: 37; Year: 2019
Multi-class spectral clustering with overlaps for speaker diarization
D Raj, Z Huang, S Khudanpur
2021 IEEE Spoken Language Technology Workshop (SLT), 582-589, 2021
Cited by: 34; Year: 2021
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
T Feng, A Dong, CF Yeh, S Yang, TQ Lin, J Shi, KW Chang, Z Huang, ...
2022 IEEE Spoken Language Technology Workshop (SLT), 1096-1103, 2023
Cited by: 31; Year: 2023
Target-speaker Voice Activity Detection with Improved I-Vector Estimation for Unknown Number of Speaker
M He, D Raj, Z Huang, J Du, Z Chen, S Watanabe
arXiv preprint arXiv:2108.03342, 2021
Cited by: 31; Year: 2021
Joint i-vector with end-to-end system for short duration text-independent speaker verification
Z Huang, S Wang, Y Qian
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by: 20; Year: 2018
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings
Z Huang, D Raj, P García, S Khudanpur
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 15; Year: 2023
Joint speaker diarization and speech recognition based on region proposal networks
Z Huang, M Delcroix, LP Garcia, S Watanabe, D Raj, S Khudanpur
Computer Speech & Language 72, 101316, 2022
Cited by: 6; Year: 2022
JHU Diarization System Description.
Z Huang, LP García-Perera, J Villalba, D Povey, N Dehak
IberSPEECH, 236-239, 2018
Cited by: 6; Year: 2018
UniX-Encoder: A Universal X-Channel Speech Encoder for AD-HOC Microphone Array Speech Processing
Z Huang, Y Shao, SX Zhang, D Yu
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 2; Year: 2024
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
Z Huang, Z Chen, N Kanda, J Wu, Y Wang, J Li, T Yoshioka, X Wang, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 2; Year: 2023
A Large-Scale Evaluation of Speech Foundation Models
S Yang, HJ Chang, Z Huang, AT Liu, CI Lai, H Wu, J Shi, X Chang, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 1; Year: 2024