Shi-Xiong (Austin) Zhang

Cited by

	All	Since 2019
Citations	2546	2170
h-index	28	26
i10-index	49	44

Citations per year
Year	Citations
2009	7
2010	7
2011	11
2012	18
2013	15
2014	50
2015	44
2016	55
2017	62
2018	95
2019	136
2020	238
2021	376
2022	540
2023	650
2024	224

Public access

6 articles available, 1 article not available (based on funding mandates)

Co-authors

Meng YU	Tencent AI Lab	Verified email at tencent.com
Yong Xu	Principal Researcher, Tencent America, Bellevue, USA	Verified email at tencent.com
Dong Yu (俞栋)	Distinguished Scientist @ Tencent AI Lab, ACM/IEEE/ISCA Fellow	Verified email at global.tencent.com
Rongzhi Gu	Tencent AI Lab	Verified email at pku.edu.cn
Yifan Gong	Principal Science Manager, Microsoft Corp.	Verified email at microsoft.com
Mark Gales	Cambridge University	Verified email at eng.cam.ac.uk
Jinyu Li	Partner Applied Science Manager, Microsoft	Verified email at microsoft.com
Shinji Watanabe	Carnegie Mellon University	Verified email at cmu.edu
M.W. Mak	The Hong Kong Polytechnic University	Verified email at polyu.edu.hk
Xunying Liu	Chinese University of Hong Kong	Verified email at se.cuhk.edu.hk
Yong Zhao	Microsoft Corporation	Verified email at microsoft.com
Kaisheng Yao	Google	Verified email at google.com
Fahimeh Bahmaninezhad	Microsoft	Verified email at microsoft.com
Jianwei Yu	Tencent AI lab	Verified email at tencent.com
Kate Knill	University of Cambridge	Verified email at eng.cam.ac.uk
Philip Woodland	Professor of Information Engineering, Cambridge University Engineering Department	Verified email at eng.cam.ac.uk
Yajie Miao	Carnegie Mellon University	Verified email at cs.cmu.edu
Rui Zhao	Microsoft	Verified email at microsoft.com
Rogier van Dalen	Samsung AI Center	Verified email at samsung.com

Shi-Xiong (Austin) Zhang

Other names: Shi-Xiong Zhang, Shixiong Zhang

Sr. Director | AI Foundations@Capital One | ex-Microsoft, ex-Tencent, Cambridge PhD

Verified email at capitalone.com

Multi-modal Foundation Models ASR Speech Processing NLP


Title	Cited by	Year
An overview of deep-learning-based audio-visual speech enhancement and separation D Michelsanti, ZH Tan, SX Zhang, Y Xu, M Yu, D Yu, J Jensen IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 1368-1396, 2021	225	2021
End-to-end attention based text-dependent speaker verification SX Zhang, Z Chen, Y Zhao, J Li, Y Gong 2016 IEEE Spoken Language Technology Workshop (SLT), 171-178, 2016	202	2016
Time Domain Audio Visual Speech Separation J Wu, Y Xu, SX Zhang, LW Chen, M Yu, L Xie, D Yu Automatic Speech Recognition and Understanding Workshop, ASRU 2019, 2019	117	2019
ADL-MVDR: All deep learning MVDR beamformer for target speech separation Z Zhang, Y Xu, M Yu, SX Zhang, L Chen, D Yu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	114	2021
Investigation of Multilingual Deep Neural Networks for Spoken Term Detection K Knill, MJF Gales, S Rath, P Woodland, SX Zhang ASRU, 2013	102	2013
Multi-modal multi-channel target speech separation R Gu, SX Zhang, Y Xu, L Chen, Y Zou, D Yu IEEE Journal of Selected Topics in Signal Processing 14 (3), 530-541, 2020	99	2020
Computerized intelligent assistant for conferences A Diamant, KM Ben-Dor, E Krupka, R Halaly, Y Smolin, I Gurvich, ... US Patent 10,867,610, 2020	98	2020
Simplifying Long Short-Term Memory Acoustic Models for Fast Training and Decoding Y Miao, J Li, Y Wang, S Zhang, Y Gong ICASSP, 2016	98	2016
Audio-visual Recognition of Overlapped speech for the LRS2 dataset J Yu, SX Zhang, J Wu, S Ghorbani, B Wu, S Kang, S Liu, X Liu, H Meng, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020	93	2020
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information R Gu, L Chen, SX Zhang, J Zheng, Y Xu, M Yu, D Su, Y Zou, D Yu	91	2019
A comprehensive study of speech separation: spectrogram vs waveform separation F Bahmaninezhad, J Wu, R Gu, SX Zhang, Y Xu, M Yu, D Yu arXiv preprint arXiv:1905.07497, 2019	88	2019
End-to-end multi-channel speech separation R Gu, J Wu, SX Zhang, L Chen, Y Xu, M Yu, D Su, Y Zou, D Yu arXiv preprint arXiv:1905.06286, 2019	85	2019
Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning R Gu, SX Zhang, L Chen, Y Xu, M Yu, D Su, Y Zou, D Yu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020	62	2020
New era for robust speech recognition: exploiting deep learning S Watanabe, M Delcroix, F Metze, JR Hershey, et al. Springer, 2017	61*	2017
Audio-visual speech separation and dereverberation with a two-stage multimodal network K Tan, Y Xu, SX Zhang, M Yu, D Yu IEEE Journal of Selected Topics in Signal Processing 14 (3), 542-553, 2020	51	2020
Structured SVMs for automatic speech recognition SX Zhang, MJF Gales IEEE Transactions on Audio, Speech, and Language Processing 21 (3), 544-555, 2012	50	2012
Deep Neural Support Vector Machines for Speech Recognition SX Zhang, C Liu, K Yao, Y Gong ICASSP 2015, 2015	48	2015
Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives AS Subramanian, C Weng, M Yu, SX Zhang, Y Xu, S Watanabe, D Yu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020	43	2020
FAST-RIR: Fast neural diffuse room impulse response generator A Ratnarajah, SX Zhang, M Yu, Z Tang, D Manocha, D Yu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	39	2022
Neural Spatio-Temporal Beamformer for Target Speech Separation Y Xu, M Yu, SX Zhang, L Chen, C Weng, J Liu, D Yu arXiv preprint arXiv:2005.03889, 2020	38	2020
