Zhi-Jie Yan
iDST, Alibaba Inc.
Verified email at alibaba-inc.com
Title | Cited by | Year
I-Vector Based Clustering Training Data in Speech Recognition
Q Huo, ZJ Yan, Y Zhang, J Xu
US Patent App. 13/640,804, 2015
Cited by: 237 | Year: 2015
Qwen-Audio: Advancing universal audio understanding via unified large-scale audio-language models
Y Chu, J Xu, X Zhou, Q Yang, S Zhang, Z Yan, C Zhou, J Zhou
arXiv preprint arXiv:2311.07919, 2023
Cited by: 163 | Year: 2023
Deep-FSMN for large vocabulary continuous speech recognition
S Zhang, M Lei, Z Yan, L Dai
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by: 130 | Year: 2018
M2MeT: The ICASSP 2022 multi-channel multi-party meeting transcription challenge
F Yu, S Zhang, Y Fu, L Xie, S Zheng, Z Du, W Huang, P Guo, Z Yan, B Ma, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 92 | Year: 2022
Paraformer: Fast and accurate parallel transformer for non-autoregressive end-to-end speech recognition
Z Gao, S Zhang, I McLoughlin, Z Yan
arXiv preprint arXiv:2206.08317, 2022
Cited by: 83 | Year: 2022
A unified trajectory tiling approach to high quality speech rendering
Y Qian, FK Soong, ZJ Yan
IEEE Transactions on Audio, Speech, and Language Processing 21 (2), 280-290, 2012
Cited by: 66 | Year: 2012
A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
ZJ Yan, Q Huo, J Xu
Proc. Interspeech 2013, 104-108, 2013
Cited by: 64 | Year: 2013
Improving latency-controlled BLSTM acoustic models for online speech recognition
S Xue, Z Yan
2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017
Cited by: 63 | Year: 2017
A context-sensitive-chunk BPTT approach to training deep LSTM/BLSTM recurrent neural networks for offline handwriting recognition
K Chen, ZJ Yan, Q Huo
2015 13th International Conference on Document Analysis and Recognition …, 2015
Cited by: 44 | Year: 2015
LauraGPT: Listen, attend, understand, and regenerate audio with GPT
Z Du, J Wang, Q Chen, Y Chu, Z Gao, Z Li, K Hu, X Zhou, J Xu, Z Ma, ...
arXiv preprint arXiv:2310.04673, 2023
Cited by: 40 | Year: 2023
ProsoSpeech: Enhancing prosody with quantized vector pre-training in text-to-speech
Y Ren, M Lei, Z Huang, S Zhang, Q Chen, Z Yan, Z Zhao
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 40 | Year: 2022
Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition
S Zhang, M Lei, Z Yan
Interspeech, 2180-2184, 2019
Cited by: 39 | Year: 2019
Rich-context unit selection (RUS) approach to high quality TTS
ZJ Yan, Y Qian, FK Soong
2010 IEEE International Conference on Acoustics, Speech and Signal …, 2010
Cited by: 39 | Year: 2010
Rich context modeling for high quality HMM-based TTS
ZJ Yan, Y Qian, FK Soong
Tenth Annual Conference of the International Speech Communication Association, 2009
Cited by: 36 | Year: 2009
Improved modeling for F0 generation and V/U decision in HMM-based TTS
Q Zhang, F Soong, Y Qian, Z Yan, J Pan, Y Yan
2010 IEEE International Conference on Acoustics, Speech and Signal …, 2010
Cited by: 34 | Year: 2010
Trajectory Tiling Approach for Text-to-Speech
Y Qian, ZJ Yan, YJ Wu, FKP Soong
US Patent App. 12/962,543, 2012
Cited by: 33 | Year: 2012
Method and apparatus for initiating an operation using voice data
XU Minqiang, Z Yan, J Gao, M Chu
US Patent App. 15/292,632, 2017
Cited by: 31 | Year: 2017
Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge
F Yu, S Zhang, P Guo, Y Fu, Z Du, S Zheng, W Huang, L Xie, ZH Tan, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 30 | Year: 2022
Streaming chunk-aware multihead attention for online end-to-end speech recognition
S Zhang, Z Gao, H Luo, M Lei, J Gao, Z Yan, L Xie
arXiv preprint arXiv:2006.01712, 2020
Cited by: 30 | Year: 2020
CosyVoice: A scalable multilingual zero-shot text-to-speech synthesizer based on supervised semantic tokens
Z Du, Q Chen, S Zhang, K Hu, H Lu, Y Yang, H Hu, S Zheng, Y Gu, Z Ma, ...
arXiv preprint arXiv:2407.05407, 2024
Cited by: 29 | Year: 2024