| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| FastSpeech 2: Fast and high-quality end-to-end text to speech | Y Ren*, C Hu*, X Tan, T Qin, S Zhao, Z Zhao, TY Liu | ICLR 2021, 2020 | 1170 | 2020 |
| ViP3D: End-to-end visual trajectory prediction via 3d agent queries | J Gu*, C Hu*, T Zhang, X Chen, Y Wang, Y Wang, H Zhao | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 46 | 2023 |
| ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory | C Hu*, J Fu*, C Du, S Luo, J Zhao, H Zhao | arXiv preprint arXiv:2306.03901, 2023 | 42 | 2023 |
| Neural Dubber: Dubbing for Videos According to Scripts | C Hu, Q Tian, T Li, Y Wang, Y Wang, H Zhao | Advances in Neural Information Processing Systems 34, 2021 | 19 | 2021 |
| Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech | Z Liu*, Q Tian*, C Hu*, X Liu, M Wu, Y Wang, H Zhao, Y Wang | arXiv preprint arXiv:2207.06088, 2022 | 9 | 2022 |
| CVC: Contrastive Learning for Non-parallel Voice Conversion | T Li*, Y Liu*, C Hu*, H Zhao | INTERSPEECH 2021, 2020 | 9 | 2020 |
| Diff-Foley: Synchronized video-to-audio synthesis with latent diffusion models | S Luo, C Yan, C Hu, H Zhao | Advances in Neural Information Processing Systems 36, 2024 | 8 | 2024 |
| DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models | X Tian, J Gu, B Li, Y Liu, C Hu, Y Wang, K Zhan, P Jia, X Lang, H Zhao | arXiv preprint arXiv:2402.12289, 2024 | 3 | 2024 |
| DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech—A Study between English and Mandarin | T Li, C Hu, J Cong, X Zhu, J Li, Q Tian, Y Wang, L Xie | IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | 3 | 2023 |