Xinxin Zhu 朱欣鑫
Institute of Automation of the Chinese Academy of Sciences (CASIA)
Verified email at nlpr.ia.ac.cn
Title
Cited by
Year
Normalized and geometry-aware self-attention network for image captioning
L Guo, J Liu, X Zhu, P Yao, S Lu, H Lu
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by: 251; Year: 2020
Cptr: Full transformer network for image captioning
W Liu, S Chen, L Guo, X Zhu, J Liu
arXiv preprint arXiv:2101.10804, 2021
Cited by: 208; Year: 2021
Captioning transformer with stacked attention modules
X Zhu, L Li, J Liu, H Peng, X Niu
Applied Sciences 8 (5), 739, 2018
Cited by: 115; Year: 2018
Vast: A vision-audio-subtitle-text omni-modality foundation model and dataset
S Chen, H Li, Q Wang, Z Zhao, M Sun, X Zhu, J Liu
Advances in Neural Information Processing Systems 36, 72842-72866, 2023
Cited by: 88; Year: 2023
Valor: Vision-audio-language omni-perception pretraining model and dataset
S Chen, X He, L Guo, X Zhu, W Wang, J Tang, J Liu
arXiv preprint arXiv:2304.08345, 2023
Cited by: 84; Year: 2023
Image captioning with triple-attention and stack parallel LSTM
X Zhu, L Li, J Liu, Z Li, H Peng, X Niu
Neurocomputing 319, 55-65, 2018
Cited by: 62; Year: 2018
Non-autoregressive image captioning with counterfactuals-critical multi-agent learning
L Guo, J Liu, X Zhu, X He, J Jiang, H Lu
arXiv preprint arXiv:2005.04690, 2020
Cited by: 56; Year: 2020
Chatbridge: Bridging modalities with large language model as a language catalyst
Z Zhao, L Guo, T Yue, S Chen, S Shao, X Zhu, Z Yuan, J Liu
arXiv preprint arXiv:2305.16103, 2023
Cited by: 45; Year: 2023
OPT: Omni-perception pre-trainer for cross-modal understanding and generation
J Liu, X Zhu, F Liu, L Guo, Z Zhao, M Sun, W Wang, H Lu, S Zhou, J Zhang, ...
arXiv preprint arXiv:2107.00249, 2021
Cited by: 44; Year: 2021
Global-local propagation network for RGB-D semantic segmentation
S Chen, X Zhu, W Liu, X He, J Liu
arXiv preprint arXiv:2101.10801, 2021
Cited by: 24; Year: 2021
AutoCaption: Image captioning with neural architecture search
X Zhu, W Wang, L Guo, J Liu
arXiv preprint arXiv:2012.09742, 2020
Cited by: 18; Year: 2020
Global-guided selective context network for scene parsing
J Jiang, J Liu, J Fu, X Zhu, Z Li, H Lu
IEEE Transactions on Neural Networks and Learning Systems 33 (4), 1752-1764, 2020
Cited by: 14; Year: 2020
MOSO: Decomposing motion, scene and object for video prediction
M Sun, W Wang, X Zhu, J Liu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 12; Year: 2023
Sounding video generator: A unified framework for text-guided sounding video generation
J Liu, W Wang, S Chen, X Zhu, J Liu
IEEE Transactions on Multimedia 26, 141-153, 2023
Cited by: 9; Year: 2023
Fast sequence generation with multi-agent reinforcement learning
L Guo, J Liu, X Zhu, H Lu
arXiv preprint arXiv:2101.09698, 2021
Cited by: 9; Year: 2021
Dual hierarchical temporal convolutional network with QA-aware dynamic normalization for video story question answering
F Liu, J Liu, X Zhu, R Hong, H Lu
Proceedings of the 28th ACM International Conference on Multimedia, 4253-4261, 2020
Cited by: 8; Year: 2020
Mm21 pre-training for video understanding challenge: Video captioning with pretraining techniques
S Chen, X Zhu, D Hao, W Liu, J Liu, Z Zhao, L Guo, J Liu
Proceedings of the 29th ACM International Conference on Multimedia, 4853-4857, 2021
Cited by: 7; Year: 2021
Dynamic warping network for semantic video segmentation
J Li, Y Zhao, X He, X Zhu, J Liu
Complexity 2021 (1), 6680509, 2021
Cited by: 7; Year: 2021
Image captioning with word gate and adaptive self-critical learning
X Zhu, L Li, J Liu, L Guo, Z Fang, H Peng, X Niu
Applied Sciences 8 (6), 909, 2018
Cited by: 7; Year: 2018
Cptr: Full transformer network for image captioning (2021)
W Liu, S Chen, L Guo, X Zhu, J Liu
arXiv preprint arXiv:2101.10804, 2021
Cited by: 6