Handong Li
Title · Cited by · Year
Vast: A vision-audio-subtitle-text omni-modality foundation model and dataset
S Chen, H Li, Q Wang, Z Zhao, M Sun, X Zhu, J Liu
Advances in Neural Information Processing Systems 36, 72842-72866, 2023
Cited by: 97
GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model
X Yang, G Liu, G Feng, D Bu, P Wang, J Jiang, S Chen, Q Yang, H Miao, ...
Cell Research, 1-16, 2024
Cited by: 29
Cosa: Concatenated sample pretrained vision-language foundation model
S Chen, X He, H Li, X Jin, J Feng, J Liu
arXiv preprint arXiv:2306.09085, 2023
Cited by: 7
Explore the Limits of Omni-modal Pretraining at Scale
Y Zhang, H Li, J Liu, X Yue
arXiv preprint arXiv:2406.09412, 2024
Cited by: 2
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner
Z Liu, S Chen, L Guo, H Li, X He, J Liu
Proceedings of the 31st ACM International Conference on Multimedia, 5120-5131, 2023
Cited by: 1
Articles 1–5