VidTr: Video Transformer Without Convolutions | Y Zhang, X Li, C Liu, B Shuai, Y Zhu, B Brattoli, H Chen, I Marsic, J Tighe | 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 13557-13567, 2021 | 202* | 2021 |
A comprehensive study of deep video action recognition | Y Zhu, X Li, C Liu, M Zolfaghari, Y Xiong, C Wu, Z Zhang, J Tighe, ... | arXiv preprint arXiv:2012.06567, 2020 | 185 | 2020 |
Deep Learning for RFID-Based Activity Recognition | X Li, Y Zhang, I Marsic, A Sarcevic, RS Burd | The 14th ACM Conference on Embedded Networked Sensor Systems (SenSys 2016), 2016 | 155 | 2016 |
Multimodal affective analysis using hierarchical attention strategy with word-level alignment | Y Gu, K Yang, S Fu, S Chen, X Li, I Marsic | Proceedings of the conference. Association for Computational Linguistics …, 2018 | 149 | 2018 |
SiamMOT: Siamese Multi-Object Tracking | B Shuai, A Berneshawi, X Li, D Modolo, J Tighe | arXiv preprint arXiv:2105.11595, 2021 | 139 | 2021 |
Long short-term transformer for online action detection | M Xu, Y Xiong, H Chen, X Li, W Xia, Z Tu, S Soatto | Advances in Neural Information Processing Systems 34, 1086-1099, 2021 | 93 | 2021 |
Multi-stream network with temporal attention for environmental sound classification | X Li, V Chebiyyam, K Kirchhoff | arXiv preprint arXiv:1901.08608, 2019 | 82 | 2019 |
TubeR: Tubelet transformer for video action detection | J Zhao, Y Zhang, X Li, H Chen, S Bing, M Xu, C Liu, K Kundu, Y Xiong, ... | arXiv preprint arXiv:2104.00969, 2021 | 71 | 2021 |
Hybrid attention based multimodal network for spoken language classification | Y Gu, K Yang, S Fu, S Chen, X Li, I Marsic | Proceedings of the conference. Association for Computational Linguistics …, 2018 | 66 | 2018 |
Speech intention classification with multimodal deep learning | Y Gu, X Li, S Chen, J Zhang, I Marsic | Advances in Artificial Intelligence: 30th Canadian Conference on Artificial …, 2017 | 62 | 2017 |
Video contrastive learning with global context | H Kuang, Y Zhu, Z Zhang, X Li, J Tighe, S Schwertfeger, C Stachniss, M Li | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 58 | 2021 |
Concurrent activity recognition with multimodal CNN-LSTM structure | X Li, Y Zhang, J Zhang, S Chen, I Marsic, RA Farneth, RS Burd | arXiv preprint arXiv:1702.01638, 2017 | 52 | 2017 |
Directional temporal modeling for action recognition | X Li, B Shuai, J Tighe | Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 49 | 2020 |
Human conversation analysis using attentive multimodal networks with hierarchical encoder-decoder | Y Gu, X Li, K Huang, S Fu, K Yang, S Chen, M Zhou, I Marsic | Proceedings of the 26th ACM international conference on Multimedia, 537-545, 2018 | 40 | 2018 |
Activity Recognition for Medical Teamwork Based on Passive RFID | X Li, D Yao, X Pan, J Johannaman, JW Yang, R Webman, A Sarcevic, ... | 2016 IEEE International Conference on RFID (RFID), 1-9, 2016 | 37 | 2016 |
What to look at and where: Semantic and spatial refined transformer for detecting human-object interactions | ASM Iftekhar, H Chen, K Kundu, X Li, J Tighe, D Modolo | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 34 | 2022 |
Mutual correlation attentive factors in dyadic fusion networks for speech emotion recognition | Y Gu, X Lyu, W Sun, W Li, S Chen, X Li, I Marsic | Proceedings of the 27th ACM International Conference on Multimedia, 157-166, 2019 | 33 | 2019 |
Deep neural network for RFID-based activity recognition | X Li, Y Zhang, M Li, I Marsic, JW Yang, RS Burd | Proceedings of the Eighth Wireless of the Students, by the Students, and for …, 2016 | 27 | 2016 |
Region-based Activity Recognition Using Conditional GAN | X Li, Y Zhang, J Zhang, Y Chen, H Li, I Marsic, RS Burd | Proceedings of the 2017 ACM on Multimedia Conference, 1059-1067, 2017 | 25 | 2017 |
Speech Audio Super-Resolution for Speech Recognition | X Li, V Chebiyyam, K Kirchhoff, AI Amazon | INTERSPEECH, 3416-3420, 2019 | 24 | 2019 |