CLIP4Clip: An empirical study of CLIP for end to end video clip retrieval and captioning
Huaishao Luo (Southwest Jiaotong University), Tianrui Li (Shanghai Normal University), Yang Chen (Microsoft Research Asia (China)), Lei Ji (University of Chinese Academy of Sciences), Wen Lei (Microsoft Research Asia (China)), Ming Zhong (Microsoft Research Asia (China)), Nan Duan (Microsoft Research Asia (China))
Cited by 676
Related Papers
Predicting citywide crowd flows using deep spatio-temporal residual networks
|Artificial Intelligence|2018|517
Forecasting Fine-Grained Air Quality Based on Big Data
|Unknown|2015|472
Deep Air Quality Forecasting Using Hybrid Deep Learning Framework
|IEEE Transactions on Knowledge and Data Engineering|2019|467
Deep Distributed Fusion Network for Air Quality Prediction
|Unknown|2018|325
Long sequence time-series forecasting with deep learning: A survey
|Information Fusion|2023|306