UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

Huaishao Luo (Southwest Jiaotong University), Ming Zhou, Botian Shi, Taroon Bharti, Tianrui Li (Shanghai Normal University), Lei Ji, Haoyang Huang, Jason Li, Nan Duan (Microsoft Research Asia (China))
arXiv (Cornell University)
February 15, 2020
Cited by 169

