Thomas Leung

Large-Scale Video Classification with Convolutional Neural Networks

Andrej Karpathy, George Toderici, Sanketh Shetty et al.|Unknown|2014

Cited by 6.3k

Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 1 million YouTube videos belonging to 487 classes. We study multiple approaches for extending the connectivity of a CNN in time domain to take advantage of local spatio-temporal information and suggest a multiresolution, foveated architecture as a promising way of speeding up the training. Our best spatio-temporal networks display significant performance improvements compared to strong feature-based baselines (55.3% to 63.9%), but only a surprisingly modest improvement compared to single-frame models (59.3% to 60.9%). We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63.3% up from 43.9%).

Texture synthesis by non-parametric sampling

Alexei A. Efros, Thomas Leung|Unknown|1999

Cited by 3k

A non-parametric method for texture synthesis is proposed. The texture synthesis process grows a new image outward from an initial seed, one pixel at a time. A Markov random field model is assumed, and the conditional distribution of a pixel given all its neighbors synthesized so far is estimated by querying the sample image and finding all similar neighborhoods. The degree of randomness is controlled by a single perceptually intuitive parameter. The method aims at preserving as much local structure as possible and produces good results for a wide variety of synthetic and real-world textures.
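The procedure in this abstract can be illustrated with a minimal pure-Python sketch. It is a simplified toy, not the authors' implementation: the function name `synthesize`, the parameters `half` (neighborhood half-width) and `eps` (match tolerance), the 3x3 seed patch, and the unweighted sum of squared differences are all assumptions made for illustration.

```python
import random

def synthesize(sample, out_h, out_w, half=1, eps=0.1, seed=0):
    """Grow an out_h x out_w texture from `sample` (a list of rows of numbers).

    Assumes out_h, out_w >= 3 and a sample of at least 3 x 3 and
    (2 * half + 1) x (2 * half + 1) pixels.
    """
    rng = random.Random(seed)
    sh, sw = len(sample), len(sample[0])
    out = [[None] * out_w for _ in range(out_h)]

    # Seed: copy a random 3x3 patch of the sample near the output centre.
    sy, sx = rng.randrange(sh - 2), rng.randrange(sw - 2)
    cy, cx = out_h // 2 - 1, out_w // 2 - 1
    for dy in range(3):
        for dx in range(3):
            out[cy + dy][cx + dx] = sample[sy + dy][sx + dx]

    offsets = [(dy, dx) for dy in range(-half, half + 1)
               for dx in range(-half, half + 1) if (dy, dx) != (0, 0)]

    def known(y, x):
        """Already-synthesized neighbours of (y, x) as (dy, dx, value)."""
        found = []
        for dy, dx in offsets:
            ny, nx = y + dy, x + dx
            if 0 <= ny < out_h and 0 <= nx < out_w and out[ny][nx] is not None:
                found.append((dy, dx, out[ny][nx]))
        return found

    unfilled = {(y, x) for y in range(out_h) for x in range(out_w)
                if out[y][x] is None}
    while unfilled:
        # Grow outward: fill the pixel with the most known neighbours next.
        y, x = max(sorted(unfilled), key=lambda p: len(known(*p)))
        nbrs = known(y, x)
        # Score each interior sample pixel by mean squared difference,
        # computed over the known neighbours only.
        scores = []
        for py in range(half, sh - half):
            for px in range(half, sw - half):
                ssd = sum((sample[py + dy][px + dx] - v) ** 2
                          for dy, dx, v in nbrs)
                scores.append((ssd / len(nbrs), py, px))
        best = min(s for s, _, _ in scores)
        # Sample uniformly among all neighbourhoods close to the best match.
        candidates = [(py, px) for s, py, px in scores if s <= best * (1 + eps)]
        py, px = rng.choice(candidates)
        out[y][x] = sample[py][px]
        unfilled.remove((y, x))
    return out
```

Choosing at random among all near-best matches, rather than always taking the single best one, is what approximates sampling from the conditional distribution of a pixel given its neighbors. The sketch omits refinements such as distance weighting within the window and is far too slow for real images.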

Representing and Recognizing the Visual Appearance of Materials using Three-dimensional Textons

Thomas Leung, Jitendra Malik|International Journal of Computer Vision|2001

Cited by 1.5k

Large-Scale Video Classification with Convolutional Neural Networks

Andrej Karpathy, George Toderici, Sanketh Shetty et al.|Unknown|2014

Cited by 1.3k

Learning Fine-Grained Image Similarity with Deep Ranking

Jiang Wang, Yang Song, Thomas Leung et al.|Unknown|2014

Cited by 1.2k

Learning fine-grained image similarity is a challenging task. It needs to capture between-class and within-class image differences. This paper proposes a deep ranking model that employs deep learning techniques to learn similarity metric directly from images. It has higher learning capability than models based on hand-crafted features. A novel multiscale network structure has been developed to describe the images effectively. An efficient triplet sampling algorithm is also proposed to learn the model with distributed asynchronized stochastic gradient. Extensive experiments show that the proposed algorithm outperforms models based on hand-crafted visual features and deep classification models.
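A ranking model of this kind is commonly trained on triplets of (query, positive, negative) images with a hinge loss over embedding distances. The sketch below shows that general idea in plain Python. The function names, the `margin` default, and the squared Euclidean distance are assumptions for illustration, and `sample_triplets` is a naive uniform sampler standing in for the efficient sampling algorithm the abstract mentions, which it does not reproduce.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_hinge_loss(query, positive, negative, margin=1.0):
    """Zero once the positive is closer to the query than the negative
    by at least `margin`; grows linearly with the violation otherwise."""
    return max(0.0, margin + sq_dist(query, positive) - sq_dist(query, negative))

def sample_triplets(labels, n, seed=0):
    """Naive uniform sampler (hypothetical stand-in): returns n index
    triplets (query, positive, negative) drawn from class labels.
    Assumes at least two classes and one class with two or more items."""
    rng = random.Random(seed)
    by_label = {}
    for i, lab in enumerate(labels):
        by_label.setdefault(lab, []).append(i)
    usable = [lab for lab, idx in by_label.items() if len(idx) >= 2]
    triplets = []
    for _ in range(n):
        lab = rng.choice(usable)
        q, p = rng.sample(by_label[lab], 2)  # two distinct same-class items
        neg = rng.choice([i for i, l in enumerate(labels) if l != lab])
        triplets.append((q, p, neg))
    return triplets
```

For example, `triplet_hinge_loss([0, 0], [0, 1], [3, 4])` is zero because the negative is already much farther from the query than the positive, while swapping the positive and negative produces a positive loss that a gradient step would reduce.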


Top publications by citations