Matthew D. Zeiler

Visualizing and Understanding Convolutional Networks

Matthew D. Zeiler, Rob Fergus|Lecture notes in computer science|2014

Cited by 15.4k|Open Access

ADADELTA: An Adaptive Learning Rate Method

Matthew D. Zeiler|arXiv (Cornell University)|2012

Cited by 5.5k|Open Access

We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first order information and has minimal computational overhead beyond vanilla stochastic gradient descent. The method requires no manual tuning of a learning rate and appears robust to noisy gradient information, different model architecture choices, various data modalities and selection of hyperparameters. We show promising results compared to other methods on the MNIST digit classification task using a single machine and on a large scale voice dataset in a distributed cluster environment.
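
The abstract does not spell out the update rule, so the following is only a minimal sketch of the per-dimension update as it is commonly stated for ADADELTA: a decaying average of squared gradients and a decaying average of squared updates, whose ratio replaces a hand-tuned learning rate. The function name `adadelta_step`, the list-based parameter layout, and the defaults `rho=0.95` and `eps=1e-6` are assumptions made for illustration, not details taken from the text above.

```python
import math


def adadelta_step(x, grad, acc_grad, acc_delta, rho=0.95, eps=1e-6):
    """Apply one ADADELTA-style update and return the new parameters.

    x, grad   : lists of floats (parameters and their gradients)
    acc_grad  : running average of squared gradients (updated in place)
    acc_delta : running average of squared updates (updated in place)
    rho, eps  : decay rate and stabilizer (assumed defaults)
    """
    new_x = []
    for i, g in enumerate(grad):
        # Decaying average of squared gradients for this dimension.
        acc_grad[i] = rho * acc_grad[i] + (1.0 - rho) * g * g
        # Step = -(RMS of past updates / RMS of gradients) * gradient.
        delta = -math.sqrt(acc_delta[i] + eps) / math.sqrt(acc_grad[i] + eps) * g
        # Decaying average of squared updates for this dimension.
        acc_delta[i] = rho * acc_delta[i] + (1.0 - rho) * delta * delta
        new_x.append(x[i] + delta)
    return new_x


def quadratic_loss(x, coeffs):
    """Toy objective f(x) = sum(c_i * x_i^2), used only for the demo."""
    return sum(c * xi * xi for c, xi in zip(coeffs, x))


def run_demo(steps=50):
    """Run a few updates on the toy objective; returns (start, end) loss."""
    coeffs = [1.0, 10.0]
    x = [3.0, -2.0]
    acc_grad = [0.0, 0.0]
    acc_delta = [0.0, 0.0]
    start = quadratic_loss(x, coeffs)
    for _ in range(steps):
        grad = [2.0 * c * xi for c, xi in zip(coeffs, x)]
        x = adadelta_step(x, grad, acc_grad, acc_delta)
    return start, quadratic_loss(x, coeffs)
```

Note that no learning rate appears anywhere in the sketch: the step size in each dimension is set by the two running averages alone, which is the property the abstract emphasizes.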

Regularization of Neural Networks using DropConnect

Li Wan, Matthew D. Zeiler, Sixin Zhang et al.|International Conference on Machine Learning|2013

Cited by 1.9k

We introduce DropConnect, a generalization of Dropout (Hinton et al., 2012), for regularizing large fully-connected layers within neural networks. When training with Dropout, a randomly selected subset of activations are set to zero within each layer. DropConnect instead sets a randomly selected subset of weights within the network to zero. Each unit thus receives input from a random subset of units in the previous layer. We derive a bound on the generalization performance of both Dropout and DropConnect. We then evaluate DropConnect on a range of datasets, comparing to Dropout, and show state-of-the-art results on several image recognition benchmarks by aggregating multiple DropConnect-trained models.
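
The difference between the two masking schemes described above can be sketched on a single fully-connected layer. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the `keep_prob` parameter, and the nested-list weight layout are hypothetical, and the sketch omits the nonlinearity, bias terms, and any inference-time averaging.

```python
import random


def dense(weights, inputs):
    """Plain linear layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def dropout_layer(weights, inputs, keep_prob, rng):
    """Dropout-style masking: zero a random subset of output activations."""
    outputs = dense(weights, inputs)
    return [o if rng.random() < keep_prob else 0.0 for o in outputs]


def dropconnect_layer(weights, inputs, keep_prob, rng):
    """DropConnect-style masking: zero a random subset of individual weights,
    so each output unit sees a random subset of the inputs."""
    masked = [
        [w if rng.random() < keep_prob else 0.0 for w in row]
        for row in weights
    ]
    return dense(masked, inputs)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same masks
    W = [[1.0, 2.0], [3.0, 4.0]]
    x = [1.0, 1.0]
    print(dropout_layer(W, x, 0.5, rng))
    print(dropconnect_layer(W, x, 0.5, rng))
```

With Dropout each output is either kept whole or zeroed; with DropConnect an output can take intermediate values because only some of its incoming weights are removed.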

Deconvolutional networks

Matthew D. Zeiler, Dilip Krishnan, Graham W. Taylor et al.|Unknown|2010

Cited by 1.7k

Building robust low and mid-level image representations, beyond edge primitives, is a long-standing goal in vision. Many existing feature detectors spatially pool edge information, which destroys cues such as edge intersections, parallelism and symmetry. We present a learning framework where features that capture these mid-level cues spontaneously emerge from image data. Our approach is based on the convolutional decomposition of images under a sparsity constraint and is totally unsupervised. By building a hierarchy of such decompositions we can learn rich feature sets that are a robust image representation for both the analysis and synthesis of images.

Adaptive deconvolutional networks for mid and high level feature learning

Matthew D. Zeiler, Graham W. Taylor, Rob Fergus|Unknown|2011

Cited by 1.3k

We present a hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling. When trained on natural images, the layers of our model capture image information in a variety of forms: low-level edges, mid-level edge junctions, high-level object parts and complete objects. To build our model we rely on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches. This makes it possible to learn multiple layers of representation and we show models with 4 layers, trained on images from the Caltech-101 and 256 datasets. When combined with a standard classifier, features extracted from these models outperform SIFT, as well as representations from other feature learning methods.

Top publications by citations