S

Srinadh Bhojanapalli

Google (United States)

ORCID: 0000-0002-4147-2106

Publishes on Sparse and Compressive Sensing Techniques, Topic Modeling, Advanced Neural Network Applications. 68 papers and 3.3k citations.

68Publications
3.3kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

Exploring Generalization in Deep Learning
Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester et al.|Neural Information Processing Systems|2017
Cited by 489

With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and robustness. We study how these measures can ensure generalization, highlighting the importance of scale normalization, and making a connection between sharpness and PAC-Bayes theory. We then investigate how well the measures explain different observed phenomena.

Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Gläsner et al.|2021 IEEE/CVF International Conference on Computer Vision (ICCV)|2021
Cited by 393

Deep Convolutional Neural Networks (CNNs) have long been the architecture of choice for computer vision tasks. Recently, Transformer-based architectures like Vision Transformer (ViT) have matched or even surpassed ResNets for image classification. However, details of the Transformer architecture –such as the use of non-overlapping patches– lead one to wonder whether these networks are as robust. In this paper, we perform an extensive study of a variety of different measures of robustness of ViT models and compare the findings to ResNet baselines. We investigate robustness to input perturbations as well as robustness to model perturbations. We find that when pre-trained with a sufficient amount of data, ViT models are at least as robust as the ResNet counterparts on a broad range of perturbations. We also find that Transformers are robust to the removal of almost any single layer, and that while activations from later layers are highly correlated with each other, they nevertheless play an important role in classification.

Exploring Generalization in Deep Learning
Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester et al.|arXiv (Cornell University)|2017
Cited by 296Open Access

With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and robustness. We study how these measures can ensure generalization, highlighting the importance of scale normalization, and making a connection between sharpness and PAC-Bayes theory. We then investigate how well the measures explain different observed phenomena.

Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks
Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli et al.|arXiv (Cornell University)|2018
Cited by 215Open Access

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities resulting in a tighter generalization bound for two layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes, and could potentially explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.