Denis Yarats

Supélec

Publishes on Reinforcement Learning in Robotics, Domain Adaptation and Few-Shot Learning, and Topic Modeling. 38 papers and 4.5k citations.

Top publications by citations

Convolutional Sequence to Sequence Learning
Jonas Gehring, Michael Auli, David Grangier et al.|arXiv (Cornell University)|2017
Cited by 1.9k|Open Access

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to recurrent models, computations over all elements can be fully parallelized during training and optimization is easier since the number of non-linearities is fixed and independent of the input length. Our use of gated linear units eases gradient propagation and we equip each decoder layer with a separate attention module. We outperform the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.

Convolutional Sequence to Sequence Learning
Jonas Gehring, Michael Auli, David Grangier et al.|International Conference on Machine Learning|2017
Cited by 1.3k

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels
Ilya Kostrikov, Denis Yarats, Rob Fergus|arXiv (Cornell University)|2020
Cited by 171|Open Access

We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training. The approach leverages input perturbations commonly used in computer vision tasks to regularize the value function. Existing model-free approaches, such as Soft Actor-Critic (SAC), are not able to train deep networks effectively from image pixels. However, the addition of our augmentation method dramatically improves SAC's performance, enabling it to reach state-of-the-art performance on the DeepMind control suite, surpassing model-based (Dreamer, PlaNet, and SLAC) methods and recently proposed contrastive learning (CURL). Our approach can be combined with any model-free reinforcement learning algorithm, requiring only minor modifications. An implementation can be found at https://sites.google.com/view/data-regularized-q.

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels
Ilya Kostrikov, Denis Yarats, Rob Fergus|arXiv (Cornell University)|2020
Cited by 135|Open Access

Improving Sample Efficiency in Model-Free Reinforcement Learning from Images
Denis Yarats, Amy Zhang, Ilya Kostrikov et al.|Proceedings of the AAAI Conference on Artificial Intelligence|2021
Cited by 121|Open Access

Training an agent to solve control tasks directly from high-dimensional images with model-free reinforcement learning (RL) has proven difficult. A promising approach is to learn a latent representation together with the control policy. However, fitting a high-capacity encoder using a scarce reward signal is sample inefficient and leads to poor performance. Prior work has shown that auxiliary losses, such as image reconstruction, can aid efficient representation learning. However, incorporating reconstruction loss into an off-policy learning algorithm often leads to training instability. We explore the underlying reasons and identify variational autoencoders, used by previous investigations, as the cause of the divergence. Following these findings, we propose effective techniques to improve training stability. This results in a simple approach capable of matching state-of-the-art model-free and model-based algorithms on MuJoCo control tasks. Furthermore, our approach demonstrates robustness to observational noise, surpassing existing approaches in this setting. Code, results, and videos are anonymously available at https://sites.google.com/view/sac-ae/home.