Deterministic policy gradient algorithms
David Silver (Google DeepMind (United Kingdom)), Martin Riedmiller (California Institute of Technology), Thomas Degris, Guy Lever, Daan Wierstra (Google DeepMind (United Kingdom)), Nicolas Heess (Google DeepMind (United Kingdom))
HAL (Le Centre pour la Communication Scientifique Directe)
January 1, 2014
Cited by 1,742
Related Papers
Human-level control through deep reinforcement learning | Nature | 2015 | 29.9k
Playing Atari with Deep Reinforcement Learning | arXiv (Cornell University) | 2013 | 5.1k
A direct adaptive method for faster backpropagation learning: the RPROP algorithm | IEEE International Conference on Neural Networks | 2002 | 3.9k
Striving for Simplicity: The All Convolutional Net | arXiv (Cornell University) | 2014 | 2.6k
Neural Fitted Q Iteration – First Experiences with a Data Efficient Neural Reinforcement Learning Method | Lecture notes in computer science | 2005 | 770