1. Rainbow: Combining improvements in deep reinforcement learning. M Hessel, J Modayil, H van Hasselt, T Schaul, G Ostrovski, W Dabney, et al. Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018. Cited by 2507.
2. A distributional perspective on reinforcement learning. MG Bellemare*, W Dabney*, R Munos. arXiv preprint arXiv:1707.06887, 2017. Cited by 1660.
3. Distributional reinforcement learning with quantile regression. W Dabney, M Rowland, MG Bellemare, R Munos. Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018. Cited by 753.
4. Distributed distributional deterministic policy gradients. G Barth-Maron, MW Hoffman, D Budden, W Dabney, D Horgan, D Tb, et al. arXiv preprint arXiv:1804.08617, 2018. Cited by 600.
5. Successor features for transfer in reinforcement learning. A Barreto, W Dabney, R Munos, JJ Hunt, T Schaul, HP van Hasselt, et al. Advances in Neural Information Processing Systems 30, 2017. Cited by 585.
6. Implicit quantile networks for distributional reinforcement learning. W Dabney, G Ostrovski, D Silver, R Munos. International Conference on Machine Learning, 1096-1105, 2018. Cited by 551.
7. Recurrent experience replay in distributed reinforcement learning. S Kapturowski, G Ostrovski, J Quan, R Munos, W Dabney. International Conference on Learning Representations, 2018. Cited by 512.
8. A distributional code for value in dopamine-based reinforcement learning. W Dabney, Z Kurth-Nelson, N Uchida, CK Starkweather, D Hassabis, et al. Nature 577 (7792), 671-675, 2020. Cited by 406.
9. The Cramér distance as a solution to biased Wasserstein gradients. MG Bellemare, I Danihelka, W Dabney, S Mohamed, et al. arXiv preprint arXiv:1705.10743, 2017. Cited by 405.
10. Revisiting fundamentals of experience replay. W Fedus, P Ramachandran, R Agarwal, Y Bengio, H Larochelle, et al. International Conference on Machine Learning, 3061-3071, 2020. Cited by 260.
11. Deep reinforcement learning and its neuroscientific implications. M Botvinick, JX Wang, W Dabney, KJ Miller, Z Kurth-Nelson. Neuron 107 (4), 603-616, 2020. Cited by 175.
12. Fast task inference with variational intrinsic successor features. S Hansen, W Dabney, A Barreto, T Van de Wiele, D Warde-Farley, V Mnih. arXiv preprint arXiv:1906.05030, 2019. Cited by 148.
13. An analysis of categorical distributional reinforcement learning. M Rowland, MG Bellemare, W Dabney, R Munos, YW Teh. International Conference on Artificial Intelligence and Statistics, 29-37, 2018. Cited by 125.
14. Distributional reinforcement learning. MG Bellemare, W Dabney, M Rowland. MIT Press, 2023. Cited by 103.
15. The Reactor: A fast and sample-efficient actor-critic agent for reinforcement learning. A Gruslys, W Dabney, MG Azar, B Piot, MG Bellemare, R Munos. arXiv preprint arXiv:1704.04651, 2017. Cited by 98.
16. A geometric perspective on optimal representations for reinforcement learning. MG Bellemare, W Dabney, R Dadashi, A Ali Taiga, PS Castro, N Le Roux, et al. Advances in Neural Information Processing Systems 32, 2019. Cited by 95.
17. Statistics and samples in distributional reinforcement learning. M Rowland, R Dadashi, S Kumar, R Munos, MG Bellemare, W Dabney. International Conference on Machine Learning, 5528-5536, 2019. Cited by 91.
18. Hindsight credit assignment. A Harutyunyan, W Dabney, T Mesnard, M Gheshlaghi Azar, B Piot, et al. Advances in Neural Information Processing Systems 32, 2019. Cited by 89.
19. Temporally-extended ε-greedy exploration. W Dabney, G Ostrovski, A Barreto. arXiv preprint arXiv:2006.01782, 2020. Cited by 88.
20. RLPy: A value-function-based reinforcement learning framework for education and research. A Geramifard, C Dann, RH Klein, W Dabney, JP How. Journal of Machine Learning Research 16 (1), 1573-1578, 2015. Cited by 87.