| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| A tutorial on Thompson sampling | D Russo, B Van Roy, A Kazerouni, I Osband, Z Wen | Foundations and Trends in Machine Learning 11 (1), 1-96, 2018 | 950 | 2018 |
| Learning to optimize via posterior sampling | D Russo, B Van Roy | Mathematics of Operations Research 39 (4), 1221-1243, 2014 | 683 | 2014 |
| An information-theoretic analysis of Thompson sampling | D Russo, B Van Roy | Journal of Machine Learning Research 17 (1), 2442-2471, 2016 | 390 | 2016 |
| A finite time analysis of temporal difference learning with linear function approximation | J Bhandari, D Russo, R Singal | Operations Research 69 (3), 950-973, 2021 | 323 | 2021 |
| How much does your data exploration overfit? Controlling bias via information usage | D Russo, J Zou | IEEE Transactions on Information Theory, 2019 | 316* | 2019 |
| Learning to optimize via information-directed sampling | D Russo, B Van Roy | Operations Research 66 (1), 230-252, 2018 | 299* | 2018 |
| Deep exploration via randomized value functions | I Osband, B Van Roy, DJ Russo, Z Wen | Journal of Machine Learning Research 20 (124), 1-62, 2019 | 287 | 2019 |
| Simple Bayesian algorithms for best-arm identification | D Russo | Operations Research 68 (6), 1625-1647, 2020 | 256* | 2020 |
| Eluder dimension and the sample complexity of optimistic exploration | D Russo, B Van Roy | Advances in Neural Information Processing Systems 26, 2256-2264, 2013 | 212 | 2013 |
| Global optimality guarantees for policy gradient methods | J Bhandari, D Russo | arXiv preprint arXiv:1906.01786, 2019 | 208 | 2019 |
| Improving the expected improvement algorithm | C Qin, D Klabjan, D Russo | Advances in Neural Information Processing Systems, 5382-5392, 2017 | 131 | 2017 |
| (More) efficient reinforcement learning via posterior sampling | I Osband, D Russo, B Van Roy | Advances in Neural Information Processing Systems 26, 2013 | 87 | 2013 |
| Worst-case regret bounds for exploration via randomized value functions | D Russo | Advances in Neural Information Processing Systems 32, 2019 | 83 | 2019 |
| On the linear convergence of policy gradient methods for finite MDPs | J Bhandari, D Russo | International Conference on Artificial Intelligence and Statistics, 2386-2394, 2021 | 75* | 2021 |
| Satisficing in time-sensitive bandit learning | D Russo, B Van Roy | Mathematics of Operations Research 47 (4), 2815-2839, 2022 | 57* | 2022 |
| Adaptivity and confounding in multi-armed bandit experiments | C Qin, D Russo | arXiv preprint arXiv:2202.09036, 2022 | 26 | 2022 |
| A note on the equivalence of upper confidence bounds and Gittins indices for patient agents | D Russo | Operations Research 69 (1), 273-278, 2021 | 14 | 2021 |
| Policy gradient optimization of Thompson sampling policies | S Min, CC Moallemi, DJ Russo | arXiv preprint arXiv:2006.16507, 2020 | 10 | 2020 |
| On the futility of dynamics in robust mechanism design | SR Balseiro, A Kim, D Russo | Operations Research 69 (6), 1767-1783, 2021 | 9 | 2021 |
| Approximation benefits of policy gradient methods with aggregated states | D Russo | Management Science 69 (11), 6898-6911, 2023 | 6 | 2023 |