Masatoshi Uehara
Double reinforcement learning for efficient off-policy evaluation in Markov decision processes
N Kallus, M Uehara
Journal of Machine Learning Research 21 (167), 2020
Cited by 105 · 2020
Minimax weight and Q-function learning for off-policy evaluation
M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 9659-9668, 2020
Cited by 99 · 2020
Generative adversarial nets from a density ratio estimation perspective
M Uehara, I Sato, M Suzuki, K Nakayama, Y Matsuo
arXiv preprint arXiv:1610.02920, 2016
Cited by 77 · 2016
Efficiently breaking the curse of horizon in off-policy evaluation with double reinforcement learning
N Kallus, M Uehara
Operations Research, 2022
Cited by 61* · 2022
Intrinsically efficient, stable, and bounded off-policy evaluation for reinforcement learning
N Kallus, M Uehara
Advances in Neural Information Processing Systems 32, 2019
Cited by 36 · 2019
Pessimistic model-based offline reinforcement learning under partial coverage
M Uehara, W Sun
arXiv preprint arXiv:2107.06226, 2021
Cited by 30* · 2021
Representation learning for online and offline RL in low-rank MDPs
M Uehara, X Zhang, W Sun
arXiv preprint arXiv:2110.04652, 2021
Cited by 25 · 2021
Off-policy evaluation and learning for external validity under a covariate shift
M Uehara, M Kato, S Yasui
Advances in Neural Information Processing Systems 33, 49-61, 2020
Cited by 24* · 2020
Causal inference under unmeasured confounding with negative controls: A minimax learning approach
N Kallus, X Mao, M Uehara
arXiv preprint arXiv:2103.14029, 2021
Cited by 23 · 2021
Finite sample analysis of minimax offline reinforcement learning: Completeness, fast rates and first-order efficiency
M Uehara, M Imaizumi, N Jiang, N Kallus, W Sun, T Xie
arXiv preprint arXiv:2102.02981, 2021
Cited by 21 · 2021
Statistically efficient off-policy policy gradients
N Kallus, M Uehara
Proceedings of the 37th International Conference on Machine Learning, 5089-5100, 2020
Cited by 19 · 2020
Localized debiased machine learning: Efficient inference on quantile treatment effects and beyond
N Kallus, X Mao, M Uehara
arXiv preprint arXiv:1912.12945, 2019
Cited by 15* · 2019
Optimal off-policy evaluation from multiple logging policies
N Kallus, Y Saito, M Uehara
International Conference on Machine Learning, 5247-5256, 2021
Cited by 12 · 2021
Mitigating covariate shift in imitation learning via offline data with partial coverage
J Chang, M Uehara, D Sreenivas, R Kidambi, W Sun
Advances in Neural Information Processing Systems 34, 965-979, 2021
Cited by 10 · 2021
A unified statistically efficient estimation framework for unnormalized models
M Uehara, T Kanamori, T Takenouchi, T Matsuda
International Conference on Artificial Intelligence and Statistics, 809-819, 2020
Cited by 10* · 2020
Analysis of noise contrastive estimation from the perspective of asymptotic variance
M Uehara, T Matsuda, F Komaki
arXiv preprint arXiv:1808.07983, 2018
Cited by 9 · 2018
Fast rates for the regret of offline reinforcement learning
Y Hu, N Kallus, M Uehara
arXiv preprint arXiv:2102.00479, 2021
Cited by 7 · 2021
Imputation estimators for unnormalized models with missing data
M Uehara, T Matsuda, JK Kim
International Conference on Artificial Intelligence and Statistics, 831-841, 2020
Cited by 6 · 2020
Doubly robust off-policy value and gradient estimation for deterministic policies
N Kallus, M Uehara
Advances in Neural Information Processing Systems 33, 2020
Cited by 6 · 2020
A minimax learning approach to off-policy evaluation in confounded partially observable Markov decision processes
C Shi, M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 20057-20094, 2022
Cited by 4* · 2022
Articles 1–20