Huang Jiawei
Title
Cited by
Year
Minimax weight and q-function learning for off-policy evaluation
M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 9659-9668, 2019
Cited by: 99; Year: 2019
WeightNet: Revisiting the design space of weight networks
N Ma, X Zhang, J Huang, J Sun
European Conference on Computer Vision, 776-792, 2020
Cited by: 36; Year: 2020
Minimax value interval for off-policy evaluation and policy optimization
N Jiang, J Huang
Advances in Neural Information Processing Systems 33, 2747-2758, 2020
Cited by: 34; Year: 2020
From Importance Sampling to Doubly Robust Policy Gradient
J Huang, N Jiang
International Conference on Machine Learning, 4434-4443, 2019
Cited by: 13; Year: 2019
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
J Huang, J Chen, L Zhao, T Qin, N Jiang, TY Liu
International Conference on Learning Representations, 2021
Cited by: 5; Year: 2021
A minimax learning approach to off-policy evaluation in confounded Partially Observable Markov Decision Processes
C Shi, M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 2022
Cited by: 4*; Year: 2022
On the convergence rate of off-policy policy optimization methods with density-ratio correction
J Huang, N Jiang
International Conference on Artificial Intelligence and Statistics, 2658-2705, 2022
Cited by: 3; Year: 2022
Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret
J Huang, L Zhao, T Qin, W Chen, N Jiang, TY Liu
arXiv preprint arXiv:2205.12418, 2022
Year: 2022
Articles 1–8