Huang Jiawei
Verified email at inf.ethz.ch - Homepage
Title | Cited by | Year
Minimax weight and q-function learning for off-policy evaluation
M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 9659-9668, 2019
Cited by: 174 | Year: 2019
Weightnet: Revisiting the design space of weight networks
N Ma, X Zhang, J Huang, J Sun
European Conference on Computer Vision, 776-792, 2020
Cited by: 94 | Year: 2020
Minimax value interval for off-policy evaluation and policy optimization
N Jiang, J Huang
Advances in Neural Information Processing Systems 33, 2747-2758, 2020
Cited by: 74 | Year: 2020
A minimax learning approach to off-policy evaluation in confounded Partially Observable Markov Decision Processes
C Shi, M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 2022
Cited by: 30* | Year: 2022
From Importance Sampling to Doubly Robust Policy Gradient
J Huang, N Jiang
International Conference on Machine Learning, 4434-4443, 2019
Cited by: 26 | Year: 2019
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
J Huang, J Chen, L Zhao, T Qin, N Jiang, TY Liu
International Conference on Learning Representations, 2022
Cited by: 24 | Year: 2022
On the convergence rate of off-policy policy optimization methods with density-ratio correction
J Huang, N Jiang
International Conference on Artificial Intelligence and Statistics, 2658-2705, 2022
Cited by: 10* | Year: 2022
On the Statistical Efficiency of Mean Field Reinforcement Learning with General Function Approximation
J Huang, B Yardim, N He
arXiv preprint arXiv:2305.11283, 2023
Cited by: 2 | Year: 2023
Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret
J Huang, L Zhao, T Qin, W Chen, N Jiang, TY Liu
Advances in Neural Information Processing Systems 35, 2022
Cited by: 2 | Year: 2022
Robust Knowledge Transfer in Tiered Reinforcement Learning
J Huang, N He
Advances in Neural Information Processing Systems 36, 2024
Year: 2024
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
J Huang, N He, A Krause
arXiv preprint arXiv:2402.05724, 2024
Year: 2024