Yuan Cao
Title
Cited by
Year
Gradient descent optimizes over-parameterized deep ReLU networks
D Zou, Y Cao, D Zhou, Q Gu
Machine learning 109, 467-492, 2020
Cited by: 722; Year: 2020
Generalization bounds of stochastic gradient descent for wide and deep neural networks
Y Cao, Q Gu
Advances in neural information processing systems 32, 2019
Cited by: 401; Year: 2019
Closing the generalization gap of adaptive gradient methods in training deep neural networks
J Chen, D Zhou, Y Tang, Z Yang, Y Cao, Q Gu
arXiv preprint arXiv:1806.06763, 2018
Cited by: 199; Year: 2018
Towards understanding the spectral bias of deep learning
Y Cao, Z Fang, Y Wu, DX Zhou, Q Gu
arXiv preprint arXiv:1912.01198, 2019
Cited by: 197; Year: 2019
Generalization error bounds of gradient descent for learning over-parameterized deep ReLU networks
Y Cao, Q Gu
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3349-3356, 2020
Cited by: 186*; Year: 2020
On the convergence of adaptive gradient methods for nonconvex optimization
D Zhou, J Chen, Y Cao, Y Tang, Z Yang, Q Gu
arXiv preprint arXiv:1808.05671, 2018
Cited by: 184; Year: 2018
How much over-parameterization is sufficient to learn deep ReLU networks?
Z Chen, Y Cao, D Zou, Q Gu
arXiv preprint arXiv:1911.12360, 2019
Cited by: 132; Year: 2019
A generalized neural tangent kernel analysis for two-layer neural networks
Z Chen, Y Cao, Q Gu, T Zhang
Advances in Neural Information Processing Systems 33, 13363-13373, 2020
Cited by: 93*; Year: 2020
Benign overfitting in two-layer convolutional neural networks
Y Cao, Z Chen, M Belkin, Q Gu
Advances in neural information processing systems 35, 25237-25250, 2022
Cited by: 85; Year: 2022
Agnostic learning of a single neuron with gradient descent
S Frei, Y Cao, Q Gu
Advances in Neural Information Processing Systems 33, 5417-5428, 2020
Cited by: 61; Year: 2020
Risk bounds for over-parameterized maximum margin classification on sub-gaussian mixtures
Y Cao, Q Gu, M Belkin
Advances in Neural Information Processing Systems 34, 8407-8418, 2021
Cited by: 55; Year: 2021
Understanding the generalization of Adam in learning neural networks with proper regularization
D Zou, Y Cao, Y Li, Q Gu
arXiv preprint arXiv:2108.11371, 2021
Cited by: 40; Year: 2021
Algorithm-dependent generalization bounds for overparameterized deep residual networks
S Frei, Y Cao, Q Gu
Advances in neural information processing systems 32, 2019
Cited by: 37; Year: 2019
Local and global inference for high dimensional nonparanormal graphical models
Q Gu, Y Cao, Y Ning, H Liu
arXiv preprint arXiv:1502.02347, 2015
Cited by: 36*; Year: 2015
Tight sample complexity of learning one-hidden-layer convolutional neural networks
Y Cao, Q Gu
Advances in Neural Information Processing Systems 32, 2019
Cited by: 23; Year: 2019
Provable generalization of SGD-trained neural networks of any width in the presence of adversarial label noise
S Frei, Y Cao, Q Gu
International Conference on Machine Learning, 3427-3438, 2021
Cited by: 20; Year: 2021
Agnostic learning of halfspaces with gradient descent via soft margins
S Frei, Y Cao, Q Gu
International Conference on Machine Learning, 3417-3426, 2021
Cited by: 19; Year: 2021
Online machine learning modeling and predictive control of nonlinear systems with scheduled mode transitions
C Hu, Y Cao, Z Wu
AIChE Journal 69 (2), e17882, 2023
Cited by: 18; Year: 2023
The benefits of mixup for feature learning
D Zou, Y Cao, Y Li, Q Gu
International Conference on Machine Learning, 43423-43479, 2023
Cited by: 17; Year: 2023
High-temperature structure detection in ferromagnets
Y Cao, M Neykov, H Liu
Information and Inference: A Journal of the IMA 11 (1), 55-102, 2022
Cited by: 10; Year: 2022