Roman Novak
OpenAI
Verified email at polytechnique.edu
Title
Cited by
Year
Deep neural networks as Gaussian processes
J Lee, Y Bahri, R Novak, SS Schoenholz, J Pennington, J Sohl-Dickstein
arXiv preprint arXiv:1711.00165, 2017
Cited by: 1260; Year: 2017
Wide neural networks of any depth evolve as linear models under gradient descent
J Lee, L Xiao, S Schoenholz, Y Bahri, R Novak, J Sohl-Dickstein, ...
Advances in Neural Information Processing Systems 32, 2019
Cited by: 1074; Year: 2019
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
arXiv preprint arXiv:2206.04615, 2022
Cited by: 971; Year: 2022
Sensitivity and generalization in neural networks: an empirical study
R Novak, Y Bahri, DA Abolafia, J Pennington, J Sohl-Dickstein
arXiv preprint arXiv:1802.08760, 2018
Cited by: 481; Year: 2018
Bayesian deep convolutional networks with many channels are Gaussian processes
R Novak, L Xiao, J Lee, Y Bahri, G Yang, J Hron, DA Abolafia, ...
arXiv preprint arXiv:1810.05148, 2018
Cited by: 368; Year: 2018
Neural tangents: Fast and easy infinite neural networks in Python
R Novak, L Xiao, J Hron, J Lee, AA Alemi, J Sohl-Dickstein, ...
arXiv preprint arXiv:1912.02803, 2019
Cited by: 252; Year: 2019
Dataset distillation with infinitely wide convolutional networks
T Nguyen, R Novak, L Xiao, J Lee
Advances in Neural Information Processing Systems 34, 5186-5198, 2021
Cited by: 218; Year: 2021
Finite versus infinite neural networks: an empirical study
J Lee, S Schoenholz, J Pennington, B Adlam, L Xiao, R Novak, ...
Advances in Neural Information Processing Systems 33, 15156-15172, 2020
Cited by: 206; Year: 2020
Infinite attention: NNGP and NTK for deep attention networks
J Hron, Y Bahri, J Sohl-Dickstein, R Novak
International Conference on Machine Learning, 4376-4386, 2020
Cited by: 126; Year: 2020
Fast finite width neural tangent kernel
R Novak, J Sohl-Dickstein, SS Schoenholz
International Conference on Machine Learning, 17018-17044, 2022
Cited by: 58; Year: 2022
On the infinite width limit of neural networks with a standard parameterization
J Sohl-Dickstein, R Novak, SS Schoenholz, J Lee
arXiv preprint arXiv:2001.07301, 2020
Cited by: 54; Year: 2020
Beyond human data: Scaling self-training for problem-solving with language models
A Singh, JD Co-Reyes, R Agarwal, A Anand, P Patil, PJ Liu, J Harrison, ...
arXiv preprint arXiv:2312.06585, 2023
Cited by: 41; Year: 2023
Exploring the neural algorithm of artistic style
Y Nikulin, R Novak
arXiv preprint arXiv:1602.07188, 2016
Cited by: 37; Year: 2016
Improving the neural algorithm of artistic style
R Novak, Y Nikulin
arXiv preprint arXiv:1605.04603, 2016
Cited by: 35; Year: 2016
Exact posterior distributions of wide Bayesian neural networks
J Hron, Y Bahri, R Novak, J Pennington, J Sohl-Dickstein
arXiv preprint arXiv:2006.10541, 2020
Cited by: 31; Year: 2020
Iterative refinement for machine translation
R Novak, M Auli, D Grangier
arXiv preprint arXiv:1610.06602, 2016
Cited by: 30; Year: 2016
Small-scale proxies for large-scale transformer training instabilities
M Wortsman, PJ Liu, L Xiao, K Everett, A Alemi, B Adlam, JD Co-Reyes, ...
arXiv preprint arXiv:2309.14322, 2023
Cited by: 26; Year: 2023
Fast neural kernel embeddings for general activations
I Han, A Zandieh, J Lee, R Novak, L Xiao, A Karbasi
Advances in Neural Information Processing Systems 35, 35657-35671, 2022
Cited by: 14; Year: 2022
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
J Hron, R Novak, J Pennington, J Sohl-Dickstein
International Conference on Machine Learning, 8926-8945, 2022
Cited by: 7; Year: 2022
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
J Hron, L Culp, G Elsayed, R Liu, B Adlam, M Bileschi, B Bohnet, ...
arXiv preprint arXiv:2408.07852, 2024
Year: 2024