| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| The power of depth for feedforward neural networks | R Eldan, O Shamir | Conference on Learning Theory, 907-940 | 887 | 2016 |
| Learnability, stability and uniform convergence | S Shalev-Shwartz, O Shamir, N Srebro, K Sridharan | Journal of Machine Learning Research 11, 2635-2670 | 810* | 2010 |
| Making gradient descent optimal for strongly convex stochastic optimization | A Rakhlin, O Shamir, K Sridharan | arXiv preprint arXiv:1109.5647 | 744 | 2011 |
| Optimal distributed online prediction using mini-batches | O Dekel, R Gilad-Bachrach, O Shamir, L Xiao | Journal of Machine Learning Research 13 (1) | 742 | 2012 |
| Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes | O Shamir, T Zhang | International Conference on Machine Learning, 71-79 | 608 | 2013 |
| Communication-efficient distributed optimization using an approximate Newton-type method | O Shamir, N Srebro, T Zhang | International Conference on Machine Learning, 1000-1008 | 585 | 2014 |
| On the computational efficiency of training neural networks | R Livni, S Shalev-Shwartz, O Shamir | Advances in Neural Information Processing Systems 27 | 554 | 2014 |
| Size-independent sample complexity of neural networks | N Golowich, A Rakhlin, O Shamir | Conference on Learning Theory, 297-299 | 513 | 2018 |
| Better mini-batch algorithms via accelerated gradient methods | A Cotter, O Shamir, N Srebro, K Sridharan | Advances in Neural Information Processing Systems 24 | 372 | 2011 |
| Adaptively learning the crowd kernel | O Tamuz, C Liu, S Belongie, O Shamir, AT Kalai | arXiv preprint arXiv:1105.1033 | 306 | 2011 |
| Nonstochastic multi-armed bandits with graph-structured feedback | N Alon, N Cesa-Bianchi, C Gentile, S Mannor, Y Mansour, O Shamir | SIAM Journal on Computing 46 (6), 1785-1826 | 284* | 2017 |
| Spurious local minima are common in two-layer ReLU neural networks | I Safran, O Shamir | International Conference on Machine Learning, 4433-4441 | 273 | 2018 |
| Proving the lottery ticket hypothesis: Pruning is all you need | E Malach, G Yehudai, S Shalev-Shwartz, O Shamir | International Conference on Machine Learning, 6682-6691 | 233 | 2020 |
| Learning and generalization with the information bottleneck | O Shamir, S Sabato, N Tishby | Theoretical Computer Science 411 (29-30), 2696-2711 | 229 | 2010 |
| An optimal algorithm for bandit and zero-order convex optimization with two-point feedback | O Shamir | Journal of Machine Learning Research 18 (1), 1703-1713 | 227 | 2017 |
| Is local SGD better than minibatch SGD? | B Woodworth, KK Patel, S Stich, Z Dai, B Bullins, B McMahan, O Shamir, ... | International Conference on Machine Learning, 10334-10343 | 219 | 2020 |
| Depth-width tradeoffs in approximating natural functions with neural networks | I Safran, O Shamir | International Conference on Machine Learning, 2979-2987 | 211* | 2017 |
| Communication complexity of distributed convex learning and optimization | Y Arjevani, O Shamir | Advances in Neural Information Processing Systems 28 | 206 | 2015 |
| Learning to classify with missing and corrupted features | O Dekel, O Shamir | Proceedings of the 25th International Conference on Machine Learning, 216-223 | 205 | 2008 |
| On the complexity of bandit and derivative-free stochastic convex optimization | O Shamir | Conference on Learning Theory, 3-24 | 202 | 2013 |