Noam Shazeer
Verified email at google.com
Title
Cited by
Year
Attention is all you need
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
arXiv preprint arXiv:1706.03762, 2017
Cited by: 22392; Year: 2017
Exploring the limits of transfer learning with a unified text-to-text transformer
C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ...
arXiv preprint arXiv:1910.10683, 2019
Cited by: 1256; Year: 2019
Scheduled sampling for sequence prediction with recurrent neural networks
S Bengio, O Vinyals, N Jaitly, N Shazeer
arXiv preprint arXiv:1506.03099, 2015
Cited by: 1254; Year: 2015
Exploring the limits of language modeling
R Jozefowicz, O Vinyals, M Schuster, N Shazeer, Y Wu
arXiv preprint arXiv:1602.02410, 2016
Cited by: 930; Year: 2016
Outrageously large neural networks: The sparsely-gated mixture-of-experts layer
N Shazeer, A Mirhoseini, K Maziarz, A Davis, Q Le, G Hinton, J Dean
arXiv preprint arXiv:1701.06538, 2017
Cited by: 639; Year: 2017
End-to-end text-dependent speaker verification
G Heigold, I Moreno, S Bengio, N Shazeer
2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016
Cited by: 496; Year: 2016
Advances in neural information processing systems
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
Neural Information Processing Systems Foundation, 5998-6008, 2017
Cited by: 424; Year: 2017
Image transformer
N Parmar, A Vaswani, J Uszkoreit, L Kaiser, N Shazeer, A Ku, D Tran
International Conference on Machine Learning, 4055-4064, 2018
Cited by: 403; Year: 2018
Tensor2Tensor for neural machine translation
A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ...
arXiv preprint arXiv:1803.07416, 2018
Cited by: 374; Year: 2018
Generating Wikipedia by summarizing long sequences
PJ Liu, M Saleh, E Pot, B Goodrich, R Sepassi, L Kaiser, N Shazeer
arXiv preprint arXiv:1801.10198, 2018
Cited by: 364; Year: 2018
Serving content-relevant advertisements with client-side device support
D Anderson, P Buchheit, JA Dean, GR Harik, CL Gonsalves, N Shazeer, ...
US Patent 8,086,559, 2011
Cited by: 349; Year: 2011
One model to learn them all
L Kaiser, AN Gomez, N Shazeer, A Vaswani, N Parmar, L Jones, ...
arXiv preprint arXiv:1706.05137, 2017
Cited by: 250; Year: 2017
Music transformer
CZA Huang, A Vaswani, J Uszkoreit, N Shazeer, I Simon, C Hawthorne, ...
arXiv preprint arXiv:1809.04281, 2018
Cited by: 210; Year: 2018
Method and apparatus for characterizing documents based on clusters of related words
G Harik, NM Shazeer
US Patent 7,383,258, 2008
Cited by: 183; Year: 2008
Suggesting and/or providing targeting criteria for advertisements
R Koningstein, V Spitkovsky, GR Harik, N Shazeer
US Patent 8,392,249, 2013
Cited by: 176; Year: 2013
Using concepts for ad targeting
R Koningstein, V Spitkovsky, G Harik, N Shazeer
US Patent App. 10/721,010, 2005
Cited by: 168; Year: 2005
L. u. Kaiser and I. Polosukhin
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez
Advances in Neural Information Processing Systems 30, 5998-6008, 2017
Cited by: 134*; Year: 2017
Adafactor: Adaptive learning rates with sublinear memory cost
N Shazeer, M Stern
International Conference on Machine Learning, 4596-4604, 2018
Cited by: 131; Year: 2018
Mesh-TensorFlow: Deep learning for supercomputers
N Shazeer, Y Cheng, N Parmar, D Tran, A Vaswani, P Koanantakool, ...
arXiv preprint arXiv:1811.02084, 2018
Cited by: 124; Year: 2018
How Much Knowledge Can You Pack Into the Parameters of a Language Model?
A Roberts, C Raffel, N Shazeer
arXiv preprint arXiv:2002.08910, 2020
Cited by: 107; Year: 2020