Jakob Uszkoreit
Attention is all you need
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
arXiv preprint arXiv:1706.03762, 2017
A decomposable attention model for natural language inference
AP Parikh, O Täckström, D Das, J Uszkoreit
arXiv preprint arXiv:1606.01933, 2016
Self-attention with relative position representations
P Shaw, J Uszkoreit, A Vaswani
arXiv preprint arXiv:1803.02155, 2018
Attention is all you need
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
Neural Information Processing Systems Foundation, 5998-6008, 2017
Natural questions: a benchmark for question answering research
T Kwiatkowski, J Palomaki, O Redfield, M Collins, A Parikh, C Alberti, ...
Transactions of the Association for Computational Linguistics 7, 453-466, 2019
Tensor2tensor for neural machine translation
A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ...
arXiv preprint arXiv:1803.07416, 2018
Image transformer
N Parmar, A Vaswani, J Uszkoreit, L Kaiser, N Shazeer, A Ku, D Tran
International Conference on Machine Learning, 4055-4064, 2018
Universal transformers
M Dehghani, S Gouws, O Vinyals, J Uszkoreit, Ł Kaiser
arXiv preprint arXiv:1807.03819, 2018
One model to learn them all
L Kaiser, AN Gomez, N Shazeer, A Vaswani, N Parmar, L Jones, ...
arXiv preprint arXiv:1706.05137, 2017
Cross-lingual word clusters for direct transfer of linguistic structure
O Täckström, R McDonald, J Uszkoreit
The 2012 Conference of the North American Chapter of the Association for …, 2012
Music transformer
CZA Huang, A Vaswani, J Uszkoreit, N Shazeer, I Simon, C Hawthorne, ...
arXiv preprint arXiv:1809.04281, 2018
An image is worth 16x16 words: Transformers for image recognition at scale
A Dosovitskiy, L Beyer, A Kolesnikov, D Weissenborn, X Zhai, ...
arXiv preprint arXiv:2010.11929, 2020
Large scale parallel document mining for machine translation
J Uszkoreit, J Ponte, A Popat, M Dubiner
Proceedings of the 23rd International Conference on Computational …, 2010
Lattice-based minimum error rate training for statistical machine translation
W Macherey, F Och, I Thayer, J Uszkoreit
Coarse-to-fine question answering for long documents
E Choi, D Hewlett, J Uszkoreit, I Polosukhin, A Lacoste, J Berant
Proceedings of the 55th Annual Meeting of the Association for Computational …, 2017
Distributed word clustering for large scale class-based language modeling in machine translation
J Uszkoreit, T Brants
Proceedings of ACL-08: HLT, 755-762, 2008
Insertion transformer: Flexible sequence generation via insertion operations
M Stern, W Chan, J Kiros, J Uszkoreit
International Conference on Machine Learning, 5976-5985, 2019
Attention is all you need
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, Ł Kaiser, I Polosukhin
Advances in Neural Information Processing Systems 30, 5998-6008, 2017
Inducing sentence structure from parallel corpora for reordering
J DeNero, J Uszkoreit
Proceedings of the 2011 Conference on Empirical Methods in Natural Language …, 2011