Ammar Abbas
Amazon Research, Cambridge, UK
Verified email at amazon.co.uk
Title
Cited by
Year
A geometric approach to obtain a bird's eye view from an image
S Ammar Abbas, A Zisserman
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by 74, 2019
CAMP: a two-stage approach to modelling prosody in context
Z Hodari, A Moinet, S Karlapati, J Lorenzo-Trueba, T Merritt, A Joly, ...
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by 32, 2021
Prosodic representation learning and contextual sampling for neural text-to-speech
S Karlapati, A Abbas, Z Hodari, A Moinet, A Joly, P Karanasou, ...
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by 21, 2021
Simple and effective multi-sentence TTS with expressive and coherent prosody
P Makarov, A Abbas, M Łajszczak, A Joly, S Karlapati, A Moinet, ...
arXiv preprint arXiv:2206.14643, 2022
Cited by 15, 2022
CopyCat2: A single model for multi-speaker TTS and many-to-many fine-grained prosody transfer
S Karlapati, P Karanasou, M Lajszczak, A Abbas, A Moinet, P Makarov, ...
arXiv preprint arXiv:2206.13443, 2022
Cited by 13, 2022
Recovering Homography from Camera Captured Documents using Convolutional Neural Networks
SA Abbas, S Hussain
arXiv preprint arXiv:1709.03524, 2017
Cited by 12, 2017
Expressive, variable, and controllable duration modelling in TTS
A Abbas, T Merritt, A Moinet, S Karlapati, E Muszynska, S Slangen, E Gatti, ...
arXiv preprint arXiv:2206.14165, 2022
Cited by 10, 2022
A learned conditional prior for the VAE acoustic space of a TTS system
P Karanasou, S Karlapati, A Moinet, A Joly, A Abbas, S Slangen, ...
arXiv preprint arXiv:2106.10229, 2021
Cited by 9, 2021
BASE TTS: Lessons from building a billion-parameter text-to-speech model on 100K hours of data
M Łajszczak, G Cámbara, Y Li, F Beyhan, A van Korlaar, F Yang, A Joly, ...
arXiv preprint arXiv:2402.08093, 2024
Cited by 8, 2024
eCat: An end-to-end model for multi-speaker TTS & many-to-many fine-grained prosody transfer
A Abbas, S Karlapati, B Schnell, P Karanasou, MG Moya, A Nagaraj, ...
arXiv preprint arXiv:2306.11327, 2023
Cited by 2, 2023
Controllable Emphasis with zero data for text-to-speech
A Joly, M Nicolis, E Peterova, A Lombardi, A Abbas, A van Korlaar, ...
arXiv preprint arXiv:2307.07062, 2023
Cited by 1, 2023
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech
G Zhang, T Merritt, MS Ribeiro, B Tura-Vecino, K Yanagisawa, K Pokora, ...
arXiv preprint arXiv:2307.16679, 2023
2023
Multi-scale spectrogram modelling for neural text-to-speech
A Abbas, B Bollepalli, A Moinet, A Joly, P Karanasou, P Makarov, ...
arXiv preprint arXiv:2106.15649, 2021
2021