A text-to-speech model can preserve a speaker’s emotional tone and acoustic environment. Read more at Ars Technica…