
Anumanchipalli G.K., Oliveira L.C., Black A.W.

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

pp 6890



This paper presents an 'Accent Group' based intonation model for statistical parametric speech synthesis. We propose an approach to automatically model phonetic realizations of fundamental frequency (F0) contours as a sequence of intonational events anchored to a group of syllables (an Accent Group). We train a speaker-specific accent grouping model using a stochastic context-free grammar and contextual decision trees on the syllables. This model is used to 'parse' unseen text into its constituent accent groups, over each of which appropriate intonation is predicted. The performance of the model is shown objectively and subjectively on a variety of prosodically diverse tasks: read speech, news broadcast, and audiobooks.
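
To make the idea of 'parsing' a syllable sequence into accent groups concrete, here is a minimal sketch. It is not the paper's method: the stochastic context-free grammar is replaced by a simple dynamic program over segmentations, and the per-syllable boundary probabilities stand in for what a contextual decision tree might output. The function name `segment`, both probability tables, and all numeric values are assumptions made for illustration only.

```python
import math

EPS = 1e-9  # keeps log() finite when a probability is exactly 0 or 1


def _log(p):
    """Log of a probability clamped to (EPS, 1 - EPS)."""
    return math.log(min(max(p, EPS), 1.0 - EPS))


def segment(boundary_prob, length_prob):
    """Return the most probable split of syllables into contiguous groups.

    boundary_prob[i]: hypothetical probability that a group ends after
                      syllable i (e.g. from a decision tree on context).
    length_prob[k]:   hypothetical probability that a group spans k syllables.
    Result is a list of (start, end) index pairs, end exclusive.
    """
    n = len(boundary_prob)
    best = [-math.inf] * (n + 1)  # best[j]: best log score for first j syllables
    back = [0] * (n + 1)          # back[j]: start index of the last group
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            k = j - i
            if length_prob.get(k, 0.0) <= 0.0 or best[i] == -math.inf:
                continue
            score = best[i] + _log(length_prob[k])
            # syllables i .. j-2 lie inside the group (no boundary after them)
            score += sum(_log(1.0 - boundary_prob[t]) for t in range(i, j - 1))
            # syllable j-1 closes the group
            score += _log(boundary_prob[j - 1])
            if score > best[j]:
                best[j], back[j] = score, i
    if n > 0 and best[n] == -math.inf:
        raise ValueError("no segmentation is possible with these group lengths")
    groups, j = [], n
    while j > 0:
        groups.append((back[j], j))
        j = back[j]
    return groups[::-1]


# Illustrative values only: five syllables, groups of one to three syllables.
example = segment([0.1, 0.9, 0.1, 0.1, 1.0], {1: 0.2, 2: 0.4, 3: 0.4})
```

In a full system of the kind the abstract describes, an F0 contour would then be predicted for each returned group; that step is omitted here because the abstract does not specify how the contours are parameterized.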