Pellegrini T., Correia R., Trancoso I., Baptista J., Mamede N., Eskenazi M.

Computer Speech and Language

pp 1127



Spoken European Portuguese (EP) is known to be difficult for L2 learners to understand, owing to phenomena such as strong vowel reduction. In this paper, we present a method to automatically generate exercises aimed at improving listening comprehension skills in EP. Learners identify the words pronounced in real speech utterances. The exercises introduce two innovative aspects: using broadcast news videos as curriculum material, and automatically generating exercises from material updated on a daily basis. The videos are automatically transcribed by a speech recognition engine. A filtering chain, used to select appropriate sentences, was validated by a first survey comprising both manually and automatically selected sentences. Both sets were assigned good to very good subjective quality scores. A second survey concerned the features of the exercise interface. Subjects with varying self-reported exposure to Portuguese as a second language tested several interfaces and functionalities and highlighted their preferred features. The results confirmed that the greatest difficulty was the fast speech rate. All participants valued slowed-down audio and video documents, though this feature was used more often by the lowest-proficiency subjects. The exercises were integrated into a Web platform where they are automatically updated daily. Though further evaluation is needed to determine whether the platform affords skill acquisition, it is expected to be particularly valuable for distance learners who need opportunities to access authentic audio documents in EP.