Pellegrini T., Correia R., Trancoso I., Baptista J., Mamede N.

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

pp 1629



The goal of this work is the automatic selection of materials for a listening comprehension game. We aim to select automatically transcribed sentences from recent broadcast news corpora, so that material for the games can be gathered with little human effort. The recognized words are used as the ground-truth solutions to the exercises, so sentences containing misrecognitions need to be filtered out. Our experiments confirmed the feasibility of the filter chain that automatically selects sentences, although stricter confidence thresholds may be needed. In addition to the correct words, incorrect candidates, namely distractors, are needed to build the exercises. Two techniques for distractor generation are presented, based either on the confusion networks produced by the recognizer or on phonetic distances. The experiments confirmed that the two approaches are complementary.
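
As an illustration of the two ideas summarized above, the following minimal sketch filters sentences by word-level confidence and ranks distractors by phonetic distance. It is not the authors' implementation: the function names, the 0.9 threshold, the toy phone transcriptions, and the use of plain Levenshtein distance over phone symbols are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def keep_sentence(word_confidences: Sequence[float], threshold: float = 0.9) -> bool:
    """Keep a sentence only if every recognized word meets the threshold.

    The threshold value is hypothetical; the abstract only notes that
    stricter thresholds may be needed.
    """
    return len(word_confidences) > 0 and min(word_confidences) >= threshold


def phone_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_distractors(target: str,
                     lexicon: Dict[str, List[str]],
                     n: int = 3) -> List[str]:
    """Return the n words phonetically closest to the target.

    Words at distance 0 (homophones) are skipped, since they could not be
    told apart from the correct answer by listening alone.
    """
    target_phones = lexicon[target]
    scored = []
    for word, phones in lexicon.items():
        if word == target:
            continue
        dist = phone_edit_distance(target_phones, phones)
        if dist > 0:
            scored.append((dist, word))
    scored.sort()  # by distance, then alphabetically
    return [word for _, word in scored[:n]]


# Toy pronunciation lexicon (hypothetical phone symbols).
toy_lexicon = {
    "cat": ["k", "a", "t"],
    "bat": ["b", "a", "t"],
    "cart": ["k", "a", "r", "t"],
    "dog": ["d", "o", "g"],
}
closest = rank_distractors("cat", toy_lexicon, n=2)
```

In this sketch a sentence is rejected as soon as a single word falls below the threshold, which favors precision over the amount of material retained. Distractors drawn from confusion networks would replace `rank_distractors` with a lookup of the recognizer's competing hypotheses for the same time slot.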