Isabel Trancoso, CMU Portugal Faculty member at Instituto Superior Técnico/INESC ID, was interviewed by the “90 Segundos de Ciência” Podcast on Antena 1 about CMU Portugal’s Exploratory Research Project “Privacy in speaker diarization” (Privadia). The project’s main mission is to develop a speech recognition system that ensures the privacy of the speaker’s data.
The growing number of machine-learning-as-a-service applications has raised awareness of their potential to compromise users’ privacy, as shown by the intense debate around the GDPR. Among other data types, speech can reveal a large amount of information that goes far beyond its linguistic content.
“Speech contains a lot of information about the speaker, not only his identity but also his gender, his age group, his emotional state, and above all, several diseases that can affect speech”, says Isabel Trancoso. This implies that one should regard speech as “Personally Identifiable Information”.
Current machine learning models can remotely transcribe speech recordings, identify speakers, and perform “diarization”, often described as the problem of determining “who spoke when” in a conversation. However, research on privacy in speech processing remains limited, and that is where the Privadia project comes in.
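To make the “who spoke when” idea concrete, the following is a minimal sketch of diarization as greedy clustering of per-segment speaker embeddings. It is not the Privadia project's method: the function names (`cosine`, `diarize`), the similarity threshold, and the two-dimensional toy embeddings are all assumptions chosen for illustration, and real systems use much higher-dimensional embeddings produced by a neural network.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def diarize(segments, threshold=0.8):
    """Label each (start, end, embedding) segment with a speaker.

    Hypothetical greedy scheme: a segment joins the most similar known
    speaker if the similarity reaches `threshold`; otherwise it starts
    a new speaker. The threshold value is an illustrative assumption.
    """
    references = []  # one reference embedding per discovered speaker
    timeline = []
    for start, end, emb in segments:
        best, best_sim = None, threshold
        for idx, ref in enumerate(references):
            sim = cosine(emb, ref)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            references.append(emb)
            best = len(references) - 1
        timeline.append((start, end, f"speaker_{best}"))
    return timeline

# Toy input: three segments with made-up 2-D embeddings (times in seconds).
toy_segments = [
    (0.0, 2.5, [0.9, 0.1]),
    (2.5, 4.0, [0.1, 0.95]),
    (4.0, 6.0, [0.85, 0.2]),
]
timeline = diarize(toy_segments)
for start, end, speaker in timeline:
    print(f"{start:>4.1f}s to {end:>4.1f}s: {speaker}")
```

In this toy run the first and third segments are attributed to the same speaker and the second to a different one. The privacy concern raised in the article follows directly: the embeddings that make this grouping possible also characterize the speaker.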
“The fact that data can now be extracted implies possible transgressions of the speaker’s privacy, and it also implies that we may be able to modify what the speaker said and make him say things that he never did, the so-called deep fakes applied to speech. We plan with this project to develop technologies that prevent all these misuses of speech”, says the researcher.
The main challenge will be combining state-of-the-art speaker representations, known as embeddings, with cryptographic techniques. The project also explores alternative approaches to privacy based on deep learning speech manipulation techniques.
Privadia is a CMU Portugal Exploratory Research Project developed in partnership with INESC ID, Instituto Superior Técnico and the Language Technologies Institute at Carnegie Mellon University.