The 3rd @CMUPortugal Data Science Talk highlighted Social, Cultural and Political Biases through the Lens of NLP

Ashique Khudabukhsh is a Project Scientist at the Language Technologies Institute at Carnegie Mellon University (CMU), whose current research lies at the intersection of NLP and AI for Social Impact. In this field, he takes a particular interest in analyzing globally important events in South East Asia and developing methods for noisy social media texts generated in this linguistically diverse region. Another broad focus of his research is US politics, where he devises novel methods to quantify, interpret and understand political polarization.

These were, in fact, some of the topics covered in the talk “Social, Cultural and Political Biases through the Lens of NLP”, which had three main parts, each applying NLP to a different subject:

Cultural and social biases in popular Bollywood and Hollywood movies.
The first part of the talk applied a broad range of NLP techniques to uncover subtle social and cultural biases present in popular entertainment. Beyond occupational stereotypes and gender representation, the study looked at social signals such as son preference, retrograde social practices, and bias towards lighter skin color in popular Bollywood movies spanning seven decades, and contrasted them with similar corpora of Hollywood and world movies.

The long-standing international conflict between the two nuclear adversaries India and Pakistan.
The second part of the talk examined what the speaker called hostility-diffusing, peace-seeking hope speech in the context of the 2019 India-Pakistan conflict. The research tackled several practical challenges that arise from multilingual texts and demonstrated how novel methods can effectively extend linguistic resources (e.g., content classifier, labeled examples) from a world language (e.g., English) to a low-resource language (e.g., Hindi).

The current US political crisis.
The final part of the talk presented a new methodology that offers a fresh perspective on interpreting and understanding political and ideological biases through machine translation. The data set consists of more than 85 million comments on over 200K news videos uploaded by the official YouTube channels of four major US cable news networks. Focusing on a year that saw a raging pandemic, sustained worldwide protests demanding racial justice, an election of global consequence, and a far-from-peaceful transfer of power, the research showed that the method could shed light on the deepening political division in the US.

Ashique Khudabukhsh (CMU) joined the group of speakers who have participated in the webinar series “Data Science Talks @ CMU Portugal”, organized within the framework of the Advanced Training Program in Data Science and Machine Learning of the CMU Portugal Program, which is planned to start in 2021.

If you weren’t able to watch the session, here’s the link:

Previous Data Science Talks @CMU Portugal
“Conversational Assistants for Complex Search Tasks” by Jamie Callan
“AI Learns to Race: Machine Learning for Autonomous Driving” by Eric Nyberg