Data Science Talks @CMU Portugal: Social, Cultural and Political Biases through the Lens of NLP by Ashique KhudaBukhsh (CMU)

As part of its Advanced Training Program in Data Science and Machine Learning, which is planned to start in 2021, the CMU Portugal Program is organizing a series of webinars entitled “Data Science Talks @CMU Portugal”.

The third talk will take place on February 23rd, from 2-3 p.m. (Lisbon) / 9-10 a.m. (Pittsburgh), with Ashique KhudaBukhsh, Project Scientist at the Language Technologies Institute, Carnegie Mellon University (CMU), under the theme “Social, Cultural and Political Biases through the Lens of NLP”.

Registration is free but mandatory.


In this talk, Ashique KhudaBukhsh will summarize three broad lines of NLP research: cultural and social biases in popular Bollywood and Hollywood movies, the long-standing international conflict between the nuclear adversaries India and Pakistan, and the current US political crisis.

  1. The first part of the talk applies a broad range of NLP techniques to uncover subtle social and cultural biases present in popular entertainment. Beyond occupational stereotypes and gender representation, our study looks at subtler social signals such as preference for sons, retrograde social practices, and bias towards lighter skin color. We consider a substantial corpus of popular Bollywood movies spanning seven decades and contrast our findings with similar corpora of Hollywood and world movies.
  2. The second part of the talk examines what we term hostility-diffusing, peace-seeking hope speech in the context of the 2019 India-Pakistan conflict. In doing so, we tackle several practical challenges that arise from multilingual texts and demonstrate how novel methods can effectively extend linguistic resources (e.g., content classifiers, labeled examples) from a world language (e.g., English) to a low-resource language (e.g., Hindi).
  3. The final part of the talk presents a new methodology that offers a fresh perspective on interpreting and understanding political and ideological biases through machine translation. Our data set consists of more than 85 million comments on over 200K news videos uploaded by the official YouTube channels of four major US cable news networks. Focusing on a year that saw a raging pandemic, sustained worldwide protests demanding racial justice, an election of global consequence, and a far-from-peaceful transfer of power, we show how our method can shed light on the deepening political divide in the US.

Ashique KhudaBukhsh

Ashique KhudaBukhsh is currently a Project Scientist at the Language Technologies Institute, Carnegie Mellon University (CMU), mentored by Prof. Tom Mitchell. Prior to this role, he was a postdoc mentored by Prof. Jaime Carbonell at CMU. His PhD thesis (Computer Science Department, CMU, also advised by Prof. Jaime Carbonell) focused on distributed active learning. His current research lies at the intersection of NLP and AI for Social Impact. In this field, he is interested in analyzing globally important events in South East Asia and developing methods for noisy social media texts generated in this linguistically diverse region. His other broad research focus is US politics; in this area, his research involves devising novel methods to quantify, interpret, and understand political polarization.

Should you have any questions regarding registration, please contact

Start date: Feb. 23, 2021, 2:00 pm

End date: Feb. 23, 2021, 3:00 pm