NLP | MilaNLP Lab @ Bocconi University

The Importance of Modeling Social Factors of Language: Theory and Practice

Natural language processing (NLP) applications are now more powerful and ubiquitous than ever before. With rapidly developing (neural) models and ever-more available data, current NLP models have access to more information than any human speaker …

Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction

We investigate grounded language learning through real-world data, by modelling a teacher-learner dynamics through the natural interactions occurring between users and search engines.

FEEL-IT: Emotion and Sentiment Classification for the Italian Language

Sentiment analysis is a common task to understand people's reactions online. Still, we often need more nuanced information: is the post negative because the user is angry or because they are sad? An abundance of approaches has been introduced for …

MilaNLP @ WASSA: Does BERT Feel Sad When You Cry?

The paper describes the MilaNLP team’s submission (Bocconi University, Milan) in the WASSA 2021 Shared Task on Empathy Detection and Emotion Classification. We focus on Track 2 - Emotion Classification - which consists of predicting the emotion of …

Beyond Black & White: Leveraging Annotator Disagreement via Soft-Label Multi-Task Learning

Supervised learning assumes that a ground truth label exists. However, the reliability of this ground truth depends on human annotators, who often disagree. Prior work has shown that this disagreement can be helpful in training models. We propose a …

Universal Joy A Data Set and Results for Classifying Emotions Across Languages

While emotions are universal aspects of human psychology, they are expressed differently across different languages and cultures. We introduce a new data set of over 530k anonymized public Facebook posts across 18 languages, labeled with five …

BERTective: Language Models and Contextual Information for Deception Detection

Spotting a lie is challenging but has an enormous potential impact on security as well as private and public safety. Several NLP methods have been proposed to classify texts as truthful or deceptive. In most cases, however, the target texts’ …

Cross-lingual Contextualized Topic Models with Zero-shot Learning

We introduce a novel topic modeling method that can make use of contextulized embeddings (e.g., BERT) to do zero-shot cross-lingual topic modeling.

Text Analysis in Python for Social Scientists – Discovery and Exploration

Text is everywhere, and it is a fantastic resource for social scientists. However, because it is so abundant, and because language is so variable, it is often difficult to extract the information we want. There is a whole subfield of AI concerned …

Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview

An increasing number of natural language processing papers address the effect of bias on predictions, introducing mitigation techniques at different parts of the standard NLP pipeline (data and models). However, these works have been conducted …