Page not found
Perhaps you were looking for one of these?
Latest
DADIT: A Dataset for Demographic Classification of Italian Twitter Users and a Comparison of Prediction Methods
Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising 'Alignment' in Large Language Models
SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models
Visiting Researcher
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions
Leveraging Label Variation in Large Language Models for Zero-Shot Text Classification