Scott A. Hale
Latest
The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising 'Alignment' in Large Language Models
SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values