Bertie Vidgen
Latest
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising 'Alignment' in Large Language Models
SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models
Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks