Cleo Nardo comments on LIMA: Less Is More for Alignment

Cleo Nardo 8 Jun 2023 0:22 UTC
LW: 4 AF: 2

In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases;

I’m not sure how well this metric tracks what people care about: performance on particular downstream tasks (e.g. passing a law exam, writing bugless code, automating alignment research).