Singularian2501

Karma: 9

I like reading machine learning papers.

Paper: Identifying the Risks of LM Agents with an LM-Emulated Sandbox—University of Toronto 2023 - Benchmark consisting of 36 high-stakes tools and 144 test cases!

Singularian2501 · Oct 9, 2023, 12:00 AM

6 points

0 comments · 1 min read · LW link

RAIN: Your Language Models Can Align Themselves without Finetuning—Microsoft Research 2023 - Reduces the adversarial prompt attack success rate from 94% to 19%!

Singularian2501 · Sep 24, 2023, 4:48 PM

5 points

0 comments · 1 min read · LW link