Peter S. Park

Karma: 145

AI Deception: A Survey of Examples, Risks, and Potential Solutions

Aug 29, 2023, 1:29 AM
54 points
3 comments · 10 min read · LW link

AI can exploit safety plans posted on the Internet

Peter S. Park · Dec 4, 2022, 12:17 PM
−15 points
4 comments · 1 min read · LW link

The limited upside of interpretability

Peter S. Park · Nov 15, 2022, 6:46 PM
13 points
11 comments · 1 min read · LW link

Why do we post our AI safety plans on the Internet?

Peter S. Park · Nov 3, 2022, 4:02 PM
4 points
4 comments · 11 min read · LW link