JanB

Karma: 826

Paper in Science: Managing extreme AI risks amid rapid progress

JanB · May 23, 2024, 8:40 AM
50 points
2 comments · 1 min read · LW link

I don’t find the lie detection results that surprising (by an author of the paper)

JanB · Oct 4, 2023, 5:10 PM
97 points
8 comments · 3 min read · LW link

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

Sep 28, 2023, 6:53 PM
187 points
39 comments · 3 min read · LW link · 1 review