mattmacdermott

Karma: 1,180

Validating against a misalignment detector is very different to training against one

mattmacdermott
Mar 4, 2025, 3:41 PM
29 points
4 comments
4 min read

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Feb 24, 2025, 6:31 PM
44 points
15 comments
11 min read