Alex K. Chen (parrot) comments on AGI with RL is Bad News for Safety

Alex K. Chen (parrot) 25 Dec 2024 19:18 UTC
1 point
Can’t CoTs be what makes RL safe, though? (That is, if you force the reasoner to limit itself to some maximum recursion depth when it senses that the RL agent might be asking for so much that it becomes unsafe.)