there are clearly some training setups that seem more dangerous than other training setups . . . .
Like, as an example, my guess is that systems where a substantial chunk of the compute was spent on training with reinforcement learning in environments that reward long-term planning and agentic resource acquisition (e.g. many video games, Diplomacy, or various simulations with long-term objectives) sure seem more dangerous.
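To make that kind of setup concrete, here is a minimal, purely illustrative sketch (plain Python; every name, environment, and constant in it is an assumption of mine, not anything from this discussion): a REINFORCE loop on a toy environment whose only reward comes at the end of a long episode and grows with the resources the agent has accumulated, so the training signal directly selects for long-horizon planning and resource acquisition.

```python
import math
import random

# Hypothetical toy setup: every name here (ResourceEnv, "gather"/"invest"/"idle",
# the reward constants) is illustrative. The point is only to show the shape of
# the flagged setup: RL in an environment whose reward arrives at the end of a
# long episode and scales with the resources the agent has accumulated.
class ResourceEnv:
    ACTIONS = ["gather", "invest", "idle"]

    def __init__(self, horizon=50):
        self.horizon = horizon

    def run_episode(self, policy):
        resources, invested = 1.0, 0.0
        actions = []
        for _ in range(self.horizon):
            a = policy.sample()
            actions.append(a)
            if a == "gather":
                resources += 1.0                 # immediate, linear gain
            elif a == "invest":
                invested += 0.1 * resources      # pays off only at episode end
        # Sparse terminal reward: long-horizon credit assignment is required
        # to learn that gathering and investing beat idling.
        return resources + 5.0 * invested, actions


class SoftmaxPolicy:
    """Stateless softmax policy: one logit per action."""

    def __init__(self):
        self.logits = {a: 0.0 for a in ResourceEnv.ACTIONS}

    def probs(self):
        exps = {a: math.exp(l) for a, l in self.logits.items()}
        z = sum(exps.values())
        return {a: e / z for a, e in exps.items()}

    def sample(self):
        r, cum = random.random(), 0.0
        for a, p in self.probs().items():
            cum += p
            if r <= cum:
                return a
        return ResourceEnv.ACTIONS[-1]


def train(episodes=2000, lr=1e-3):
    env, policy = ResourceEnv(), SoftmaxPolicy()
    baseline = 0.0
    for _ in range(episodes):
        ret, actions = env.run_episode(policy)
        baseline = 0.95 * baseline + 0.05 * ret    # running-mean baseline
        advantage = ret - baseline
        probs = policy.probs()                     # policy was fixed during the episode
        grads = {b: 0.0 for b in ResourceEnv.ACTIONS}
        for a in actions:                          # d log pi(a) / d logit_b = 1[b == a] - p_b
            for b in ResourceEnv.ACTIONS:
                grads[b] += (1.0 if b == a else 0.0) - probs[b]
        for b in ResourceEnv.ACTIONS:              # REINFORCE step, gradient averaged over the episode
            policy.logits[b] += lr * advantage * grads[b] / len(actions)
    return policy


if __name__ == "__main__":
    random.seed(0)
    print(train().probs())   # expect "idle" to be driven toward low probability
```

The load-bearing design choice in the sketch is the sparse terminal reward: the optimizer only gets signal for whole-episode strategies, so it selects directly for the long-term, resource-accumulating behaviour flagged above, rather than for any step-by-step behaviour.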
Any recommended reading on which training setups are safer? If none exists, someone should really write this up.