Nathaniel comments on A Shutdown Problem Proposal

Nathaniel 22 Jan 2024 18:10 UTC
I think the initial (2-agent) model has only two time steps, i.e. a single opportunity for the button to be pressed. The goal is just for the agent to be corrigible at that one button-press opportunity.