Btw was your goal to show me the link or to learn whether I have seen it before? If the former, then I don’t need to respond. If the latter, then you want my response I guess.
It was partially to point out that you can get self-modification hazards with a substantially less complex setup than your proposal with a little hand-engineering of the agents; since none of the AI safety gridworld problems could be said to be rigorously solved, there’s no need for more realistic self-modification environments.
Have you seen “AI Safety Gridworlds”, Leike et al 2017?
I haven’t, thanks.
Btw was your goal to show me the link or to learn whether I have seen it before? If the former, then I don’t need to respond. If the latter, then you want my response I guess.
The “Whisky and Gold” environment is particularly relevant
It was partially to point out that you can get self-modification hazards with a substantially less complex setup than your proposal with a little hand-engineering of the agents; since none of the AI safety gridworld problems could be said to be rigorously solved, there’s no need for more realistic self-modification environments.