Another small nitpick: the difference, if any, between proxy alignment and corrigibility isn’t explained. The concept of proxy alignment is introduced in subsection “The concept” without first defining it.
Thanks. (I put “look at your comments” on my todo list when you posted them a week ago, then totally forgot, so it’s nice to have a reminder.)
Instead of “always go left”, how about “always go along one wall”?
With respect to proxy vs. corrigibility, I’ll have to check whether I had a good reason to use both terms there; right now it seems like introducing corrigibility is unnecessary. I don’t think there is a difference.
Instead of “always go left”, how about “always go along one wall”?
Yeah, maybe better, though still doesn’t quite capture the “backing up” part of the algorithm. Maybe “I explore all paths through the maze, taking left-hand turns first, backing up if I reach a dead end”… that’s a bit verbose though.
though still doesn’t quite capture the “backing up” part of the algorithm
It doesn’t? Isn’t it exactly the same, at least provided the wall is topologically connected? I believe in the example I’ve drawn, going along one wall is identical to depth first search.
Edit: or do you just mean that even though you take the same steps, the two feel different because retreating is not the same as going further along the wall?
Gotcha
Yeah, this — I now see what you were getting at!
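For anyone who wants to check the "going along one wall is the same as depth first search" claim concretely, here is a minimal Python sketch. It is not the maze drawn in the post: the grid `MAZE`, the start heading, and all function names are made up for illustration. The maze is assumed to have no loops (one connected wall), which is the condition mentioned above. It records every step of a left-hand wall follower and of a depth-first search that tries left turns first and backs up at dead ends, then compares the two walks.

```python
# Headings in clockwise order, so turning left is index - 1, right is index + 1.
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # N, E, S, W (rows grow downward)

# Hypothetical loop-free maze: '#' is wall, 'S' start, 'G' goal.
MAZE = [
    "#######",
    "#S..#.#",
    "#.#.#.#",
    "#.#...#",
    "#.###G#",
    "#######",
]


def find(ch):
    """Return the (row, col) of the first cell containing ch."""
    for r, row in enumerate(MAZE):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(f"{ch!r} not in maze")


def is_open(cell):
    r, c = cell
    return MAZE[r][c] != "#"


def step(cell, d):
    return (cell[0] + DIRS[d][0], cell[1] + DIRS[d][1])


def turn_order(heading):
    """Directions to try, relative to heading: left, straight, right, back."""
    return [(heading + k) % 4 for k in (-1, 0, 1, 2)]


def wall_follower(start, goal, heading):
    """Keep the left hand on the wall; assumes the goal is reachable."""
    walk, cell = [start], start
    while cell != goal:
        for d in turn_order(heading):
            nxt = step(cell, d)
            if is_open(nxt):
                cell, heading = nxt, d
                walk.append(cell)
                break
        else:
            raise ValueError("start cell is walled in")
    return walk


def dfs_left_first(start, goal, heading):
    """Depth-first search, left turns first, recording the backing-up steps."""
    walk, visited = [start], {start}

    def visit(cell, heading):
        if cell == goal:
            return True
        for d in turn_order(heading):
            nxt = step(cell, d)
            if is_open(nxt) and nxt not in visited:
                visited.add(nxt)
                walk.append(nxt)
                if visit(nxt, d):
                    return True
                walk.append(cell)  # branch was a dead end: back up one cell
        return False

    visit(start, heading)
    return walk


start, goal = find("S"), find("G")
wall_walk = wall_follower(start, goal, heading=2)  # start facing south
dfs_walk = dfs_left_first(start, goal, heading=2)
print(wall_walk == dfs_walk)
```

In this maze both walks include the detour into the dead end at the top right and the retreat out of it, so the steps match even though one description says "backing up" and the other says "keep following the wall". With loops in the maze the `visited` set would make the search diverge from the wall follower, so the sketch only speaks to the loop-free case.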