Yep. Good post. Important stuff. I think we’re still struggling to understand all of this fully, and work on indifference seems like the most relevant stuff.
My current take is that as long as any “black-box” part of the algorithm is optimizing for performance, the algorithm may end up behaving like an optimizer_2, since the black box can pick up on arbitrary effective strategies.
(In partial reply to Rohin below): I wouldn’t necessarily say that such an algorithm knows about its environment (i.e. has a good model); it may simply have stumbled upon an effective strategy for interacting with it (i.e. have a good policy).