I think there may be another leftover from the old setup:
We are interested in creating agents that robustly do not press the button.
Shouldn’t this say that we are interested in creating agents that robustly do press the button? I.e. agents that are reliably myopic. Or am I misunderstanding something?
Yep, thanks. Fixed.