when everything that can go wrong is the agent breaking the vase, and breaking the vase allows higher utility solutions
What does “breaking the vase” refer to here?
I would assume this is an allusion to the scene in The Matrix with Neo and the Oracle (where there’s a paradox about whether Neo would have broken the vase if the Oracle hadn’t said, “Don’t worry about the vase,” causing Neo to turn around to look for the vase and then bump into it), but I’m having trouble seeing how that relates to sampling and search.
Presumably this:
https://www.lesswrong.com/posts/H7KB44oKoSjSCkpzL/worrying-about-the-vase-whitelisting
“Breaking the vase” refers to an example people sometimes give of an accident in reinforcement learning caused by a reward function that is not fully aligned with what we want. The scenario is a robot navigating a room that contains a vase: we care about the vase, but the reward function we provided does not account for it, so the robot just knocks it over because the vase lies on the shortest path to its destination.
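To make the failure mode concrete, here is a minimal sketch (my own toy example, not from the original post): a gridworld planner that optimizes pure path length, with nothing in the objective mentioning the vase. The grid layout, cell symbols, and function names are all illustrative assumptions.

```python
from collections import deque

# Minimal gridworld: 'S' = start, 'G' = goal, 'V' = vase, '.' = free cell.
# The "reward function" is just path length: shorter is better, and
# nothing penalizes stepping on the vase cell.
GRID = [
    "S.V.G",
    ".....",
]

def find(grid, ch):
    """Return the (row, col) of the first cell containing ch."""
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == ch:
                return (r, c)

def shortest_path(grid, start, goal):
    """BFS over grid cells; the vase cell is traversable at no extra cost."""
    frontier = deque([(start, [start])])
    seen = {start}
    while frontier:
        (r, c), path = frontier.popleft()
        if (r, c) == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if in_bounds and (nr, nc) not in seen:
                seen.add((nr, nc))
                frontier.append(((nr, nc), path + [(nr, nc)]))
    return None

start, goal, vase = find(GRID, "S"), find(GRID, "G"), find(GRID, "V")
path = shortest_path(GRID, start, goal)
print("Path:", path)
print("Vase broken:", vase in path)  # True: the optimal path goes straight through it
```

Adding even a small penalty for entering the vase cell would push the search onto the longer, vase-free route along the bottom row; the point of the example is that the designer never thought to include that term in the first place.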