I was talking about ELK in a group, and the working example of the SmartVault and the robber ended up being a point of confusion for us. Intuitively, it seems like the robber is an external, adversarial agent who tries to get around the SmartVault. However, what we probably care about in practice would be how a human could be fooled by an AI, not by some other adversary. Furthermore, it seems that whether the robber decides to cover up his theft of the diamond by putting up a screen depends solely on the actions of the AI. Does this imply that the robber is “in cahoots” with the AI in this situation (e.g. the AI projects a video onto the wall instructing the robber to put up a screen)? This seems a bit strange and complicated.
Instead, we might consider the situation in which the AI controls a SmartFabricator, which we want to arrange carbon atoms into diamonds. We might then imagine that it instead fabricates a screen to put in front of the camera, or makes a fake diamond. This wouldn’t require the existence of an external “robber” agent. Does the SmartVault scenario have helpful aspects that the SmartFabricator example lacks?
The SmartFabricator seems basically the same. In the robber example, you might imagine the SmartVault is the one that puts up the screen to conceal the fact that it let the diamond get stolen.
I suppose there are a number of examples that work, but I think the robber and vault give the scenario useful breadth.
The following is just my interpretation of it, so take it with a grain of salt. To me, the robber and vault enable a few options: the AI can be passively lying or actively concealing. If the robber comes in, gets past the AI’s defenses, and takes the diamond in a way the human observer can’t notice, then the AI has the option of passively lying. That is, the AI tried its best to stop the robber and failed, but then chose to lie about it so that, as far as the humans know, it protected the diamond and still gets the reward.
Alternatively, the AI could actively conceal the outcome. The AI could try its best and fail to stop the robber, and then do some trickery to make it look like it did actually stop the robber. Or the AI could not bother stopping the robber and just focus on making it look like the diamond is still there. Here the AI is playing a more active role in concealing the outcome.
None of these scenarios require coordination from the robber. To me, the robber is just there to rob a sophisticated vault and make it look like they were never there. So the robber might cover up cameras or do other tampering so it looks like they were never there.
I think this is more flexible than your fabricator example. There the AI can’t really play a passive role; it’s either concealing or not. But you could probably demonstrate the things ARC is looking at here with the fabricator example too, I would think.
Like I said, just my interpretation, so I may be misunderstanding the intent or other nuances.