I don’t think that would happen. But suppose it somehow does: the regularization is too strong and the dataset doesn’t include any examples where the camera was hacked, so our model predicts the activations of the physical diamond and the diamond image independently. What then? Let’s consider a toy model of such a scenario, one simple enough to analyze exactly. The simplest: the variable on which we condition the generation is a “uniformly” distributed scalar, to which we apply two linear transformations to predict two values (standing in for the two diamonds), and then add Gaussian noise. Given two observed values, I’m pretty sure (I didn’t actually do the math, but it seems obvious) that the reconstructed latent value is a weighted average of the values that would be inferred from either observation independently. I expect something analogous would happen in more realistic scenarios. Is this acceptable behavior? I think so, or at least much better than any known alternative. If we used an AI to optimize the world so that the answer to “Is the diamond in the vault?” is “Yes”, it would make sure that both the real diamond and the one in the image stay in place.
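To spell out that calculation (a sketch under notation I’m introducing here, taking the “uniform” prior to be an improper flat prior on the latent): write the latent scalar as $z$ and the two observations as $y_i = a_i z + \varepsilon_i$ with $\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)$, $i = 1, 2$. With a flat prior the posterior is proportional to the likelihood,

$$p(z \mid y_1, y_2) \;\propto\; \exp\!\left(-\frac{(y_1 - a_1 z)^2}{2\sigma_1^2} - \frac{(y_2 - a_2 z)^2}{2\sigma_2^2}\right),$$

and setting the derivative of the log-posterior to zero gives

$$\hat{z} \;=\; \frac{a_1 y_1/\sigma_1^2 + a_2 y_2/\sigma_2^2}{a_1^2/\sigma_1^2 + a_2^2/\sigma_2^2} \;=\; \frac{w_1 \hat{z}_1 + w_2 \hat{z}_2}{w_1 + w_2}, \qquad \hat{z}_i = \frac{y_i}{a_i}, \quad w_i = \frac{a_i^2}{\sigma_i^2}.$$

So in this linear-Gaussian case the reconstruction is exactly a precision-weighted average of the estimates each observation would give on its own, which matches the weighted-average intuition above.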