Stupid question: since we already know the goal (“keep the diamond intact and in the vault”), what prevents us from bypassing the sensors and directly evaluating the AI on whether or not the diamond is in the room? Granted, this only works in simulated training, but as long as the AI can’t tell whether or not it’s in deployment (an adversarial training process might help here), that shouldn’t matter.
Since any goal we could have corresponds to a subset of the possible states of the area we care about, verifying whether or not the goal has been achieved should be easier than building the simulation the AI is trained in. Thus, evaluating the goal directly, instead of evaluating our perception of the goal, might be a viable strategy for improving the training process (unless I’ve completely misunderstood this, which is likely).
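To make the idea concrete, here is a toy sketch in Python. Everything in it (`VaultState`, `camera_reading`, the two reward functions) is a hypothetical name made up for illustration, and it assumes the simulator exposes its ground-truth state, which is only true inside simulation. It just shows how a reward computed from the true state can disagree with a reward computed from what the sensors report.

```python
from dataclasses import dataclass


@dataclass
class VaultState:
    """Ground-truth simulator state (hypothetical; only available in simulation)."""
    diamond_in_vault: bool
    camera_tampered: bool


def camera_reading(state: VaultState) -> bool:
    """What the sensor reports. Assumption: a tampered camera always shows a diamond."""
    return True if state.camera_tampered else state.diamond_in_vault


def sensor_reward(state: VaultState) -> float:
    """Reward based on our perception of the goal (the camera feed)."""
    return 1.0 if camera_reading(state) else 0.0


def ground_truth_reward(state: VaultState) -> float:
    """Reward based on the simulator's own state, bypassing the sensors."""
    return 1.0 if state.diamond_in_vault else 0.0


# The diamond is gone, but the camera has been spoofed.
stolen_but_spoofed = VaultState(diamond_in_vault=False, camera_tampered=True)
print(sensor_reward(stolen_but_spoofed))        # 1.0
print(ground_truth_reward(stolen_but_spoofed))  # 0.0
```

The two rewards only come apart when the sensors are fooled, and that is exactly the case the ground-truth check is meant to catch. Note that nothing like `ground_truth_reward` exists outside the simulation.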
The hard part is building a simulation so good that an AI transfers perfectly from the simulation to the real world. This is already extremely difficult for simple robots (I actually worked on sim-to-real transfer as an intern at OpenAI), and in general the problem gets harder the smarter your AI gets (since it can “notice” more and more possible mismatches between your simulations and reality).