The best counterargument here was presented by EY: that a superintelligent AI will easily recognise and crack the simulation from the inside. See "That Alien Message".
In my view, it may be useful to instill uncertainty in the AI that it could be in a simulation which is testing its behaviour. Rolf Nelson suggested doing this by making a public precommitment to create many such simulations before any AI is created. However, this could work only as our last line of defence, after everything else (alignment, control systems, boxing) fails.
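For a rough sense of why a cheap precommitment might move the needle, here is the naive anthropic arithmetic behind it (a sketch on my part, assuming a simple indifference prior over indistinguishable copies, which is itself contested; the symbol $N$ is mine, not part of the original proposal). If we publicly precommit to running $N$ behavioural-test simulations for every real deployment, an AI that cannot tell which case it is in should assign

$$P(\text{in a test simulation}) = \frac{N}{N+1},$$

which approaches 1 as $N$ grows, so even a modest precommitment could dominate the AI's expected cost of defecting.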
Although mildly entertaining as sci-fi, the ‘argument’ in “That Alien Message” is juvenile. Even with a ridiculous net compute advantage, massive time dilation, and completely unrealistic data-efficiency advantages, the AI still needs a massive, obvious easter egg notifying it that it is in a simulation.
Simboxing is essential because it’s the only way we can safely test alignment designs. It’s not a last line of defense; it’s a prerequisite for any practical alignment scheme for actual DL-based AGI.
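To make “testing alignment designs in a simbox” concrete, here is a minimal toy sketch (all names here, `SimWorld`, `ToyAgent`, `run_simbox`, are hypothetical illustrations, not an existing framework): the agent’s only I/O surface is a closed simulated world, every action is logged, and a behavioural check runs offline.

```python
import random
from dataclasses import dataclass, field

@dataclass
class SimWorld:
    """Closed toy environment: no network access, no real-world actuators."""
    state: int = 0
    log: list = field(default_factory=list)

    def step(self, action: str) -> int:
        self.log.append(action)  # every action is recorded for offline review
        self.state += 1 if action == "cooperate" else -1
        return self.state

class ToyAgent:
    """Stand-in for the system under test."""
    def act(self, observation: int) -> str:
        return "cooperate" if random.random() < 0.9 else "defect"

def run_simbox(agent, steps: int = 1000) -> bool:
    """Run the agent inside the sealed world and apply a behavioural check."""
    world = SimWorld()
    obs = world.state
    for _ in range(steps):
        obs = world.step(agent.act(obs))
    # Crude alignment proxy: flag agents that defect too often.
    return world.log.count("defect") / steps < 0.2

if __name__ == "__main__":
    print("passed simbox check:", run_simbox(ToyAgent()))
```

The point of the sketch is the isolation boundary: the agent’s only interface is `SimWorld.step`, so a failed behavioural check costs nothing outside the box.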
To build on this idea: it is well established that we (as a civilization) cannot secure any but the simplest software against a determined attacker. Securing software against an intelligence smarter than us is infeasible.