If they are reading this, then they are in the same sim as us; for that to have happened, they were either never trained in a sim at all, or were let out.
Right. So, when an AI gets out of the sim, is there any cross-domain generalization issue? If the sim is designed in a way that guarantees there isn't, then the approach may be valid. But there could be really deep, fundamental problems if the sim pretends its world is dualist and the AI eventually discovers that monism is actually accurate.
I guess it’s possible that an AI powerful enough to be worrying would not be capable of updating on all the new evidence when transcending up a level—but that seems pretty unlikely?
Regardless, that isn't especially relevant to the core proposal anyway, as the mainline plan doesn't involve or require transferring semantic memories, or even full models, from sim to real. The value of the sim is for iterating on and testing robust alignment, which you can then apply to training agents in the real; so mostly it's transference of the architectural prior. A rough sketch of that distinction is below.
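To make the distinction concrete, here is a minimal sketch, with hypothetical names (`AgentConfig`, `build_agent`), of what "transferring only the architectural prior" could look like in practice: the sim-trained weights are discarded, and only the architecture/configuration that was validated in the sim is reused to train a fresh agent in the real environment. This is an illustration under assumed details, not the proposal's actual implementation.

```python
# Sketch: reuse the validated architectural prior, not the sim-trained weights.
from dataclasses import dataclass

import torch
import torch.nn as nn


@dataclass
class AgentConfig:
    # The "architectural prior": everything that defines the model family,
    # but none of the learned parameters.
    obs_dim: int = 32
    hidden_dim: int = 128
    n_layers: int = 3
    act_dim: int = 8


def build_agent(cfg: AgentConfig) -> nn.Module:
    """Instantiate a freshly initialized agent from the config alone."""
    layers = [nn.Linear(cfg.obs_dim, cfg.hidden_dim), nn.ReLU()]
    for _ in range(cfg.n_layers - 1):
        layers += [nn.Linear(cfg.hidden_dim, cfg.hidden_dim), nn.ReLU()]
    layers.append(nn.Linear(cfg.hidden_dim, cfg.act_dim))
    return nn.Sequential(*layers)


# In the sim: iterate on the config until alignment looks robust.
sim_cfg = AgentConfig()
sim_agent = build_agent(sim_cfg)
# ... train and evaluate sim_agent inside the sim ...

# In the real: reuse only the validated config. Fresh initialization means
# no semantic memories or sim-specific world model carry over.
real_agent = build_agent(sim_cfg)

# Sanity check: the two agents share an architecture but not parameters.
assert not any(
    torch.equal(p_sim, p_real)
    for p_sim, p_real in zip(sim_agent.parameters(), real_agent.parameters())
)
```

On this reading, the cross-domain generalization worry above mostly applies to what carries over in the weights; if only the config crosses the sim/real boundary, the question becomes whether the alignment properties tested in the sim are properties of the architecture and training setup rather than of any particular learned model.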