A common concern around here seems to be that, without massive and delicate breakthroughs in our understanding of human values, any superintelligence will destroy all value by becoming some sort of paperclip optimizer. This is what Eliezer claims in “Value is Fragile”. Therefore, any vision of the future that manages to do better than this without requiring huge philosophical breakthroughs (in particular, a future in which we don’t know how to implement CEV before the Singularity happens) is encouraging to me as a proof of concept for how the future might be more likely to go well.
In a future where uploading minds into virtual worlds becomes possible before an AI takeover, there might well be a way to salvage quite a lot of human value with a comparatively simple utility function: create a big virtual world, upload lots of people into it, and then have the AI’s whole goal be to run this simulation for as long as possible.
This idea of “just run this program” seems a lot more robust, more likely to work, and less likely to be exploited than attempting to maximize some utility function meant to represent human values, and the result would probably be better than what would happen if the latter went wrong. I suspect it would be well within the capability of a society that can upload minds to create a virtual world for them in which the only scarce resource is computation cycles and there is no way to forcibly detain someone, so this virtual world would lack many of the problems our current world has.
This is far from a perfect outcome, of course. The AI would likely destroy everything it touches for resources, killing everyone not fortunate enough to get uploaded. And there are certainly other problems with any idea of “virtual utopia” we could come up with. But this idea gives me hope because it might be improved upon, and because it is a way for us not to lose everything even if CEV proves too hard a problem to solve before the Singularity.