Human values are fairly complex and fragile. Most of what we value is concentrated around points in mind design space similar to our own. We should expect a randomly generated AI not to be a good successor. Any good successor would have to result from some process that approximately copies our values: rerunning evolution to create beings with values similar to ours, say, or an attempt at alignment that almost worked.
I’m not sure what simulating our civilization is supposed to achieve. If it worked, we would get beings who were basically human. That would double the population and get you some digital minds, but much the same could be achieved by developing mind uploading and a pro-natal culture. Neither will greatly help us build an aligned superintelligence, or stop people building an unaligned one.
On the partially aligned AI: this just means we don’t need to get AI perfectly aligned for the future to be good, but the closer we get, the better it gets. An AI running a hard-coded set of moral rules won’t be as good as one that lets us keep thinking about what we want to do, but if those rules are chosen well, they could still capture most of human value (e.g. CelestAI from Friendship is Optimal).