Come to think of it, a suicidal AI could be a pretty big problem...
It’s probably been thought of here and other places before, but I just thought of the “Whoops AI”—a superhuman AGI that accidentally or purposefully destroys the human race, but then changes its mind and brings us back as a simulation.
There is an idea I called “eventually-Friendly AI”: an AI is given a correct but very complicated definition of human values, so complicated that it needs a lot of resources just to make heads or tails of it. In the process, it might behave rather indifferently toward everything except the problem of figuring out what its goal definition says. See the comments to this post.
This is commonly referred to as a “counterfactual” AGI.