One issue here is that worlds with an “almost-friendly” AI (one whose friendliness was botched in some respect) may end up looking like siren or marketing worlds.
In that case, worlds as bad as sirens will be rather too common in the search space (since AIs with botched friendliness are more likely than AIs with true friendliness), and a satisficing approach won't work.
Interesting thought there...