Shouldn’t we be able to rule out at least some classes of scenarios? For instance, paperclip maximization seems like an unlikely CEV output.
We can most likely rule out the scenarios that all humans agree are bad. So better than clippy, probably.
But we really need a better model of what CEV does! Then we can start to talk sensibly about it.