I agree on all points. It seems “bounded utility” might be a better term than “disutility”. The main point is that a halting condition triggered by success, i.e. a system that is essentially trying to find the conditions under which it can shut itself off, seems less likely to go horribly wrong than an unbounded search for ever more utility.
This is not an attempt to solve Friendly AI. I just figure that a simple hard-coded limit on how much of anything a learning machine can want chops off a couple of avenues for disaster.
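To make the idea concrete, here is a minimal toy sketch in Python (every name in it is a hypothetical illustration, not anything from the discussion): an agent whose utility is hard-capped and which halts the moment the cap is reached, instead of optimizing without bound.

```python
UTILITY_CAP = 100.0  # hard-coded bound on how much the agent can "want"

def bounded_agent(candidate_actions, utility_of):
    """Act until accumulated utility reaches the cap, then shut off."""
    total = 0.0
    for action in candidate_actions:
        # Utility can never exceed the cap, so there is nothing to gain
        # from ever more extreme actions once the cap is in sight.
        total = min(UTILITY_CAP, total + utility_of(action))
        if total >= UTILITY_CAP:
            return "halted: success condition met"  # the shut-off trigger
    return "halted: candidate actions exhausted"

# Example: each action is worth 30 utility, so the agent stops after four.
print(bounded_agent(range(10), lambda a: 30.0))
```

The point of the sketch is just the control flow: success is a terminating condition, not a gradient to climb forever.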