I do think that either compile-time flags for the AI system, or a second 'monitor' system chained to the AI system to enforce the named rules, would probably limit the damage.
The broader point is that probabilistic AI safety is likely a much more tractable problem than absolute AI safety, for a lot of reasons. To extend the nuclear analogy, emergency shutdown is probably a viable safety measure for a lot of the plausible 'paperclip maximizer turns us into paperclips' scenarios.
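For what it's worth, the 'chained monitor' I have in mind is roughly this kind of wrapper: a separate process that reads the AI's state, checks it against the named rules, and triggers an emergency shutdown on any violation. This is a minimal sketch only; `NanotoasterAI`, `RULES`, `report_state`, and `halt` are all made-up placeholders, not a real API.

```python
# Sketch of a 'chained monitor': a second system that watches the primary AI
# and forces an emergency shutdown when a named rule is violated.
# All names here are hypothetical placeholders.

RULES = [
    lambda state: state["resource_usage"] < 0.9,   # don't grab all available resources
    lambda state: not state["self_modification"],  # don't rewrite your own code
]

def emergency_shutdown(ai):
    """Hard stop: kill the process / cut power, no negotiation with the AI."""
    ai.halt()

def monitor_step(ai):
    # The monitor reads state from the AI but never takes commands from it.
    state = ai.report_state()
    if not all(rule(state) for rule in RULES):
        emergency_shutdown(ai)
        return False
    return True
```

The key design point is the one-way dependency: the monitor can stop the AI, but the AI has no channel to reconfigure the monitor.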
“I need to disconnect the AI safety monitoring robot from my AI-enabled nanotoaster robot prototype because it keeps deactivating it” might still be the last words a human ever speaks, but hey, we tried.
There seems to be a complexity limit to what humans can build. A full GAI is likely to be somewhere beyond that limit.
The usual solution to that problem (see EY's fooming scenario) is to make the process recursive: let a mediocre AI improve itself, and as it gets better it can improve itself more rapidly. Exponential growth can go fast and far.
This, of course, gives rise to another problem: you have no idea what the end product is going to look like. If you’re looking at the gazillionth iteration, your compiler flags were probably lost around the thousandth iteration and your chained monitor system mutated into a cute puppy around the millionth iteration...
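A back-of-the-envelope illustration of why exponential self-improvement swamps any fixed safeguard; the numbers below are arbitrary toy values, and nothing here models a real system, only the shape of the curve:

```python
# Toy model of recursive self-improvement: each iteration, capability grows
# in proportion to current capability. Purely illustrative numbers.

capability = 1.0          # mediocre starting AI
improvement_rate = 0.01   # 1% gain per self-modification cycle

for iteration in range(1, 2001):
    capability *= (1 + improvement_rate)
    if iteration in (1000, 2000):
        print(f"iteration {iteration}: capability {capability:.2e}x the original")

# iteration 1000: capability 2.10e+04x the original  (compiler flags long gone)
# iteration 2000: capability 4.39e+08x the original
```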
Probabilistic safety systems are indeed more tractable, but that’s not the question. The question is whether they are good enough.
I hadn’t thought about it that way.