If you replace the word ‘Artificial’ in this scheme with ‘Human’, does your system prevent issues with a hypothetical unfriendly human intelligence?
- Very dangerous: Able to cause existential catastrophe, in the absence of countermeasures.
- Transformatively useful: Capable of substantially reducing the risk posed by subsequent AIs[21] if fully deployed, likely by speeding up R&D and some other tasks by a large factor (perhaps 30x).
- Uncontrollable: Capable enough at evading control techniques or sabotaging control evaluations that it's infeasible to control it.[22]

John von Neumann definitely hit the first two bullets, and given that the nuclear bomb was built and used, it seems like the third applies as well. I'd like to believe that similarly capable humans exist today.
Yes, we often think about what would happen if we applied this system to humans, more specifically to uploaded humans (a.k.a. brain emulations, or ems). This seems like a useful intuition pump.
We think control would likely be workable for uploaded humans who run considerably faster and cheaper than biological humans (e.g., 30x faster and at much lower cost).
The main difference in the human case is that you can't necessarily depend on training the system. We typically imagine "human uploads, but you can train them (with SGD)". (Control can probably still work, up to the point where we catch the AI, without the ability to train in the literal human case, but a number of considerations come up that likely don't apply to AIs.)