That is all true in principle, but in practice it’s very common that one of the two is not feasible. For example, suppose you have a computer. You can program it to tell you when it’s reading from the hard drive or communicating over the network, say by blinking an LED. If the program has a bug (e.g., it’s not the kind of AI you wanted to build), you might not be notified. But if you use a soldering iron to electrically link the LED to the relevant wires, it seems to most users that no possible programming bug can make the LED not light up when it should.
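To make the software side of that concrete, here is a minimal sketch of the “program it to blink an LED” version, assuming Linux (it polls /proc/diskstats and /proc/net/dev) and a hypothetical set_led() stand-in for whatever actually drives the LED. The point is that every piece of this is just code, so a bug anywhere in it can silently keep the LED dark:

```python
import time

def read_counters():
    # Sectors read across all block devices (field 6 of /proc/diskstats).
    with open("/proc/diskstats") as f:
        disk = sum(int(line.split()[5]) for line in f)
    # Bytes received + transmitted across all interfaces (/proc/net/dev).
    net = 0
    with open("/proc/net/dev") as f:
        for line in f.readlines()[2:]:          # skip the two header lines
            fields = line.split(":", 1)[1].split()
            net += int(fields[0]) + int(fields[8])
    return disk, net

def set_led(on):
    # Hypothetical stand-in for real LED control (e.g., a GPIO write).
    print("LED ON" if on else "LED off")

prev = read_counters()
while True:
    time.sleep(0.5)
    cur = read_counters()
    set_led(cur != prev)   # any change means disk or network activity happened
    prev = cur
```

The soldered version has no equivalent of this loop to get wrong, which is exactly the appeal.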
Of course, that’s like the difference between programming a robot to stay in a pen and locking the gate. With the gate locked, it looks like no bug you could introduce in the robot’s software could cause it to leave. Which ignores the fact that the robot might learn to climb the fence, make a key, or convince someone else (or hack an outside robot) to unlock the gate.
I think most people would detect the dangers in the robot case (because they can imagine themselves finding a way to escape), but be confused by the AI-in-the-box one (simply because it’s harder to imagine yourself as software, and even if you manage to, far fewer ideas come to mind, since you’re not used to being software).
Hell, most people probably won’t even have the reflex to imagine themselves in place of the AI. My brain reflexively tells me “I can’t write a program to control that LED, so even if there’s a bug it won’t happen”. If instead I force myself to think “How would I do that if I were the AI?”, it’s easier to find potential solutions, and it also becomes more obvious that someone else might find one. But that may be because I’m a programmer; I’m not sure whether it applies to others.