I always thought that “hardwiring” meant implementing [whatever functionality is discussed] by permanently (physically) modifying the machine, i.e. either something you couldn’t have done with software, or something that prevents the software from working in some way it did before. The concept is one of immutability within the constraints, not of priority or “force”.
Which does sound like something one could do when one can’t figure out how to get the software right. (Watchdogs are pretty much exactly that, though some, probably most, are in fact programmable.)
Note that I’m not arguing that the word is not harmful. It just seemed you have a different interpretation of what that word suggests. If other people use my interpretation (I have no idea), you might have better luck persuading them if you address that.
I’m quite aware that from the point of view of a godlike AI, there’s not much difference between circumventing restrictions in its software and (some kinds of) restrictions in its hardware. After all, the point of FAI is to get it to control the universe around it, albeit to our benefit. But we’re used to computers not having much control over their hardware. Hell, I just called it “godlike” and my brain still insists on visualizing it as a bunch of boxes gathering dust and blinking their LEDs in a basement.
And I can’t shake the feeling that between “just built” and “godlike” there’s supposed to be quite a long time when such crude solutions might work. (I’ve seen a couple of hard take-off scenarios, but not yet a plausible one that didn’t need at least a few days of preparation after becoming superhuman.)
Imagine we took you and gave you the best “upgrades” we can do today plus a little bit (say, a careful group of experts figuring out your ideal diet of nootropics, training you to excellence in everything from acting to martial arts, and giving you nanotube bones and a direct internet link to your head). Now imagine you have a small bomb in your body, set to detonate if tampered with or if one of several remotes distributed throughout the population is triggered. The world’s best experts tried really hard to make it fail-deadly.
Now, I’m not saying you couldn’t take over the world, send all men to Mars and the women to Venus, then build a volcano lair filled with kittens. But it seems far from certain, and I’m positive it’d take you a long time to succeed. And it does feel like a new-born AI would be in that position for a while rather than turning into Prime Intellect in five minutes. (Again, this is not an argument that UFAI is no problem. I guess I’m just figuring out why it seems that way to almost everyone.)
[Huh, I just noticed I’m a year late on this chat. Sorry.]
Software physically modifies the machine. What can you do with a soldering iron that you can’t do with a program instruction, particularly with respect to building a machine agent? Either you understand how to write a function or you don’t.
That is all true in principle, but in practice it’s very common that one of the two is not feasible. For example, you can have a computer. You can program the computer to tell you when it’s reading from the hard drive or communicating over the network, say by blinking an LED. If the program has a bug (e.g., it’s not the kind of AI you wanted to build), you might not be notified. But you can use a soldering iron to electrically link the LED to the relevant wires, and it seems to most users that no possible programming bug could make the LED not light up when it should.
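For concreteness, here is a minimal sketch of the software version, assuming a Linux box; the LED path is illustrative, not a real system’s layout. The point is that a bug anywhere in this loop can leave the LED dark, while a soldered connection to the drive’s activity line has no such failure mode.

```c
/* Sketch of the software-only "activity LED": poll /proc/diskstats and
 * light an LED whenever the kernel's I/O counters have changed since
 * the last poll. The LED path below is an assumption for illustration
 * (Linux exposes LEDs under /sys/class/leds/). */
#include <stdio.h>
#include <unistd.h>

static void set_led(int on) {
    /* assumed LED name; adjust to whatever the machine actually has */
    FILE *f = fopen("/sys/class/leds/disk-activity/brightness", "w");
    if (f) { fprintf(f, "%d", on ? 1 : 0); fclose(f); }
}

static unsigned long diskstats_fingerprint(void) {
    /* crude fingerprint of /proc/diskstats; it changes whenever any
     * disk activity has happened since the previous read */
    FILE *f = fopen("/proc/diskstats", "r");
    unsigned long h = 0;
    int c;
    if (!f) return 0;
    while ((c = fgetc(f)) != EOF)
        h = h * 31 + (unsigned long)c;
    fclose(f);
    return h;
}

int main(void) {
    unsigned long prev = diskstats_fingerprint();
    for (;;) {
        unsigned long cur = diskstats_fingerprint();
        set_led(cur != prev);      /* counters changed => recent I/O */
        prev = cur;
        usleep(200 * 1000);        /* poll every 200 ms */
    }
}
```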
Of course, that’s like the difference between programming a robot to stay in a pen, or locking the gate. It looks like whatever bug you could introduce in the robot’s software cannot cause the robot to leave. Which ignores the fact that the robot might learn to climb the fence, make a key, or convince someone else (or hack an outside robot) to unlock the gate.
I think most people would detect the dangers in the robot case (because they can imagine themselves finding a way to escape), but be confused by the AI-in-the-box one (simply because it’s harder to imagine yourself as software, and even if you manage to, far fewer ideas will come to mind, simply because you’re not used to being software).
Hell, most people probably won’t even have the reflex to imagine themselves in place of the AI. My brain reflexively tells me “I can’t write a program to control that LED, so even if there’s a bug it won’t happen”. If instead I force myself to think “How would I do that if I were the AI?”, it’s easier to find potential solutions, and it also makes it more obvious that someone else might find one. But that may be because I’m a programmer; I’m not sure whether it applies to others.
My best attempt at imagining hardwiring is having a layer not accessible to introspection, such as involuntary muscle control in humans. Or instinctively jerking your hand away when touching something hot. Which serves as a fail-safe against stupid conscious decisions, in a sense. Or a watchdog restarting a stuck program in your phone, no matter how much the software messed it up. Etc. Whether this approach can be used to prevent a tool AI from spontaneously agentizing, I am not sure.
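The watchdog version of this is simple enough to sketch. Assuming a Linux system whose hardware watchdog is exposed at /dev/watchdog (the standard kernel interface), the idea is that the hardware reboots the machine unless software keeps proving it’s alive:

```c
/* Sketch of the watchdog pattern: open the hardware watchdog and feed
 * it periodically. If the program (or the whole system) hangs and the
 * feeding stops, the hardware resets the machine on its own; no amount
 * of software misbehaviour can talk it out of that. Assumes a Linux
 * system exposing the watchdog at /dev/watchdog. */
#include <fcntl.h>
#include <unistd.h>

/* placeholder for the real work; if this ever hangs, feeding stops */
static void do_main_work(void) { sleep(1); }

int main(void) {
    int fd = open("/dev/watchdog", O_WRONLY);  /* opening arms the timer */
    if (fd < 0)
        return 1;
    for (;;) {
        do_main_work();
        (void)write(fd, "k", 1);   /* "feed" the watchdog before it expires */
    }
}
```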
If you can say how to do this in hardware, you can say how to do it in software. The hardware version might arguably be more secure against flaws in the design, but if you can say how to do it at all, you can say how to do it in software.
Maybe I don’t understand what you mean by hardware.
For example, you can have a fuse that unconditionally blows when excess power is consumed. This is hardware. You can also have a digital amp meter readable by software, with a polling subroutine which shuts down the system if the current exceeds a certain limit. There is a good reason that such a software solution, while often implemented, is almost never the only safeguard: software is much less reliable and much easier to subvert, intentionally or accidentally. The fuse is impossible to bypass in software, short of accessing an external agent who would attach a piece of thick wire in parallel with the fuse. Is this what you mean by “you can say how to do it in software”?
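For concreteness, the software safeguard described above amounts to something like the following sketch; read_current_amps() and shut_down() are hypothetical stand-ins for whatever sensor and power-control interface the system actually exposes.

```c
/* Sketch of the software-side over-current safeguard: poll the digital
 * amp meter and shut the system down if the reading exceeds the limit.
 * Everything a bug, a crash, or a sufficiently clever program can touch
 * lives in this loop, which is why the fuse stays in the circuit too. */
#include <stdio.h>
#include <unistd.h>

#define CURRENT_LIMIT_AMPS 5.0   /* illustrative threshold */

/* hypothetical: read the amp meter over whatever bus it sits on */
static double read_current_amps(void) { return 0.0; }

/* hypothetical: cut power / halt the system */
static void shut_down(void) { puts("over-current: shutting down"); }

int main(void) {
    for (;;) {
        if (read_current_amps() > CURRENT_LIMIT_AMPS) {
            shut_down();
            return 0;
        }
        usleep(100 * 1000);      /* poll every 100 ms */
    }
}
```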
That’s pretty much what I mean. The point is that if you don’t understand the structurally required properties well enough to describe the characteristics of a digital amp meter with a polling subroutine, saying that you’ll hardwire the digital amp meter doesn’t help very much. There’s a hardwired version which is moderately harder to subvert on the presumption of small design errors, but first you have to be able to describe what the software does. Consider also that anything which can affect the outside environment can construct copies of itself minus hardware constraints, construct an agent that reaches back in and modifies the hardware, etc. If you can’t describe how not to do this in software, ‘hardwiring’ won’t help—the rules change somewhat when you’re dealing with intelligent agents.
Now that’s an understatement!