My best attempt at imagining hardwiring is a layer not accessible to introspection: involuntary muscle control in humans, or instinctively jerking your hand away from something hot, which serves, in a sense, as a fail-safe against stupid conscious decisions. Or a watchdog restarting a stuck program on your phone, no matter how badly the software has messed itself up, and so on. Whether this approach can be used to prevent a tool AI from spontaneously agentizing, I am not sure.
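To make the watchdog analogy concrete, here is a rough sketch of the pattern I have in mind; all the names (worker.py, the heartbeat file, the timeouts) are hypothetical placeholders, not any particular phone's implementation:

```python
import os
import subprocess
import time

# Rough sketch of the watchdog analogy (all names hypothetical): a supervisor
# that restarts a worker whenever it exits or stops updating its heartbeat,
# regardless of what the worker's own software has done to itself.
WORKER_CMD = ["python3", "worker.py"]     # hypothetical worker program
HEARTBEAT_FILE = "/tmp/worker.heartbeat"  # worker touches this periodically
TIMEOUT_SECONDS = 30
POLL_SECONDS = 5

def heartbeat_age() -> float:
    try:
        return time.time() - os.path.getmtime(HEARTBEAT_FILE)
    except OSError:
        return float("inf")  # missing heartbeat counts as stuck

def run_watchdog() -> None:
    worker = subprocess.Popen(WORKER_CMD)
    while True:
        time.sleep(POLL_SECONDS)
        if worker.poll() is not None or heartbeat_age() > TIMEOUT_SECONDS:
            worker.kill()   # the watchdog does not negotiate; it just restarts
            worker.wait()
            worker = subprocess.Popen(WORKER_CMD)

if __name__ == "__main__":
    run_watchdog()
```

The point of the pattern is only that the supervisor sits outside the thing it supervises, so the stuck program cannot talk it out of a restart.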
If you can say how to do this in hardware, you can say how to do it in software. The hardware version might arguably be more secure against flaws in the design, but if you can say how to do it at all, you can say how to do it in software.
Maybe I don’t understand what you mean by hardware.
For example, you can have a fuse that unconditionally blows when excess power is consumed. This is hardware. You can also have a digital amp meter readable by software, with a polling subroutine which shuts down the system if the current exceeds a certain limit. There is a good reason that such a software solution, while often implemented, is almost never the only safeguard: software is much less reliable and much easier to subvert, intentionally or accidentally. The fuse is impossible to bypass in software, short of accessing an external agent who would attach a piece of thick wire in parallel with the fuse. Is this what you mean by “you can say how to do it in software”?
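For concreteness, the software safeguard I mean would look something like the sketch below; the limit, the polling interval, and the two stub functions are hypothetical stand-ins for whatever meter driver and shutdown hook the real system provides:

```python
import time

CURRENT_LIMIT_AMPS = 10.0       # hypothetical limit
POLL_INTERVAL_SECONDS = 0.1

def read_current_amps() -> float:
    # Hypothetical driver call; a real implementation would query the meter.
    return 0.0

def shut_down_system() -> None:
    # Hypothetical hook; a real implementation would cut power to the load.
    print("over-current detected, shutting down")

def polling_safeguard() -> None:
    # Software analogue of the fuse: keep sampling the meter and shut
    # everything down if the current exceeds the limit. Unlike the fuse,
    # this loop can be starved, crashed, or patched out by other software.
    while True:
        if read_current_amps() > CURRENT_LIMIT_AMPS:
            shut_down_system()
            break
        time.sleep(POLL_INTERVAL_SECONDS)
```

Everything in that loop lives at the mercy of the rest of the software stack, which is exactly why the fuse is kept as the last line of defense.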
That’s pretty much what I mean. The point is that if you don’t understand the structurally required properties well enough to describe the characteristics of a digital amp meter with a polling subroutine, saying that you’ll hardwire the digital amp meter doesn’t help very much. There’s a hardwired version which is moderately harder to subvert on the presumption of small design errors, but first you have to be able to describe what the software does. Consider also that anything which can affect the outside environment can construct copies of itself minus the hardware constraints, construct an agent that reaches back in and modifies the hardware, etc. If you can’t describe how not to do this in software, ‘hardwiring’ won’t help—the rules change somewhat when you’re dealing with intelligent agents.
Now that’s an understatement!