My point was that when we expand on “black box metaphilosophical AI”, it seems to become much less mysterious than the whole problem: we only need to solve decision theory and powerful optimization, and maybe (wait for) WBE. If we can pack a morality/philosophy research team into the goal definition, solving the friendliness part can be deferred almost completely to after the current risks are eliminated, at which point the team will have a large amount of time to work on it.
(I agree that building smarter humans is a potentially workable point of intervention. This needs a champion to at least outline the argument, but actually making this happen will be much harder.)
My point was that when we expand on “black box metaphilosophical AI”, it seems to become much less mysterious than the whole problem: we only need to solve decision theory and powerful optimization, and maybe (wait for) WBE.
I think I understand the basic motivation for pursuing this approach, but what’s your response to the point I made in the post, that such an AI has to achieve superhuman levels of optimizing power, in order to acquire enough computing power to run the WBE, before it can start producing philosophical solutions, and therefore there’s no way for us to safely test it to make sure that the “black box” would produce sane answers as implemented? It’s hard for me to see how we can get something this complicated right on the first try.
The black box is made of humans and might be tested the usual way when (human-designed) WBE tech is developed. The problem of designing its (long-term) social organisation might also be deferred to the box. The point of the box is that it can be made safe from external catastrophic risks, not that it represents any new progress towards FAI.
The AI doesn’t produce philosophical answers, the box does, and the box doesn’t contain novel/dangerous things like AIs. This only requires solving two separate problems: getting the AI to care about evaluating a program, and preparing a program that contains people who would solve the remaining problems (and this second part doesn’t involve AI). The AI itself is something that can potentially be understood completely in theory, and it can be very carefully tested under controlled conditions, to see that it does correctly evaluate simpler black boxes that we also understand. Getting decision theory wrong seems like a more elusive risk.
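To make the testing idea a bit more concrete, here is a minimal toy sketch (every name and detail below is a hypothetical illustration, nothing like a real goal definition or a WBE-containing box): an agent that treats a sealed scoring program as its entire goal, plus a harness that plugs in trivial, fully-understood “boxes” and checks that the agent optimizes exactly what each box says.

```python
# Toy sketch only: an agent whose goal is defined entirely by the output of a
# sealed "goal program", and a test harness that substitutes simple,
# fully-understood goal programs to check the agent really defers to the box.
# All names here are hypothetical illustrations.

from typing import Callable, Iterable

GoalProgram = Callable[[str], float]  # maps a candidate outcome to a score


def choose_outcome(goal_program: GoalProgram, options: Iterable[str]) -> str:
    """Pick the option the sealed goal program rates highest.

    The agent never inspects or modifies the program; it only queries it.
    """
    return max(options, key=goal_program)


# --- Test harness: simple "black boxes" whose correct answers we know ---

def prefers_longest(outcome: str) -> float:
    return float(len(outcome))


def prefers_alphabetical(outcome: str) -> float:
    return float(-ord(outcome[0]))


def test_agent_defers_to_box() -> None:
    options = ["zebra", "hippopotamus", "ant"]
    assert choose_outcome(prefers_longest, options) == "hippopotamus"
    assert choose_outcome(prefers_alphabetical, options) == "ant"
    print("agent optimizes whatever the current box scores highest")


if __name__ == "__main__":
    test_agent_defers_to_box()
```

The real difficulty is of course in the box itself and in the optimization, not in this interface; the sketch only illustrates what “test it on simpler black boxes that we also understand” could look like.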
The black box is made of humans and might be tested the usual way when (human-designed) WBE tech is developed.
Ok, I think I misunderstood you earlier, and thought that your idea was similar to Paul Christiano’s, where the FAI would essentially develop the WBE tech instead of us. I had also suggested waiting for WBE tech before building FAI (although due to a somewhat different motivation), and in response someone (maybe Carl Shulman?) argued that brain-inspired AGI or low-fidelity brain emulations would likely be developed before high-fidelity brain emulations, which means the FAI would probably come too late if it waited for WBE. This seems fairly convincing to me.
Waiting for WBE is risky in many ways, but I don’t see a potentially realistic plan that doesn’t go through it, even if we have (somewhat) smarter humans. This path (and many variations, such as a WBE superorg just taking over “manually” and not leaving anyone else with access to the physical world) I can vaguely see working, solving the security/coordination problem, if all goes right; other paths seem much more speculative to me (though many are worth trying, given resources; if it were somehow possible to do reliably, AI-initiated WBE when there is no human-developed WBE would be safer).
(I agree that building smarter humans is a potentially workable point of intervention. This needs a champion to at least outline the argument, but actually making this happen will be much harder.)
It seems fairly clear to me that we will have the ability to “build smarter humans” within the next few decades, just because there are so many different possible research paths that could get us to that goal, all of which look promising.
There’s starting to be some good research done right now on which genes correlate with intelligence. It looks like a very complicated subject, with thousands of genes contributing; nonetheless, that research would be enough to make it possible to do pre-implantation genetic screening to select “smarter” babies with current-day technology, and it doesn’t put us that far from actually genetically engineering fertilized eggs before implantation, or possibly even doing genetic therapies in adults (although, of course, that’s inherently dodgier and is likely to have a smaller effect).
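To illustrate why “thousands of genes” isn’t a blocker for screening: selection only needs a noisy predictor of the aggregate genetic score, not an understanding of any individual gene. A rough simulation sketch (every parameter below is an assumption chosen for illustration, not an estimate from the literature):

```python
# Back-of-the-envelope simulation: even for a massively polygenic trait, a
# noisy predictor of the aggregate genetic score lets you gain something by
# picking the best-scoring embryo. Parameters are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

N_ROUNDS = 100_000   # simulated selection rounds
N_EMBRYOS = 10       # embryos screened per round (assumed)
R2 = 0.05            # fraction of trait variance the predictor captures (assumed)
TRAIT_SD = 15.0      # IQ-style scale (assumed)

r = np.sqrt(R2)

# Standardized true genetic trait values and noisy polygenic predictions,
# correlated with coefficient r.
true_trait = rng.standard_normal((N_ROUNDS, N_EMBRYOS))
predictor = r * true_trait + np.sqrt(1 - r**2) * rng.standard_normal((N_ROUNDS, N_EMBRYOS))

# Select the embryo with the highest predicted score in each round.
selected = true_trait[np.arange(N_ROUNDS), predictor.argmax(axis=1)]

gain = selected.mean() * TRAIT_SD
print(f"average gain from picking best-of-{N_EMBRYOS}: {gain:.1f} points")
# With these assumed numbers the expected gain is a few points per generation:
# modest, but nonzero despite thousands of tiny individual gene effects.
```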
Other likely paths to IA include:
-We’re making a lot of progress on brain-computer interfaces right now, of all types.
-Brain stimulation also seems to have a lot of potential; published research has already shown it can improve people’s ability to learn math in school.
-Nootropic drugs may also have some potential, although we aren’t really throwing a lot of research in that direction right now. It is worth mentioning, though, that one possible outcome of the research on genes correlated with intelligence might be to figure out what proteins those genes code for and develop drugs that have the same effect.
-Looking at the more cybernetic side, a scientist has recently managed to create an implantable chip that can connect with the brain of a rat and both store memories and give them back to the rat directly, basically an artificial hippocampus. http://www.technologyreview.com/featuredstory/513681/memory-implants/
-The sudden focus on brain research and modeling in the US and the UK is also likely to have significant impacts.
-There are other, more futuristic possible technologies here as well (nanotech, a computer exocortex, etc.). They’re not as likely to happen in the time frame we’re talking about, though.
Anyway, unless GAI comes much sooner than I expect it to, I would expect some of the things on that list to happen before GAI. Many of them we’re already quite close to, and there are enough different paths to enhanced human intelligence that I put a low probability on all of them being dead ends. I think there’s a very good chance that we’ll develop some way to increase human intelligence before any kind of true GAI becomes possible, especially if we put more effort into research in that direction.
The real question, I think, is how much of an intelligence boost any of that is going to give us, and whether that will be enough to make the FAI problems easier to solve; I’m not sure that’s answerable at this point.