The problem with deep neural networks is not that they lack theoretical foundations. It’s that most of the people going “WOW SO COOL” at deep neural networks can’t be bothered to understand the theoretical foundations. The “deep learning cabal” of researchers (out of Toronto, IIRC), and the Switzerland Cabal of Schmidhuber-Hutter-and-Legg fame, all know damn well what they are doing on an analytical level.
This isn’t really a problem, because—as you point out—the formidable researchers all “know damn well what they are doing on an analytical level”.
Thus the argument that there are people using DL without understanding it—and moreover that this is dangerous—is specious and weak because these people are not the ones actually likely to develop AGI let alone superintelligence.
Why not test safety long before the system is superintelligent?
Because that requires a way to state and demonstrate safety properties such that safety guarantees obtained with small amounts of resources remain strong when the system gets more resources. More on that below.
Ah—the use of “guarantees” reveals the viewpoint problem. Instead of thinking of ‘safety’ or ‘alignment’ as some absolute binary property we can guarantee, it is more profitable to think of a complex distribution over the relative amounts of ‘safety’ or ‘alignment’ in an AI population (and any realistic AI project will necessarily involve a population due to scaling constraints). Strong guarantees may be impossible, but we can at least influence or steer the distribution by selecting for agent types that are more safe/altruistic. We can develop a scaling theory of whether, how, and when these desirable properties change as agents grow in capability.
In other words—these issues are so incredibly complex that we can’t really develop any good theory without a lot of experimental data to back it up.
Also—I should point out that one likely result of ANN-based AGI is the creation of partial uploads through imitation learning and inverse reinforcement learning—agents which are intentionally close in mindspace to their human ‘parent’ or ‘model’.
Thus the argument that there are people using DL without understanding it—and moreover that this is dangerous—is specious and weak because these people are not the ones actually likely to develop AGI let alone superintelligence.
Yes, but I don’t think that’s an argument anyone has actually made. Nobody, to my knowledge, sincerely believes that we are right around the corner from superintelligent, self-improving AGI built out of deep neural networks, such that any old machine-learning professor experimenting with how to get a lower error rate in classification tasks is going to suddenly get the Earth covered in paper-clips.
Actually, no, I can think of one person who believed that: a radically underinformed layperson on reddit who, for some strange reason, believed that LessWrong is the only site with people doing “real AI” and that “[machine-learning researchers] build optimizers! They’ll destroy us all!”
Hopefully he was messing with me. Nobody else has ever made such ridiculous claims.
Sorry, wait, I’m forgetting to count sensationalistic journalists as people again. But that’s normal.
Instead of thinking of ‘safety’ or ‘alignment’ as some absolute binary property we can guarantee, it is more profitable to think of a complex distribution over the relative amounts of ‘safety’ or ‘alignment’ in an AI population
No, “guarantees” in this context meant PAC-style guarantees: “We guarantee that with probability at least 1-\delta, the hypothesis the system learns from its sample data will only ‘go wrong’ on at most an \epsilon fraction of cases.” You then plug in the epsilon and delta you want and solve for how much sample data you need to feed the learner. The links to intro PAC lectures given to you in the other comment were quite good, by the way, although I do recommend taking a rigorous introductory machine learning class (new-grad-student level should be enough to inflict the PAC foundations on you).
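To make the shape of such a guarantee concrete, here is a minimal sketch (my own illustration, not anything from the thread) of the standard sample-complexity bound for a consistent learner over a finite hypothesis class, m ≥ (ln|H| + ln(1/δ)) / ε; the function name and the example numbers are hypothetical.

```python
import math

def pac_sample_size(hypothesis_count, epsilon, delta):
    # Realizable-case PAC bound for a finite hypothesis class: with probability
    # at least 1 - delta, any hypothesis consistent with this many i.i.d. samples
    # has true error at most epsilon.
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)

# Example: conjunctions over 20 Boolean variables, roughly |H| = 3**20 hypotheses.
print(pac_sample_size(3 ** 20, epsilon=0.01, delta=0.001))  # ~2,900 labeled examples
```

Note that the bound grows only logarithmically in 1/delta but linearly in 1/epsilon, so the error tolerance you demand, not the confidence level, is what dominates the data requirements.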
we can at least influence or steer the distribution by selecting for agent types that are more safe/altruistic
“Altruistic” is already a social behavior, requiring the agent to have a theory of mind and to care about the minds it believes it observes in its environment. It also assumes that we can build in some way for the agent to learn what the hypothesized minds want, learn how they (i.e., human beings) think, and separate the map (of other minds) from the territory (of actual people).
Note that “don’t disturb this system over there (e.g., a human being) because you need to receive data from it untainted by your own causal intervention in any way” is a constraint that at least I, personally, do not know how to state in computational terms.
I think you are overhyping the PAC model. It surely is an important foundation for probabilistic guarantees in machine learning, but there are some serious limitations when you want to use it to constrain something like an AGI:
1. It only deals with supervised learning.
2. Simple things like finite automata are not efficiently learnable, but in practice it seems like humans pick them up fairly easily.
3. It doesn’t deal with temporal aspects of learning.
However, there are some modifications of the PAC model that can ameliorate these problems, like learning with membership queries (which addresses item 2).
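For concreteness, the classic result here is Angluin’s L* algorithm, which learns finite automata exactly from membership and equivalence queries. As a much smaller toy of the same query-learning idea (my own sketch, with a hypothetical oracle, and conjunctions rather than automata), here is how membership queries let a learner pin down a monotone conjunction exactly with n+1 queries:

```python
def learn_monotone_conjunction(n, membership_query):
    # The target is an AND over some unknown subset of the n Boolean variables.
    # The all-ones input is accepted; flipping a single bit to 0 changes the
    # label exactly when that variable appears in the target conjunction.
    all_ones = [1] * n
    assert membership_query(all_ones), "a monotone conjunction accepts all-ones"
    relevant = []
    for i in range(n):
        probe = list(all_ones)
        probe[i] = 0
        if not membership_query(probe):
            relevant.append(i)
    return relevant

# Hypothetical oracle standing in for whatever process answers membership queries;
# the hidden target here is x0 AND x3 over 5 variables.
target = {0, 3}
oracle = lambda x: all(x[i] == 1 for i in target)
print(learn_monotone_conjunction(5, oracle))  # -> [0, 3]
```

The point is only to show the interface: an active query to the system being learned can replace a great deal of passive sampling, which is what the automaton results exploit.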
It’s also perhaps a bit optimistic to say that PAC-style bounds on a possibly very complex system like an AGI would be “quite doable”. We don’t even know, for example, whether DNF formulas are learnable in polynomial time under the distribution-free assumption.
I would definitely call it an open research problem to provide PAC-style bounds for more complicated hypothesis spaces and learning settings. But that doesn’t mean it’s impossible or un-doable, just that it’s an open research problem. I want a limitary theorem proved before I go calling things impossible.