This needs more work; my nitpick detector goes off on every other sentence. If you’re willing to revise heavily, I’ll compile a more detailed list (or revise some sections myself).
Examples (starting from the end, skipping some):
“but that stable eventual goal may be very difficult to predict in advance”—you don’t predict goals, you make them a certain way.
“We must also figure out how to build a general intelligence that satisfies a goal at all, and that stably retains that goal as it edits its own code to make itself smarter. This task is perhaps the primary difficulty in designing friendly AI.”—last sentence unwarranted.
“Eliezer Yudkowsky has proposed[57] Coherent Extrapolated Volition as a solution to two problems facing Friendly AI design:”—not just these two, bad wording
“We have already seen how simple rule-based and utilitarian designs for Friendly AI will fail. ”—should be a link/reference, a FAQ can be entered at any question.
“The second problem is that a superintelligence may generalize the wrong principles due to coincidental patterns in the training data”—a textbook error in machine learning methodology is a bad match for a fundamental problem, unless argued as being such in this particular case.
“But even if humans could be made to agree on all the training cases, two problems remain.”—just two? Bad wording.
“The first problem is that training on cases from our present reality may not result in a machine that will make correct ethical decisions in a world radically reshaped by superintelligence.”—the same can be said of humans (correctly, but as a result it doesn’t work as a simple distinguishing argument).
“Let’s consider the likely consequences of some utilitarian designs for Friendly AI.”—“utilitarian”: a potentially new term without any introduction, even with a link, is better to be avoided.
“An AI designed to minimize human suffering would simply kill all humans”—could/might would be better.
“caters to the complex and demanding wants of humanity”—this statement is repeated about 5 times in close forms, should change the wording somehow.
“by wiring humans into Nozick’s experience machines. ”—an even more opaque term without explanation.
“Either option would be easier for the AI to achieve than maintaining a utopian society catering to the complexity of human (and animal) desires.”—not actually clear (from my point of view, not simulated naive point of view). The notion of “default route” in foreign minds can be quite strange, and you don’t need much complexity in generating principle for a fractal to appear diverse. (There are clearly third alternatives that shelve both considered options, which also makes the comparison not terribly well-defined.)
“It’s not just a problem of specifying goals, either. It is hard to predict how goals will change in a self-modifying agent. No current mathematical decision theory can predict the decisions of a self-modifying agent.”—again, these things are there to be decided upon, not “predicted”
etc.
but that stable eventual goal may be very difficult to predict in advance
No, the point of that section is that there are many AI designs in which we can’t explicitly make goals.
This task is perhaps the primary difficulty in designing friendly AI.
Some at SIAI disagree. I’ve already qualified with ‘perhaps’.
not just these two, bad wording
Fixed.
should be a link/reference, a FAQ can be entered at any question
Alas, I think no such documents exist. But luckily, the sentence is unneeded.
a textbook error in machine learning methodology is a bad match for a fundamental problem, unless argued as being such in this particular case
I disagree. A textbook error in machine learning that has not yet been solved is a good match for a fundamental problem.
just two? Bad wording.
Fixed.
the same can be said of humans (correctly, but as a result it doesn’t work as a simple distinguishing argument)
Again, I’m not claiming that these aren’t also problems elsewhere.
“utilitarian”: a potentially new term without any introduction, even with a link, is better to be avoided
Maybe. If you can come up with a concise way to get around it, I’m all ears.
could/might would be better
Agreed.
this statement is repeated about 5 times in close forms, should change the wording somehow
Why? I’ve already varied the wording, and the point of a FAQ with link anchors is that not everybody will read the whole FAQ from start to finish. I repeat the phrase ‘machine superintelligence’ in variations a lot, too.
an even more opaque term without explanation
Hence, the link, for people who don’t know.
not actually clear (from my point of view, not simulated naive point of view)
Changed to ‘might’.
again, these things are there to be decided upon, not “predicted”
Fixed.
Thanks for your comments. As you can see I am revising, so please do continue!
No, the point of that section is that there are many AI designs in which we can’t explicitly make goals.
I know, but you use the word “predict”, which is what I was pointing out.
I disagree. A textbook error in machine learning that has not yet been solved is a good match for a fundamental problem.
What do you mean, “has not yet been solved”? This kind of error is routinely being solved in practice, which is why it’s a textbook example.
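To make concrete what “routinely being solved in practice” means here, a minimal sketch with synthetic data (everything below, names and numbers included, is made up for illustration and is not from the FAQ): a model latches onto a coincidental feature that happens to track the labels in its training slice, and an ordinary held-out split exposes the mistake.

    import numpy as np

    rng = np.random.default_rng(0)

    n = 200
    signal = rng.normal(size=n)        # the feature that genuinely determines the label
    coincidence = rng.normal(size=n)   # an irrelevant feature
    labels = (signal > 0).astype(float)

    # In the training slice, the irrelevant feature coincidentally tracks the label;
    # in the held-out slice it is just noise.
    train, test = np.arange(100), np.arange(100, 200)
    coincidence[train] = labels[train] + 0.1 * rng.normal(size=100)

    X = np.column_stack([signal, coincidence])

    def fit(X, y):
        # ordinary least squares as a stand-in for "a model"
        A = np.c_[X, np.ones(len(X))]
        w, *_ = np.linalg.lstsq(A, y, rcond=None)
        return w

    def accuracy(w, X, y):
        preds = np.c_[X, np.ones(len(X))] @ w > 0.5
        return float((preds == y.astype(bool)).mean())

    w = fit(X[train], labels[train])
    print("train accuracy:   ", accuracy(w, X[train], labels[train]))   # near 1.0
    print("held-out accuracy:", accuracy(w, X[test], labels[test]))     # markedly lower
    # The gap is the "textbook error" being caught: standard validation practice
    # detects that the learned rule relied on a coincidental pattern.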
Again, I’m not claiming that these aren’t also problems elsewhere.
Yes, but that makes it a bad illustration.
Why? I’ve already varied the wording, and the point of a FAQ with link anchors is that not everybody will read the whole FAQ from start to finish.
Because it’s bad prose; it sounds unnatural (YMMV).
Hence, the link, for people who don’t know.
This doesn’t address my argument. I know there is a link and I know that people could click on it, so that’s not what I meant.
(More later, maybe.)