No help in developing FAI theory (decision theory and a way of pointing to human values), probably of little help in developing FAI implementation, although there might be useful methods in common.
FAI requires solving AGI + Friendliness
I don’t believe it works like that. Making a poorly understood AGI doesn’t necessarily help with implementing a FAI (even if you have the theory figured out), since a FAI is not just parameterized by its values but also defined by how correctly it interprets those values (its decision theory), which other AGI designs won’t have by default.
Indeed. On the F front, for example, computational models of human ethical reasoning seem like something that could increase the safety of all kinds of AGI projects and also be useful for Friendliness theory in general, and some of them could conceivably be developed in the context of heuristic AGI. Likewise, on the AGI front, there should be plenty of machine learning techniques and advances in probability theory (for example) that would be useful for pretty much any kind of AGI. After all, we already know that an understanding of e.g. Bayes’ theorem and expected utility will be necessary for pretty much any AGI implementation, so why assume that all of the insights useful across many kinds of designs have already been discovered?
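(To make the “generic machinery” point concrete, here is a minimal sketch of Bayesian updating followed by expected-utility action selection; the hypotheses, actions, and numbers are all hypothetical, chosen only for illustration.)

```python
# Minimal sketch: Bayesian updating + expected-utility choice.
# All hypotheses, actions, and probabilities are made up for illustration.

prior = {"h1": 0.5, "h2": 0.5}        # P(h)
likelihood = {"h1": 0.8, "h2": 0.2}   # P(evidence | h)

# Bayes' theorem: P(h | e) = P(e | h) P(h) / sum_h' P(e | h') P(h')
evidence_prob = sum(likelihood[h] * prior[h] for h in prior)
posterior = {h: likelihood[h] * prior[h] / evidence_prob for h in prior}

# Utility of each action under each hypothesis (arbitrary numbers).
utility = {"act_a": {"h1": 10, "h2": -5},
           "act_b": {"h1": 2,  "h2": 3}}

# Expected utility: EU(a) = sum_h P(h | e) * U(a, h); pick the maximizer.
expected_utility = {a: sum(posterior[h] * utility[a][h] for h in posterior)
                    for a in utility}
best_action = max(expected_utility, key=expected_utility.get)
print(posterior, expected_utility, best_action)
```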
Making a poorly understood AGI doesn’t necessarily help with implementing a FAI (even if you have the theory figured out)
Right, by the above I meant to say “the right kind of AGI + Friendliness”; I certainly agree that there are many conceivable ways of building AGIs that would be impossible to ever make Friendly.