Strong upvote; agree with most / all of what you wrote. Having said that:
Isn’t this immediately falsified by human beings? … And isn’t it a bit concerning if your alleged generalization breaks down hardest on the most relevant data point we have for trying to predict the impact of automating general intelligence?
I’m not sure how Conor would reply to this, but my models of Paul Christiano and Robin Hanson have some things to say in response. My Paul model says:
Humans were preceded on the evolutionary tree by a number of ancestors, each of which was only slightly worse along the relevant dimensions. It’s true that humans crossed something like a supercriticality threshold, which is why they managed to take over the world while e.g. the Neanderthals did not, but the underlying progress curve humans emerged from was in fact highly continuous with humanity’s evolutionary predecessors. Thus, humans do not represent a discontinuity in the relevant sense.
To this my Robin model adds:
In fact, even calling it a “supercriticality threshold” connotes too much; the actual thing that enabled humans to succeed where their ancestors did not was not their improved (individual) intelligence relative to said ancestors, but their ability to transmit discoveries from one generation to the next. This ability, “cultural evolution”, permits faster iteration on successful strategies than does the mutation-and-selection procedure employed by natural selection, and thus explains the success of early humans; but it does not permit a new-and-improved AGI to come along and obsolete humans in the blink of an eye.
Of course, I have to give my Eliezer model (who I agree with more than either of the above) a chance to reply:
Paul: It’s all well and good to look back in hindsight and note that some seemingly discontinuous outcome emerged from a continuous underlying process, but this does not weaken the point—if anything, it strengthens it. The fact that a small, continuous change to underlying genetic parameters resulted in a massive increase in fitness shows that the function mapping design space to outcome space is extremely jumpy, which means that roughly continuous progress in design space does not imply a similarly continuous rate of change in real-world impact; and the latter is what matters for AGI.
Robin: From an empirical standpoint, AlphaGo Zero is already quite a strong mark against the “cultural evolution” hypothesis. But from a more theoretical standpoint, note that (according to your own explanation!) the reason “cultural evolution” outcompetes natural selection is that the former iterates more quickly than the latter; this means it is speed of iteration that is the real underlying driver of progress. Then, if there exists a process that permits yet faster iteration, it stands to reason that that process would outcompete “cultural evolution” in precisely the same way. Thinking about “cultural evolution” gives you no evidence either way as to whether such a faster process exists, which essentially means the “cultural evolution” hypothesis tells you nothing about whether / how quickly an AGI, once created, can surpass the sum total of humanity’s ability / knowledge.
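To put toy numbers on the two points my Eliezer model just made (everything below is my own illustration; the numbers and functional forms are made up): first, the jumpiness point. A perfectly smooth change in a design parameter can still produce an enormous jump in realized impact once it crosses a threshold:

```python
# Toy model of a smooth "design" parameter mapped through a threshold.
def realized_impact(capability, threshold=1.0):
    """Impact grows modestly below the threshold and explosively above it."""
    if capability < threshold:
        return 0.1 * capability                    # incremental returns
    return 10 ** (20 * (capability - threshold))   # compounding returns kick in

for capability in [0.90, 0.95, 1.00, 1.05, 1.10, 1.15, 1.20]:
    print(f"capability={capability:.2f} -> impact={realized_impact(capability):,.3f}")

# Capability moves in equal small steps, but impact jumps by orders of
# magnitude once the threshold is crossed: continuity in design space does
# not buy you continuity in outcome space.
```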
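And second, the iteration-speed point: hold the per-iteration gain fixed and vary only how often each process gets to iterate (the rates below are likewise made up purely for illustration):

```python
# Toy model: identical per-iteration gains, different iteration rates.
GAIN_PER_ITERATION = 1.01   # same 1% improvement per step for every process
YEARS = 100

iteration_rates = {          # iterations per year -- made-up numbers
    "natural selection": 0.1,
    "cultural evolution": 10.0,
    "something faster still": 100.0,
}

for name, rate in iteration_rates.items():
    growth = GAIN_PER_ITERATION ** (rate * YEARS)
    print(f"{name:>22}: ~{growth:.3g}x growth after {YEARS} years")

# The per-step gain never changes; only the iteration rate does. Cultural
# evolution beats natural selection for exactly the same reason that a
# faster-iterating process would beat cultural evolution.
```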
Great comment; you said it better than I could.
I do want to say:
The existence of a supercriticality threshold at all already falsifies Connor’s ‘discontinuities can never happen’ model. Once the physical world allows discontinuities, you need to add some new assumption that explains why the AGI case avoids this physical feature of the territory.
And all of the options involve sticking your neck out to make at least some speculative claims about CS facts, the nature of intelligence, etc.; none of the options let you stop at boat-size comparisons. And if boat-size comparisons were your crux, it’s odd at best if you immediately discover a new theory of intelligence that lets you preserve your old conclusion about AI progress curves, the very moment your old reason for believing it goes away.
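To put a toy number on “supercriticality threshold” (my own illustration, tracking only the expected population of an idealized branching process): a two-percent difference in the underlying reproduction rate is the difference between dying out and exploding.

```python
# Toy model: expected population of an idealized branching process.
def expected_population(reproduction_rate, generations=1000, start=100.0):
    population = start
    for _ in range(generations):
        population *= reproduction_rate   # each generation multiplies by R
    return population

for r in [0.95, 0.99, 1.00, 1.01, 1.05]:
    print(f"R={r:.2f}: population after 1,000 generations ~ {expected_population(r):.3g}")

# R = 0.99 and R = 1.01 differ by two percent, yet one lineage dwindles to
# nothing while the other explodes: a smoothly varying parameter, a sharply
# discontinuous outcome.
```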