The “strongest” foot I could put forwards is my response to “On current AI not being self-improving:”, where I’m pretty sure you’re just wrong.
However, I’d be most interested in hearing your response to the parts of this post that are about analogies to evolution, and why they’re not that informative for alignment, which start at:
Yudkowsky argues that we can’t point an AI’s learned cognitive faculties in any particular direction because the “hill-climbing paradigm” is incapable of meaningfully interfacing with the inner values of the intelligences it creates.
and end at:
Yudkowsky tries to predict the inner goals of a GPT-like model.
That said, the discussion of evolution is much longer than the discussion of self-improvement in current AIs, so look at whichever you feel you have time for.
The “strongest” foot I could put forwards is my response to “On current AI not being self-improving:”, where I’m pretty sure you’re just wrong.
You straightforwardly completely misunderstood what I was trying to say on the Bankless podcast: I was saying that GPT-4 does not get smarter each time an instance of it is run in inference mode. And that’s that, I guess.
I’ll admit it straight-up did not occur to me that you could possibly be analogizing between a human’s lifelong, online learning process and a single inference run of an already trained model. Those are just completely different things in my ontology.
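To make that distinction concrete, here is a minimal sketch of the two regimes in PyTorch. The toy linear model, random data, and SGD optimizer below are purely illustrative stand-ins I’m assuming for the example, not anything from GPT-4: in inference mode the weights are frozen and the model is exactly as capable afterward as before, whereas a training step (the analogue of ongoing, online learning) actually updates the weights.

```python
# Sketch: inference on a frozen model vs. a training step that updates weights.
# The model, data, and optimizer are toy stand-ins for illustration only.
import torch
import torch.nn as nn

model = nn.Linear(4, 1)   # stand-in for an already-trained model
x = torch.randn(8, 4)     # stand-in batch of inputs
y = torch.randn(8, 1)     # stand-in training targets

before = model.weight.detach().clone()

# Inference: no gradients, no optimizer, weights untouched.
with torch.no_grad():
    _ = model(x)
assert torch.equal(model.weight, before)      # nothing about the model changed

# Training: a gradient step actually changes the weights.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()
assert not torch.equal(model.weight, before)  # the weights were updated
```

The same distinction applies at GPT-4 scale: an inference pass produces outputs but leaves the parameters untouched, which is the sense in which a single run doesn’t make the model any smarter.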
Anyways, thank you for your response. I actually do think it helped clarify your perspective for me.
Edit: I have now included Yudkowsky’s correction of his intent in the post, as well as an explanation of why I think his corrected argument is still wrong.