I do not necessarily disagree or agree, but I do not know which source you derive “very clearly” from. So do you have any memory which could help me locate that text?
I think controlling Earth’s destiny is only modestly harder than understanding a sentence in English.
Well said. I shall have to try to remember that tagline.
I think this provides some support for the claim, “Historically [Eliezer] very clearly thought that a major part of the problem is that AIs would not understand human concepts and preferences until after or possibly very slightly before achieving superintelligence.” At the very least, the two claims are consistent.
??? What?? It’s fine to say that this is a falsified prediction, but how does “Eliezer expected less NLP progress pre-ASI” provide support for “Eliezer thinks solving NLP is a major part of the alignment problem”?
I continue to be baffled at the way you’re doing exegesis here, happily running with extremely tenuous evidence for P while dismissing contemporary evidence for not-P, and seeming unconcerned about the fact that Eliezer and Nate apparently managed to secretly believe P for many years without ever just saying it outright, and seeming equally unconcerned about the fact that Eliezer and Nate keep saying that your interpretation of what they said is wrong. (Which I also vouch for from having worked with them for ten years, separate from the giant list of specific arguments I’ve made. Good grief.)
> At the very least, the two claims are consistent.
?? “Consistent” is very different from “supports”! Every off-topic claim by EY is “consistent” with Gallabytes’ assertion.
> ??? What?? It’s fine to say that this is a falsified prediction, but how does “Eliezer expected less NLP progress pre-ASI” provide support for “Eliezer thinks solving NLP is a major part of the alignment problem”?
ETA: First of all, the claim was “Historically [Eliezer] very clearly thought that a major part of the problem is that AIs would not understand human concepts and preferences until after or possibly very slightly before achieving superintelligence,” which is semantically different from “Eliezer thinks solving NLP is a major part of the alignment problem”.
All I said is that it provides “some support” and I hedged in the next sentence. I don’t think it totally vindicates the claim. However, I think the fact that Eliezer seems to have not expected NLP to be solved until very late might easily explain why he illustrated alignment using stories like a genie throwing your mother out of a building because you asked to get your mother away from the building. Do you really disagree?
> I continue to be baffled at the way you’re doing exegesis here, happily running with extremely tenuous evidence for P while dismissing contemporary evidence for not-P, and seeming unconcerned about the fact that Eliezer and Nate apparently managed to secretly believe X for many years without ever just saying it outright, and seeming equally unconcerned about the fact that Eliezer and Nate keep saying that your interpretation of what they said is wrong.
This was one case, and I said “some support”. The evidence in my post was quite a bit stronger IMO. Basically all the statements I made about how MIRI thought value specification would be both hard and an important part of alignment are supported by straightforward quotations. The real debate mostly seems to come down to whether by “value specification” MIRI people were including problems of inner alignment, which seems implausible to me, and at least ambiguous even under very charitable interpretations.
By contrast, you, Eliezer, and Nate all flagrantly misinterpreted me as saying that MIRI people thought that AI wouldn’t understand human values even though I explicitly and very clearly said otherwise in the post more than once. I see these as larger errors than me misinterpreting Eliezer in this narrow case.