I feel pretty frustrated at how rarely people actually bet or make quantitative predictions about existential risk from AI. E.g., my recent attempt to operationalize a bet with Nate went nowhere. Paul trying to get Eliezer to bet during the MIRI dialogues also went nowhere, or barely anywhere—I think they ended up making some random bet about how long an IMO challenge would take to be solved by AI. (Feels pretty weak and unrelated to me. Lame. But huge props to Paul for being so ready to bet; that made me take him a lot more seriously.)
This paragraph doesn’t seem like an honest summary to me. Eliezer’s position in the dialogue, as I understood it, was:
The journey is a lot harder to predict than the destination. Cf. “it’s easier to use physics arguments to predict that humans will one day send a probe to the Moon, than it is to predict when this will happen or what the specific capabilities of rockets five years from now will be”. Eliezer isn’t claiming to have secret insights about the detailed year-to-year or month-to-month changes in the field; if he thought that, he’d have been making those near-term tech predictions already back in 2010, 2015, or 2020 to show that he has this skill.
From Eliezer’s perspective, Paul is claiming to know a lot about the future trajectory of AI, and not just about the endpoints: Paul thinks progress will be relatively smooth and continuous, and thinks it will get increasingly smooth and continuous as time passes and more resources flow into the field. Eliezer, by contrast, expects the field to get choppier as time passes and we get closer to ASI.
A way to bet on this, which Eliezer repeatedly proposed but wasn’t able to get Paul to do very much, would be for Paul to list out a bunch of concrete predictions that Paul sees as “yep, this is what smooth and continuous progress looks like”. Then, even though Eliezer doesn’t necessarily have a concrete “nope, the future will go like X instead of Y” prediction, he’d be willing to bet against a portfolio of Paul-predictions: when you expect the future to be more unpredictable, you’re willing to at least weakly bet against any sufficiently ambitious pool of concrete predictions.
(Also, if Paul generated a ton of predictions like that, an occasional prediction might indeed make Eliezer go “oh wait, I do have a strong prediction on that question in particular; I didn’t realize this was one of our points of disagreement”. I don’t think this is where most of the action is, but it’s at least a nice side-effect of the person-who-thinks-this-tech-is-way-more-predictable spelling out predictions.)
Eliezer was also more interested in trying to reach mutual understanding of the views on offer, as opposed to "let's bet on things immediately, never mind the world-views." But insofar as Paul really wanted to have the bets conversation instead, Eliezer sank an awful lot of time into trying to find operationalizations he and Paul could bet on, over many hours of conversation.
If your end-point take-away from that (even after actual bets were in fact made, and tons of different high-level predictions were sketched out) is “wow how dare Eliezer be so unwilling to make bets on anything”, then I feel a lot less hope that world-models like Eliezer’s (“long-term outcome is more predictable than the detailed year-by-year tech pathway”) are going to be given a remotely fair hearing.
(Also, in fairness to Paul, I’d say that he spent a bunch of time working with Eliezer to try to understand the basic methodologies and foundations for their perspectives on the world. I think both Eliezer and Paul did an admirable job going back and forth between the thing Paul wanted to focus on and the thing Eliezer wanted to focus on, letting us look at a bunch of different parts of the elephant. And I don’t think it was unhelpful for Paul to try to identify operationalizations and bets, as part of the larger discussion; I just disagree with TurnTrout’s summary of what happened.)
Your comment's points seem like further evidence for my position. That said, your comment appears to serve the function of complicating the conversation, and that happens to have the consequence of diluting the impact of my point. I do not allege that you are doing so on purpose, but I think it's important to notice. I would have been more convinced by a reply of "no, you're wrong, here's the concrete bet(s) EY made or was willing to make but Paul balked."
I will here repeat a quote[1] which seems relevant:
[Christiano][12:29]
my desire to bet about “whatever you want” was driven in significant part by frustration with Eliezer repeatedly saying things like “people like Paul get surprised by reality” and me thinking that’s nonsense
The journey is a lot harder to predict than the destination. Cf. “it’s easier to use physics arguments to predict that humans will one day send a probe to the Moon, than it is to predict when this will happen or what the specific capabilities of rockets five years from now will be”. Eliezer isn’t claiming to have secret insights about the detailed year-to-year or month-to-month changes in the field; if he thought that, he’d have been making those near-term tech predictions already back in 2010, 2015, or 2020 to show that he has this skill.
First of all, I disagree with the first claim and am irritated that you stated it as a fact instead of saying “I think that...”. My overall take-away from this paragraph, as pertaining to my point, is that you’re pointing out that Eliezer doesn’t make predictions because he can’t / doesn’t have epistemic alpha. That accords with my point of “EY was unwilling to bet.”
From Eliezer’s perspective, Paul is claiming to know a lot about the future trajectory of AI, and not just about the endpoints: Paul thinks progress will be relatively smooth and continuous, and thinks it will get increasingly smooth and continuous as time passes and more resources flow into the field. Eliezer, by contrast, expects the field to get choppier as time passes and we get closer to ASI.
My takeaway, as it relates to my quoted point: Either Eliezer’s view makes no near-term falsifiable predictions which differ from the obvious ones, or it makes only meta-predictions which are hard to bet on. That sounds to my ears like his models of alignment don’t actually constrain his moment-to-moment anticipations, in contrast to my own, which once destroyed my belief in shard theory on a dime (until I realized I’d flipped the data, and undid the update). This perception that “the emperor has no constrained anticipations” is a large part of what I am criticizing.
A way to bet on this, which Eliezer repeatedly proposed but wasn’t able to get Paul to do very much, would be for Paul to list out a bunch of concrete predictions that Paul sees as “yep, this is what smooth and continuous progress looks like”. Then, even though Eliezer doesn’t necessarily have a concrete “nope, the future will go like X instead of Y” prediction, he’d be willing to bet against a portfolio of Paul-predictions: when you expect the future to be more unpredictable, you’re willing to at least weakly bet against any sufficiently ambitious pool of concrete predictions.
So Eliezer offered Paul the opportunity for Paul to unilaterally stick his neck out on a range of concrete predictions, so that Eliezer could judge Paul’s overall predictive performance against some unknown and subjective baseline which Eliezer has in his head, or perhaps against some group of “control” predictors? That sounds like the opposite of “willing to make concrete predictions” and feeds into my point about Paul not being able to get Eliezer to bet.
Edit: If there were a more formal proposal which actually cashes out into resolution criteria and Brier score updates for both of them, then I’d be happier with EY’s stance but still largely unmoved; see my previous comment above about the emperor.
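For concreteness, the kind of scoring I have in mind is cheap to compute once both parties commit to credences and resolution criteria. A minimal sketch (the forecast numbers below are hypothetical, not anyone's actual predictions):

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between probabilistic forecasts and binary
    outcomes (1 = resolved true, 0 = resolved false). Lower is better;
    always guessing 50% scores exactly 0.25."""
    assert len(forecasts) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical portfolio of five resolved predictions.
confident = brier_score([0.9, 0.8, 0.7, 0.95, 0.6], [1, 1, 1, 1, 0])
baseline = brier_score([0.5] * 5, [1, 1, 1, 1, 0])
print(confident)  # 0.1005, which beats the 0.25 coin-flip baseline
```

Both parties scoring their stated credences against the same resolved question list is the symmetric, falsifiable commitment I would give points for, rather than one side judging the other against an unstated baseline.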
Eliezer was also more interested in trying to reach mutual understanding of the views on offer, as opposed to "let's bet on things immediately, never mind the world-views." But insofar as Paul really wanted to have the bets conversation instead, Eliezer sank an awful lot of time into trying to find operationalizations he and Paul could bet on, over many hours of conversation.
This paragraph appears to make two points. First, Eliezer was less interested in betting than in having long dialogues. I agree. Second, Eliezer spent a lot of time at least appearing as if he were trying to bet. I agree with that as well. But I don’t give points for “trying” here.
Giving points for “trying” is in practice “giving points for appearing to try”, as is evident from the literature on specification gaming. Giving points for “appeared to try” opens up the community to invasion by bad actors who Gish-gallop their interlocutors into giving up the conversation. Prediction is what counts.
world-models like Eliezer’s (“long-term outcome is more predictable than the detailed year-by-year tech pathway”)
Nitpick, but that’s not a “world-model.” That’s a prediction.
even after actual bets were in fact made, and tons of different high-level predictions were sketched out
Why write this without citing? Please cite and show me the credences and the resolution conditions.
If anyone entering this thread wishes to read the original dialogue for themselves, please see section 10.3 of https://www.lesswrong.com/posts/fS7Zdj2e2xMqE6qja/more-christiano-cotra-and-yudkowsky-on-ai-progress