Richard_Ngo comments on Optimality is the tiger, and agents are its teeth

Richard_Ngo 17 Jun 2022 0:54 UTC
LW: 8 AF: 6
−1
AF
Meta-level: +1 for actually writing a thing.
Also meta-level: −1 because when I read this I get the sense that you started from a high-level intuition and then constructed a set of elaborate explanations of your intuition, but then phrased it as an argument.
I personally find this frustrating because I keep seeing people being super confident in their high-level intuitive metaphorical view of consequentialism and then never doing the work of actually digging beneath those metaphors. (Less a criticism of this post, more a criticism of everyone upvoting this post.)
In this case, this cashes out in claims like “agency is orthogonal to optimization power” which are clearly false for any reasonable definitions of agency and optimization power, and only seem to make sense when you’re operating at at a level of abstraction that’s far too high to be useful.
- Veedrac 20 Jun 2022 23:16 UTC
  8 points
  1
  Parent
  In this case, this cashes out in claims like “agency is orthogonal to optimization power” which are clearly false for any reasonable definitions of agency and optimization power,
  Could you put this in more words? I assume we’re talking past each other somewhat.
  It’s fairly obvious that going out and touching a thing is generally important if you want to optimize it, and systems that aren’t interested in touching things will be less ready to do that, but this isn’t really what I was trying to point to, and not how I hoped the person who wrote that intended it when they said ‘optimization power’.
  I think there is a very legitimate sense in which optimizing the steps of a plan to do a thing is a separate skill and/or mental propensity to executing that plan (as in, actually sending those signals outside the computer) or wanting it executed, and in which agency is mostly a measure of the latter. So I don’t think it is ‘clearly false for any reasonable definitions of agency and optimization power’.
  Also meta-level: −1 because when I read this I get the sense that you started from a high-level intuition and then constructed a set of elaborate explanations of your intuition, but then phrased it as an argument.
  I personally find this frustrating because I keep seeing people being super confident in their high-level intuitive metaphorical view of consequentialism and then never doing the work of actually digging beneath those metaphors. (Less a criticism of this post, more a criticism of everyone upvoting this post.)
  I’m not sure what the practical difference is between criticizing a post and criticizing people that upvoted it, but to the extent that this is a criticism of the post I wish you had been more explicit about what you are objecting to.
  - Richard_Ngo 21 Jun 2022 0:40 UTC
    1 point
    Parent
    I think there is a very legitimate sense in which optimizing the steps of a plan to do a thing is a separate skill and/or mental propensity to executing that plan (as in, actually sending those signals outside the computer) or wanting it executed, and in which agency is mostly a measure of the latter.
    My main criticism is that, in general, you have to think while you’re executing plans, not just while you’re generating them. The paradigm where you plan every step in advance, and then the “agency” comes in only when executing it, is IMO a very misleading one to think in.
    (This seems related to Eliezer’s argument that there’s only a one-line difference difference between an oracle AGI and an agent AGI. Sure, that’s true in the limit. But thinking about the limit will make you very confused about realistic situations!)
    I’m not sure what the practical difference is between criticizing a post and criticizing people that upvoted it
    It’s something like: “I endorse people following the policy of writing posts like this one, it’s great when people work through their thoughts in this way. I don’t endorse people following the policy of upvoting posts like this one to this extent, because it seems likely that they’re mainly responding to high-level applause lights.”
    to the extent that this is a criticism of the post I wish you had been more explicit about what you are objecting to.
    I’m sympathetic to you wanting more explicit feedback but the fact that this post is so high-level and ungrounded is what makes it difficult for me to give that. To me it reads more like a story than an argument.
    - Veedrac 21 Jun 2022 2:01 UTC
      5 points
      Parent
      The paradigm where you plan every step in advance, and then the “agency” comes in only when executing it, is IMO a very misleading one to think in.
      This isn’t what I’m referring to and it’s not in the example in the story. Actions are generated stepwise on demand. It is the ability to generate stepwise outputs of good quality, of which actions are an instance, that is ‘optimization power’. Being able to think of good next actions conditional on past observations is, at least as I understand the terms, quite different to being an agent enacting those actions.
      (This seems related to Eliezer’s argument that there’s only a one-line difference difference between an oracle AGI and an agent AGI. Sure, that’s true in the limit. But thinking about the limit will make you very confused about realistic situations!)
      I explicitly tried to make the scenario as un-Oracle like as I could, with the system explicitly only producing outputs onscreen that I could explicitly justify being discoverable in reasonable time given the observations it had available.
      I am increasingly feeling like I just failed to communicate what I was trying to say and your criticism doesn’t bear much resemblance to what I had intended to write. I’m happy to take responsibility for not writing as well as I should have, but I’d rather you didn’t cast aspersions at my motivations about it.
      - Richard_Ngo 21 Jun 2022 4:23 UTC
        4 points
        0
        Parent
        I didn’t read the post particularly carefully, it’s totally plausible that I’m misunderstanding the key ideas you were trying to convey. I apologise for phrasing my claims in a way that made it sound like I was skeptical of your motivations; I’m not, and I’m glad you wrote this up.
        I think my concerns still apply to the position you stated in the previous comment, but insofar as the main motivation behind my comment was to generically nudge LW in a certain direction, I’ll try to do this more directly, rather than via poking at individual posts in an opportunistic way.