Thane Ruthenis comments on Value Formation: An Overarching Model

Thane Ruthenis 15 Dec 2022 11:42 UTC
I don’t think the GPS “searches over all relevant plans”. As per John’s post:
Consider, for example, a human planning a trip to the grocery store. Typical reasoning (mostly at the subconscious level) might involve steps like:
- There’s a dozen different stores in different places, so I can probably find one nearby wherever I happen to be; I don’t need to worry about picking a location early in the planning process.
- My calendar is tight, so I need to pick an open time. That restricts my options a lot, so I should worry about that early in the planning process.
  <go look at calendar>
- Once I’ve picked an open time in my calendar, I should pick a grocery store nearby whatever I’m doing before/after that time.
- … Oh, but I also need to go home immediately after, to put any frozen things in the freezer. So I should pick a time when I’ll be going home after, probably toward the end of the day.
Notice that this sort of reasoning mostly does not involve babbling and pruning entire plans. The human is thinking mostly at the level of constraints (and associated heuristics) which rule out broad swaths of plan-space. The calendar is a taut constraint, location is a slack constraint, so (heuristic) first find a convenient time and then pick whichever store is closest to wherever I’ll be before/after. The reasoning only deals with a few abstract plan-features (i.e. time, place) and ignores lots of details (i.e. exact route, space in the car’s trunk); more detail can be filled out later, so long as we’ve planned the “important” parts. And rather than “iterate” by looking at many plans, the search process mostly “iterates” by considering subproblems (like e.g. finding an open calendar slot) or adding lower-level constraints to a higher-level plan (like e.g. needing to get frozen goods home quickly).
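The constraint-first reasoning in the quoted passage can be sketched in a few lines of Python. This is only a toy illustration, not anything from John's post: the function names, the 1-D street positions, and the "latest open slot means going home afterwards" rule are all assumptions made up for the example. The point is that the planner fixes one abstract plan-feature at a time, tautest first, and never enumerates or scores complete plans.

```python
def order_by_tautness(options_per_feature):
    """Return feature names, tautest (fewest remaining options) first."""
    return sorted(options_per_feature, key=lambda name: len(options_per_feature[name]))


def plan_trip(open_slots, store_positions, position_at):
    """Fix plan features one at a time instead of scoring complete plans.

    open_slots: hours that are free in the calendar (hypothetical data).
    store_positions: store name -> position on a 1-D street (hypothetical).
    position_at: hour -> where the planner will be at that hour (hypothetical).
    """
    order = order_by_tautness({"time": open_slots, "store": list(store_positions)})
    plan = {"order": order}
    for feature in order:
        if feature == "time":
            # Lower-level constraint added to the higher-level plan: frozen goods
            # must go home quickly, so take the latest open slot (assumed here to
            # be followed by going home).
            plan["time"] = max(open_slots)
        else:
            # Slack constraint: any nearby store will do. If the time is not
            # fixed yet, fall back to the position at the latest open slot.
            here = position_at(plan.get("time", max(open_slots)))
            plan["store"] = min(
                store_positions, key=lambda s: abs(store_positions[s] - here)
            )
    return plan


plan = plan_trip(
    open_slots=[12, 18],
    store_positions={"north": 9.0, "central": 4.0, "south": 1.0},
    position_at=lambda hour: 3.5 if hour >= 17 else 8.0,
)
print(plan)  # {'order': ['time', 'store'], 'time': 18, 'store': 'central'}
```

Details like the exact route never appear in the sketch; they would be filled in later, once the "important" features are settled.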
In particular, I very much do agree the GPS makes use of heuristics like “if you have a cached plan that you think will work, just do that” and “see how you feel about this idea^[1] before proceeding” over the course of planning. But it’s not made of heuristics; rather, it’s something like a systematic way of drawing upon the declarative knowledge/knowledge explicitly represented in the world-model, and that knowledge involves a lot of heuristics.
Crucially, part of any “problem specification” would be things like “how much time should I spend on thinking about this?” and “how hard should I optimize the plan for doing it?” and “in how much detail should I track the consequences of this decision?”, and if it’s something minor like getting ice cream, then of course you’d spend very little time and use a lot of cached cognitive shortcuts.
If it’s something major, however, like a life-or-death matter, then you’d do high-intensity planning that aims to track what would actually happen in detail, without relying on prior assumptions and vague feelings^[2].
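As a rough illustration of what it could mean for effort to be part of the problem specification, here is a minimal Python sketch. It is not a definition of the GPS: `ProblemSpec`, `solve`, and the toy `score` and `propose` functions are hypothetical names invented for this example. A minor problem gets a lenient threshold and a tiny budget, so the cached plan is accepted as-is; a major one gets a strict threshold and a large budget, so the search keeps refining.

```python
from dataclasses import dataclass


@dataclass
class ProblemSpec:
    satisficing_threshold: float  # accept any plan scoring at least this much
    max_evaluations: int          # cap on "mental resources" spent on planning


def solve(spec, cached_plan, propose, score):
    """Start from a cached plan; refine only while the spec says it is worth it."""
    best, best_score = cached_plan, score(cached_plan)
    evaluations = 1  # scoring the cached plan already costs one evaluation
    while (best_score < spec.satisficing_threshold
           and evaluations < spec.max_evaluations):
        candidate = propose(best)
        candidate_score = score(candidate)
        evaluations += 1
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, evaluations


# Toy stand-ins: a "plan" is a number, and plans closer to 10 are better.
score = lambda plan: -abs(plan - 10)
propose = lambda plan: plan + 1

# Minor problem (ice cream): lenient threshold, tiny budget.
quick = solve(ProblemSpec(satisficing_threshold=-5.0, max_evaluations=3), 7, propose, score)
# Major problem: strict threshold, large budget.
careful = solve(ProblemSpec(satisficing_threshold=0.0, max_evaluations=100), 7, propose, score)

print(quick)    # (7, 1): the cached plan is good enough, so no search happens
print(careful)  # (10, 4): the search continues until the threshold is met
```

Both calls use the same procedure; only the problem specification differs.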
1. ^
  I.e., which of your shards bid for or against it, and how strongly.
2. ^
  Unless, of course, some of these vague feelings have proven more effective in the past than your explicit attempts at consequences-tracking, in which case you’d knowingly defer to them — you’d “trust your instincts”.
- TurnTrout 15 Dec 2022 22:10 UTC
  I don’t think the GPS “searches over all relevant plans”
  OK, but you are positing that there’s an argmin, no? That’s a big part of what I’m objecting to. I anticipate that insofar as you’re claiming grader-optimization problems come back, they come back because there’s an AFAICT inappropriate argmin which got tossed into the analysis via the GPS.
  But it’s not made of heuristics; rather, it’s something like a systematic way of drawing upon the declarative knowledge/knowledge explicitly represented in the world-model, and that knowledge involves a lot of heuristics.
  Sure, sounds reasonable.
  Crucially, part of any “problem specification” would be things like “how much time should I spend on thinking about this?” and “how hard should I optimize the plan for doing it?” and “in how much detail should I track the consequences of this decision?”, and if it’s something minor like getting ice cream, then of course you’d spend very little time and use a lot of cached cognitive shortcuts.
  Noting that I still feel confused after hearing this explanation. What does it mean to ask “how hard should I optimize”?
  If it’s something major, however, like a life-or-death matter, then you’d do high-intensity planning that aims to track what would actually happen in detail, without relying on prior assumptions and vague feelings
  Really? I think that people usually don’t do that in life-or-death scenarios. People panic all the time.
  - Thane Ruthenis 15 Dec 2022 22:38 UTC
    What does it mean to ask “how hard should I optimize”?
    Satisficing threshold, probability of the plan’s success, the plan’s robustness to unexpected perturbations, etc. I suppose the argmin is somewhat misleading: the GPS doesn’t output the best possible plan for achieving some goal in the world outside the agent; it solves the problem in the most efficient way possible, which often means not spending too much time and resources on it. I.e., “mental resources spent” is part of the problem specification, and it’s something the GPS tries to minimize too.
    I don’t think this argmin is the central reason for grader-optimization problems here.
    Really? I think that people usually don’t do that in life-or-death scenarios. People panic all the time.
    I’m assuming no time pressure. Or substitute-in “a matter of grave importance that you nonetheless feel capable of resolving”.
    - TurnTrout 15 Dec 2022 23:04 UTC
      I don’t think this argmin is the central reason for grader-optimization problems here.
      I’m going to read the rest of the essay, and also I realize you posted this before my four posts on “holy cow argmax can blow all your alignment reasoning out of reality all the way to candyland.” But I want to note that including an argmin in the posited motivational architecture makes me extremely nervous / distrusting. Even if this modeling assumption doesn’t end up being central to your arguments on how shard-agents become wrapper-like, I think this assumption should still be flagged extremely heavily.
      - Thane Ruthenis 15 Dec 2022 23:11 UTC
        Mm, I believe that it’s not central because my initial conception of the GPS didn’t include it at all, and everything still worked. I don’t think it serves the same role here as you’re critiquing in the posts you’ve linked; I think it’s inserted at a different abstraction level.
        But sure, I’ll wait for you to finish with the post.