That is: the GPS (general-purpose search) is some function that takes in a world-model and some “problem specification” — some set of nodes in the world-model and their desired values — and outputs the actions that, if taken in that world-model, would bring the values of those nodes as close to the desired values as possible, given the actions available to the agent.
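A minimal sketch of this argmin picture, purely to pin down the claim under discussion (the function names and the toy world-model are hypothetical, not anything from the post):

```python
# Toy sketch of GPS-as-argmin (all names hypothetical): the "world-model"
# maps an action/plan to predicted node values, and the GPS picks whichever
# available plan lands those predictions closest to the target values.

def gps_argmin(world_model, target, plans):
    """Return the plan whose predicted outcome is nearest the target.

    world_model: maps a plan to a dict of node -> predicted value
    target:      dict of node -> desired value
    plans:       iterable of candidate plans (assumed enumerable!)
    """
    def latent_distance(plan):
        predicted = world_model(plan)
        return sum(abs(predicted[node] - desired)
                   for node, desired in target.items())
    return min(plans, key=latent_distance)

# Tiny worked example with one node ("ice_cream_in_hand") and two plans.
world_model = lambda plan: {
    "ice_cream_in_hand": 1.0 if plan == "walk_to_shop" else 0.2
}
best = gps_argmin(world_model, {"ice_cream_in_hand": 1.0},
                  ["do_nothing", "walk_to_shop"])
```

Note the unconditional `min` over the whole plan space: that exhaustive search is exactly the feature objected to below.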
While I appreciate the concreteness, this doesn’t seem very reasonable to me.[1] But maybe I’m misunderstanding![2]
Concretely, imagine I want to buy ice cream. I understand the GPS to receive target specification MGs set to “I have ice cream in my hand in three hours.” I don’t think that I will then quickly argmin expected latent distance by searching over all relevant plans. That would, I think, lead to some crazy outcomes: solutions at least as ice-cream-effective as “use all my money to hire many friends to buy me ice cream from different places.”
And if we posit that the other variables are also set to reasonable values, I will object: “this is not lazy enough (in a programming sense); you’re asking too much of target configurations. I will in fact be surprised if amazing brain-reading indicated that I am tracking complete target latent-state specifications when I submit plans to my GPS.”
I will also object that I don’t think that min is at all an effective approach to real-world cognition, nor do I think it’s what gets learned with realistic learning processes. I think the GPS itself will be a set of heuristics (like “check recent memory bank for target-goal-relevant information” and “use generative model to suggest five candidate goal-states and let invocation-subshards bid on them to rank them, selecting the best-of-five”) which are reliably useful across shard-goals (like ice cream and dog-petting).
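To make the contrast concrete, here is a hedged sketch of the kind of heuristic loop described above — a toy model, with `ice_cream_shard`, `laziness_shard`, and the candidate encoding all hypothetical:

```python
import random

# Sketch of the "heuristics, not global argmin" alternative: a generative
# model proposes five candidate goal-states, subshards bid on each, and the
# highest-bid candidate wins. The selection is a bounded best-of-five, not
# a search over all relevant plans.

def propose_and_bid(generate_candidate, shard_bids, n_candidates=5):
    """generate_candidate: () -> candidate goal-state
    shard_bids: list of functions, each mapping a candidate to a numeric bid
    Returns the best-of-n candidate by total bid."""
    candidates = [generate_candidate() for _ in range(n_candidates)]
    def total_bid(candidate):
        return sum(bid(candidate) for bid in shard_bids)
    return max(candidates, key=total_bid)

rng = random.Random(0)
# Candidates here are (effort, ice_cream_amount) pairs, purely illustrative.
gen = lambda: (rng.random(), rng.random())
ice_cream_shard = lambda c: c[1]    # bids for more ice cream
laziness_shard = lambda c: -c[0]    # bids against effort
choice = propose_and_bid(gen, [ice_cream_shard, laziness_shard])
```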
(Am I engaging with your intended points properly, or am I whooshing?)
Rule of thumb: If I find myself postulating internal motivational circuitry which uses a “max” or a “min”, then I should think very carefully about what is going on and whether that’s appropriate. Almost always, the answer is “no”, and if I don’t catch the “min/max” before it sneaks in, my analysis goes off the rails of reality.
I don’t think the GPS “searches over all relevant plans”. As per John’s post:
Consider, for example, a human planning a trip to the grocery store. Typical reasoning (mostly at the subconscious level) might involve steps like:
There’s a dozen different stores in different places, so I can probably find one nearby wherever I happen to be; I don’t need to worry about picking a location early in the planning process.
My calendar is tight, so I need to pick an open time. That restricts my options a lot, so I should worry about that early in the planning process.
<go look at calendar>
Once I’ve picked an open time in my calendar, I should pick a grocery store nearby whatever I’m doing before/after that time.
… Oh, but I also need to go home immediately after, to put any frozen things in the freezer. So I should pick a time when I’ll be going home after, probably toward the end of the day.
Notice that this sort of reasoning mostly does not involve babbling and pruning entire plans. The human is thinking mostly at the level of constraints (and associated heuristics) which rule out broad swaths of plan-space. The calendar is a taut constraint, location is a slack constraint, so (heuristic) first find a convenient time and then pick whichever store is closest to wherever I’ll be before/after. The reasoning only deals with a few abstract plan-features (i.e. time, place) and ignores lots of details (i.e. exact route, space in the car’s trunk); more detail can be filled out later, so long as we’ve planned the “important” parts. And rather than “iterate” by looking at many plans, the search process mostly “iterates” by considering subproblems (like e.g. finding an open calendar slot) or adding lower-level constraints to a higher-level plan (like e.g. needing to get frozen goods home quickly).
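The constraint-first reasoning in the quoted passage can be sketched as follows (a toy illustration; the feature names and the "first admissible option" placeholder are hypothetical):

```python
# Minimal sketch of constraint-ordered planning: decide taut constraints
# (few admissible options) before slack ones (many options), fixing one
# abstract plan-feature at a time instead of babbling whole plans.

def plan_by_constraints(features):
    """features: dict of feature_name -> list of admissible options,
    e.g. {"time": [...], "store": [...]}.
    Returns a plan dict, deciding the most-constrained feature first."""
    plan = {}
    # Tautness proxy: fewer admissible options = decided earlier.
    for name, options in sorted(features.items(), key=lambda kv: len(kv[1])):
        # A real planner would prune later options based on earlier choices;
        # here we just take the first admissible option as a placeholder.
        plan[name] = options[0]
    return plan

order_demo = plan_by_constraints({
    "store": ["A", "B", "C", "D"],  # slack: many nearby stores
    "time": ["5pm"],                # taut: only one open calendar slot
})
```

The calendar (one option) gets fixed first; the store choice is filled in afterwards, mirroring the grocery-trip reasoning above.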
In particular, I very much do agree the GPS makes use of heuristics like “if you have a cached plan that you think will work, just do that” and “see [how you feel about this idea][1] before proceeding” over the course of planning. But it’s not made of heuristics; rather, it’s something like a systematic way of drawing upon the declarative knowledge/knowledge explicitly represented in the world-model, and that knowledge involves a lot of heuristics.
Crucially, part of any “problem specification” would be things like “how much time should I spend on thinking about this?” and “how hard should I optimize the plan for doing it?” and “in how much detail should I track the consequences of this decision?”, and if it’s something minor like getting ice cream, then of course you’d spend very little time and use a lot of cached cognitive shortcuts.
If it’s something major, however, like a life-or-death matter, then you’d do high-intensity planning that aims to track what would actually happen in detail, without relying on prior assumptions and vague feelings[2].
Unless, of course, some of these vague feelings have proven more effective in the past than your explicit attempts at consequences-tracking, in which case you’d knowingly defer to them — you’d “trust your instincts”.
I don’t think the GPS “searches over all relevant plans”
OK, but you are positing that there’s an argmin, no? That’s a big part of what I’m objecting to. I anticipate that insofar as you’re claiming grader-optimization problems come back, they come back because there’s an AFAICT inappropriate argmin which got tossed into the analysis via the GPS.
But it’s not made of heuristics; rather, it’s something like a systematic way of drawing upon the declarative knowledge/knowledge explicitly represented in the world-model, and that knowledge involves a lot of heuristics.
Sure, sounds reasonable.
Crucially, part of any “problem specification” would be things like “how much time should I spend on thinking about this?” and “how hard should I optimize the plan for doing it?” and “in how much detail should I track the consequences of this decision?”, and if it’s something minor like getting ice cream, then of course you’d spend very little time and use a lot of cached cognitive shortcuts.
Noting that I still feel confused after hearing this explanation. What does it mean to ask “how hard should I optimize”?
If it’s something major, however, like a life-or-death matter, then you’d do high-intensity planning that aims to track what would actually happen in detail, without relying on prior assumptions and vague feelings
Really? I think that people usually don’t do that in life-or-death scenarios. People panic all the time.
What does it mean to ask “how hard should I optimize”?
Satisficing threshold, probability of the plan’s success, the plan’s robustness to unexpected perturbations, etc. I suppose the argmin is somewhat misleading: the GPS doesn’t output the best possible plan for achieving some goal in the world outside the agent; it solves the problem in the most efficient way possible, which often means not spending too much time and resources on it. I. e., “mental resources spent” is part of the problem specification, and it’s something it tries to minimize too.
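One way to read this: the search stops at the first plan that clears a satisficing threshold, with thinking itself charged against the plan. A hedged sketch, with all names and numbers illustrative:

```python
# Sketch of satisficing search where "mental resources spent" is part of
# the objective: each candidate examined costs something, and search stops
# as soon as a plan is good enough, rather than argmin-ing over all plans.

def satisficing_search(candidates, score, threshold, cost_per_step=0.01):
    """Return the first candidate whose score, net of thinking cost paid
    so far, clears the threshold; else the best candidate seen."""
    best, best_net = None, float("-inf")
    spent = 0.0
    for plan in candidates:
        spent += cost_per_step          # thinking isn't free
        net = score(plan) - spent
        if net > best_net:
            best, best_net = plan, net
        if net >= threshold:
            return plan                 # good enough: stop thinking
    return best                         # out of candidates: take best found

plan = satisficing_search(
    ["do_nothing", "walk_to_shop", "hire_friends"],
    score=lambda p: {"do_nothing": 0.0,
                     "walk_to_shop": 0.9,
                     "hire_friends": 0.95}[p],
    threshold=0.8,
)
```

Here the search returns "walk_to_shop" and never even evaluates the slightly-higher-scoring "hire_friends" plan, which is the sense in which this differs from a global argmin.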
I don’t think this argmin is the central reason for grader-optimization problems here.
Really? I think that people usually don’t do that in life-or-death scenarios. People panic all the time.
I’m assuming no time pressure. Or substitute-in “a matter of grave importance that you nonetheless feel capable of resolving”.
I don’t think this argmin is the central reason for grader-optimization problems here.
I’m going to read the rest of the essay, and also I realize you posted this before my four posts on “holy cow argmax can blow all your alignment reasoning out of reality all the way to candyland.” But I want to note that including an argmin in the posited motivational architecture makes me extremely nervous / distrusting. Even if this modeling assumption doesn’t end up being central to your arguments on how shard-agents become wrapper-like, I think this assumption should still be flagged extremely heavily.
Mm, I believe that it’s not central because my initial conception of the GPS didn’t include it at all, and everything still worked. I don’t think it serves the same role here as you’re critiquing in the posts you’ve linked; I think it’s inserted at a different abstraction level.
But sure, I’ll wait for you to finish with the post.
I haven’t even read much further, just skimmed other parts of the post. I’ll post this now anyways.

[2] I. e., which of your shards bid for or against it, and how strongly.