The old argument for coherence implying (worrisome) goal-directedness
Let’s make this more concrete. Does reinforcement learning lead to (a) goal-directedness, or (b) expected utility maximization?
(I find it easier to imagine an argument for (a) than for (b).)
The creature’s ‘preferences’ can’t be in terms of consistent numerical values assigned to everything, because then they would be coherent, and we are imagining a creature that isn’t. So what are they?
An array or a vector.** (This can give rise to a weaker sort of coherence: (a, b) < (c, d) if a < c and b < d, i.e. one option is preferred only if it is better on every dimension. And if we’re talking about people, then maybe this holds less as an absolute and more probabilistically, more likely for greater differences, perhaps especially relative to the sizes of the numbers being compared. Or all of that shows up in proxies or estimates, rather than in the thing itself.) This offers up ‘incoherence’ as not ‘totally incoherent’ but as a lack of awareness of how utility is composed of its parts. There’s also the ‘if you’ve had sweets, you care less about getting more sweets for a while’ paradigm*, in which some ‘incoherence’ might arise from path dependency while there’s still goal-directedness, that magical essence. (A rough sketch of the vector-preference idea follows the footnotes below.)
*Which sometimes seems right, and sometimes seems backwards. After eating one cookie/chip, do you want another?
**I’m not sure this is a good model, but it seems like an obvious first attempt at formalizing ‘people want multiple things and might not, all of the time, be sure how they trade off or be making those comparisons’. While the ‘oh no, what if the utility function is preceded by a minus sign, so it minimizes what it maximizes’ metaphor might not have been useful, ‘the code may have minus signs where it shouldn’t, and lack minus signs where it should have them (replaced by +’s), and thus ‘incoherence’ appears but is never quite total or catastrophic’ might be better for describing incoherence.
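Here is a minimal sketch of the vector-preference idea above, in Python. The option names, the scoring dimensions, and the strict ‘better on every dimension’ rule are my own illustrative choices for making the partial order concrete, not anything specified in the argument itself:

```python
from typing import Optional, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if option `a` is strictly better than option `b` on every dimension."""
    return all(x > y for x, y in zip(a, b))

def prefer(a: Sequence[float], b: Sequence[float]) -> Optional[Sequence[float]]:
    """The weaker, partial-order sort of 'coherence': return whichever option
    dominates the other, or None when the two are incomparable."""
    if dominates(a, b):
        return a
    if dominates(b, a):
        return b
    return None  # incomparable: no choice is forced on the creature

# Hypothetical options scored on two dimensions the creature cares about,
# say (sweets, rest). The numbers are made up for illustration.
cake_and_nap    = (3.0, 5.0)
salad_and_party = (1.0, 1.0)
cake_and_party  = (3.0, 1.0)
salad_and_nap   = (1.0, 4.0)

print(prefer(cake_and_nap, salad_and_party))  # (3.0, 5.0): better on both dimensions, so preferred
print(prefer(cake_and_party, salad_and_nap))  # None: each is better on one dimension, so incomparable
```

On this picture the creature’s choices among comparable pairs still look goal-directed, while the many incomparable pairs leave room for ‘incoherence’ that is never quite total.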