I am thinking more like this: I am a scaredy-cat about roller coasters, so I prefer the Tea Cups to Big Thunder Mountain Railroad. And I maintain that preference after choosing the Tea Cups (I don’t regret my decision). However, had I ridden Big Thunder Mountain Railroad, I would have been able to appreciate that it is awesome, and would have preferred Big Thunder Mountain Railroad to the Tea Cups.
Since this case seems quite possible, if the lessons you are going to draw apply only to hyper-idealized agents who know all their preferences perfectly and whose preferences are stable over time, that is worth noting, since those lessons may not apply to those of us with dynamic preference sets.
I dunno, this looks like it’s relatively easily resolved, to me. The confusion is that there are three possible outcome-states, not two. If you go on the roller coaster, you may or may not receive an update that lets you appreciate roller coaster rides. If you do receive it, it’ll allow you to enjoy that ride and all future ones, but there’s no guarantee that you will.
One possible ordering of those three outcomes is:
go on BTMRR and receive the update > go on the Tea Cups > go on BTMRR but don’t receive the update
Your most logical course of action would depend on how much you valued that update, and how likely it was that riding BTMRR would provide it.
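To make that concrete, here is a minimal expected-utility sketch. The utilities and the update probabilities are purely illustrative assumptions, not anything established in this thread; the point is only that the comparison turns on how much the update is worth and how likely the ride is to deliver it.

```python
# Toy expected-utility comparison for the three-outcome version of the ride choice.
# All numbers are illustrative assumptions.

u_update = 10.0    # go on BTMRR and receive the "I can appreciate coasters" update
u_teacups = 5.0    # go on the Tea Cups
u_no_update = 1.0  # go on BTMRR, stay scared, and receive no update

def expected_utility_btmrr(p_update: float) -> float:
    """Expected utility of riding BTMRR when the update arrives with probability p_update."""
    return p_update * u_update + (1.0 - p_update) * u_no_update

for p in (0.2, 0.5, 0.8):
    eu = expected_utility_btmrr(p)
    better = "BTMRR" if eu > u_teacups else "Tea Cups"
    print(f"p(update) = {p:.1f}: EU(BTMRR) = {eu:.1f} vs EU(Tea Cups) = {u_teacups:.1f} -> {better}")
```

With these made-up numbers the break-even point is p = 4/9 or so: below that you take the Tea Cups, above it you take the gamble on BTMRR.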
There are really two cases here. In the first case, you predict prior to going on either ride that your preferences are stable, but you’re wrong—having been coerced to ride BTMRR, you discover that you prefer it. I don’t believe this case poses any problems for the normative theory that will follow—preference orderings can change with new information as long as those changes aren’t known in advance.
In the second case, you know that whichever choice you make, you will ex post facto be glad that you made that choice and not the other. Can humans be in this state? Maybe. I’m not sure what to think about this.
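If it helps, that second case can be written down as a small consistency check: whichever ride you pick, your after-the-fact preferences rank that pick on top. A rough sketch, with conditional utilities that are entirely made up:

```python
# Choice-dependent preferences: the ex post ranking depends on which ride was actually taken.
# The conditional utilities are illustrative assumptions.

post_choice_utility = {
    # (outcome being evaluated, ride actually chosen) -> utility after the fact
    ("BTMRR", "BTMRR"): 8,        # having ridden BTMRR, you now love coasters
    ("Tea Cups", "BTMRR"): 3,
    ("Tea Cups", "Tea Cups"): 6,  # having skipped it, you're glad you did
    ("BTMRR", "Tea Cups"): 2,
}

def glad_either_way() -> bool:
    """True if, for every possible choice, the chooser prefers that choice to the alternative ex post."""
    return all(
        post_choice_utility[(chosen, chosen)] > post_choice_utility[(other, chosen)]
        for chosen, other in (("BTMRR", "Tea Cups"), ("Tea Cups", "BTMRR"))
    )

print(glad_either_way())  # True
```

No single choice-independent ordering over the two rides reproduces that pattern, which is what makes this case awkward for the usual preference framework in a way the first case isn’t.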
Well, a couple of things. You can partly interpret that as there being an underlying preference to ride it, with akrasia stopping you from actually choosing what you know you really want.
Or perhaps you really would prefer not to go on coasters, and you treat the “after the fact” liking the way you would treat an addictive drug: after taking it you might like it, which is exactly why you wouldn’t want to take it in the first place.
As for preferences changing: you can think of your true preferences as encoded by the underlying algorithm your brain is effectively implementing, the thing that controls how the preferences more visible to you change in response to new information, arguments, and so on.
Those underlying preferences are the ones you wouldn’t want to change. You wouldn’t want to take a pill that turns you into the kind of person who enjoys committing genocide, right? But you can predict in advance that if such a pill existed and you took it, then after it rewrote your preferences you would retroactively prefer that the genocide happen. And since you (I assume) don’t want genocides to happen, you wouldn’t want to become the kind of person who would want them to happen and would try to make them happen.
(skipping one or two minor caveats in this comment, but you get the idea, right?)
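To spell out the structure of the pill example above: the agent scores the act of taking the pill with the preferences it has now, over the world-states it predicts, rather than with the preferences the pill would install. A minimal sketch, in which every state, utility, and world-model detail is a made-up illustration rather than anything from the thread:

```python
# Goal-stability sketch: "take the pill" is evaluated with the agent's *current* utility
# function over predicted world-states, not with the post-pill utility function.
# All states, numbers, and the crude world model are illustrative assumptions.

current_utility = {"no genocide": 1.0, "genocide": -1000.0}
post_pill_utility = {"no genocide": 0.0, "genocide": 1.0}

def predicted_outcome(action: str) -> str:
    """Crude world model: a pill-taker goes on to bring about what the rewritten preferences favor."""
    return "genocide" if action == "take pill" else "no genocide"

def choose(actions, utility):
    # Pick the action whose predicted outcome scores highest under the given utility function.
    return max(actions, key=lambda a: utility[predicted_outcome(a)])

print(choose(["take pill", "refuse pill"], current_utility))    # "refuse pill"
print(choose(["take pill", "refuse pill"], post_pill_utility))  # "take pill": the rewritten self
                                                                # retroactively endorses it
```

The second line is the “retroactively prefer” part; the first line is why you never get there if the choice is made with the preferences you actually have now.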
But also, humans tend to be slightly (minor understatement here) irrational. I mean, isn’t the whole project of LW and OB and so on based on the notion of “the way we are is not the way we wish to be; let us become more rational”? So if something isn’t matching the way people normally behave, well… the problem may be “the way people normally behave”… I believe the usual phrasing is “this is a normative, rather than a descriptive, theory”.
Or did I misunderstand?
For the most part I think that starts to address it. At the same time, on your last point, there is an important difference between “this is how fully idealized rational agents of a certain sort behave” and “this is how you, a non-fully idealized, partially rational agent should behave, to improve your rationality”.
Someone in perfect physical condition (not just perfect for a human, but perfect for an idealized physical being) has a different optimal workout plan than I do, and we should plan differently for various physical activities, even if that person is the ideal towards which I am aiming.
So if we idealize our Bayesian models too much, we open up the question: “How does this idealized agent’s behavior relate to how I should behave?” It might be that, were we designing rational agents, it would make sense to use these idealized reasoners as models; but if the goal is personal improvement, we need some way to explain what one might call the Kantian inference from “I am an imperfectly rational being” to “I ought to behave the way such-and-such a perfectly rational being would”.