Maybe I could cancel top surgery? No, that doesn’t work, because then I’d keep having boobs.
How about I go through with top surgery then? No, that’s no good either, because then I’d have to experience surgery and recovery.
Well, let’s see… Maybe… Maybe I could cancel top surgery?
*sigh* Still no, for the same reasons as last time. [...]
What if [good parts of first option, and good parts of second option], but not [bad parts of either option]?
This seems quite similar to my model of how craving (in the Buddhist sense) works and how it causes suffering: craving creates constraints on what features reality should contain, and the mind then tries to use its existing knowledge to come up with a plan of action that fulfills all the constraints, even when that turns out to be impossible. I have a more detailed writeup here, which includes a similar “thrashing around” example.
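To make the “thrashing” concrete, here is a minimal sketch of craving as an unsatisfiable constraint-satisfaction search, applied to the opening example; the option names, feature labels, and loop structure are my own illustration rather than anything from the writeup:

```python
# Toy sketch of craving as constraint satisfaction; all names and features
# here are illustrative labels, not from the original example.

options = {
    "cancel top surgery": {"no surgery or recovery"},
    "go through with it": {"no boobs"},
}

# Craving demands all the good features at once, which no option provides.
craving_constraints = {"no boobs", "no surgery or recovery"}

def search(options, constraints):
    """Return the first option whose features satisfy every constraint."""
    for name, features in options.items():
        if constraints <= features:  # subset test: all constraints met?
            return name
    return None  # no real option satisfies the constraint set

# The mind keeps re-running the same search over the same options,
# "thrashing" because the constraints are jointly unsatisfiable.
for attempt in range(3):
    print(f"attempt {attempt}:", search(options, craving_constraints))
# attempt 0: None
# attempt 1: None
# attempt 2: None
```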
In a sense, “what if [good parts of first option, and good parts of second option], but not [bad parts of either option]” is a good question to ask, since it is similar in spirit to what one is trying to do when goal factoring: start from the assumption that such an option exists, and then search for a way to achieve it. Of course, assuming that such an option exists is a bad idea in cases where it doesn’t.
Still, in some situations it can make sense to assign a very high prior probability to the option existing, even if all the evidence seems to go against this possibility. The most obvious scenario is one where all the non-fabricated options involve dying. If giving up means death, then in the worst case pursuing a policy of “I am going to assume there’s a way out and keep looking for it until I find one” just means that you’ll die (which would have happened anyway), but in the best case it lets you find that one very-low-probability chance of making it out alive.
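The worst-case reasoning here can be written out explicitly. A minimal sketch, assuming survival is the only thing at stake and that searching carries no further cost:

$$\Pr(\text{survive} \mid \text{give up}) = 0 \;\le\; p \;=\; \Pr(\text{survive} \mid \text{keep searching}),$$

so the “assume a way out exists” policy weakly dominates giving up for every possible value of $p$: it ties in the worst case ($p = 0$) and wins whenever $p > 0$, no matter how small.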
In Shut up and do the impossible!, Eliezer also argues for pursuing things like AI alignment even when it seems impossible.
That logic would seem to help explain why brains sometimes assign a probability to an option existing that is proportional to how important the goal is to us (or, in more Buddhist terms, proportional to how much craving there is). If you crave something enough, the strength of that craving will make you believe that an option for getting the thing exists, so as to motivate you to keep looking for it. But that also means that once your brain has fabricated the option, you might stop looking. Once you assign a high enough prior probability to having the option, you feel like you already have it, and if someone points out that you don’t, that may feel like them taking it away from you, prompting you to lash out at them.
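One way to see why a sufficiently inflated prior makes the belief feel settled: once the prior is extreme enough, ordinary contrary evidence barely moves the posterior. A hedged sketch with invented numbers:

```python
# Sketch of a craving-inflated prior resisting contrary evidence;
# the numbers are invented purely for illustration.

def posterior(prior, lik_if_exists, lik_if_not):
    """Bayes' rule for a binary hypothesis after one observation."""
    joint_yes = prior * lik_if_exists
    joint_no = (1 - prior) * lik_if_not
    return joint_yes / (joint_yes + joint_no)

# An observation that is 9x more likely if the option does NOT exist.
lik_if_exists, lik_if_not = 0.1, 0.9

for prior in (0.5, 0.9, 0.999):  # stronger craving -> higher prior
    print(prior, round(posterior(prior, lik_if_exists, lik_if_not), 3))
# 0.5   -> 0.1    a neutral prior lets the evidence win
# 0.9   -> 0.5    a strong craving leaves you undecided
# 0.999 -> 0.991  a very strong craving barely registers the evidence
```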
From my earlier writeup:
Recall that under the [predictive processing] framework, goals happen because a part of the brain assumes that they will happen, after which it changes reality to make that belief true. So focusing really hard on a craving for X makes it feel like X will become true, because the craving is literally rewriting an aspect of my subjective reality to make me think that X will become true.
When I focus hard on the craving, I am temporarily guiding my attention away from the parts of my mind which are pointing out the obstacles in the way of X coming true. That is, those parts have less of a chance to incorporate their constraints into the plan that my brain is trying to develop. This momentarily reduces the pull away from this plan, making it seem more plausible that the desired outcome will in fact become real.
Conversely, letting go of this craving may feel like it is literally making the undesired outcome more real, rather than like I am coming to terms with reality. This is most obvious in cases where one craves an outcome that is certainly impossible, such as when grieving a friend’s death. Even after it is certain that someone is dead, there may still be persistent thoughts of if only I had done X, with an implicit additional flavor of if I just wish hard enough that I had done X, things will change, and I can’t stop focusing on this possibility because my friend needs to be alive.
In this form, craving may lead to all kinds of rationalization and biased reasoning: a part of your mind is literally making you believe that X is true, because it wants you to find a strategy where X is true. This hallucinated belief may constrain all of your plans and models of the world in the same way that direct sensory evidence of X being true would constrain your brain’s models. For example, if I have a very strong urge to believe that someone is interested in me, then this may cause me to interpret any of his words and expressions in a way compatible with this belief, regardless of how implausible and far-reaching a distortion this requires.
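As a rough illustration of that last example (every cue, reading, and probability below is invented): if the belief “he is interested” is clamped in place, it reweights the interpretation of each ambiguous cue toward readings compatible with it, so everything comes out as confirmation.

```python
# Rough illustration of motivated interpretation; every cue, reading, and
# probability below is invented for the sake of the example.

# Candidate readings of each ambiguous cue, with P(reading | cue) alone.
cues = {
    "didn't reply for a day": {
        "busy, still interested": 0.5,
        "politely disengaging": 0.5,
    },
    "laughed at my joke": {
        "flirting": 0.5,
        "just being friendly": 0.5,
    },
}

# How compatible each reading is with the clamped belief "he is interested".
compatibility = {
    "busy, still interested": 0.9,
    "politely disengaging": 0.1,
    "flirting": 0.9,
    "just being friendly": 0.4,
}

for cue, readings in cues.items():
    # The clamped belief acts like extra evidence: reweight each reading
    # by its compatibility with the belief, then pick the best-scoring one.
    scores = {r: p * compatibility[r] for r, p in readings.items()}
    print(f"{cue!r} -> {max(scores, key=scores.get)}")
# Every ambiguous cue comes out confirming the belief.
```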