Dark Arts of Rationality
Note: the author now disclaims this post, and asserts that his past self was insufficiently skilled in the art of rationality to “take the good and discard the bad” even when you don’t yet know how to justify it. You can, of course, get all the benefits described below, without once compromising your epistemics.
Today, we’re going to talk about Dark rationalist techniques: productivity tools which seem incoherent, mad, and downright irrational. These techniques include:
Willful Inconsistency
Intentional Compartmentalization
Modifying Terminal Goals
I expect many of you are already up in arms. It seems obvious that consistency is a virtue, that compartmentalization is a flaw, and that one should never modify one's terminal goals.
I claim that these ‘obvious’ objections are incorrect, and that all three of these techniques can be instrumentally rational.
In this article, I’ll promote the strategic cultivation of false beliefs and condone mindhacking on the values you hold most dear. Truly, these are Dark Arts. I aim to convince you that sometimes, the benefits are worth the price.
Changing your Terminal Goals
In many games there is no “absolutely optimal” strategy. Consider the Prisoner’s Dilemma. The optimal strategy depends entirely upon the strategies of the other players. Entirely.
Intuitively, you may believe that there are some fixed “rational” strategies. Perhaps you think that even though complex behavior is dependent upon other players, there are still some constants, like “Never cooperate with DefectBot”. DefectBot always defects against you, so you should never cooperate with it. Cooperating with DefectBot would be insane. Right?
Wrong. If you find yourself on a playing field where everyone else is a TrollBot (players who cooperate with you if and only if you cooperate with DefectBot) then you should cooperate with DefectBots and defect against TrollBots.
Consider that. There are playing fields where you should cooperate with DefectBot, even though that looks completely insane from a naïve viewpoint. Optimality is not a feature of the strategy, it is a relationship between the strategy and the playing field.
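To make the arithmetic concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the payoffs are ordinary Prisoner's Dilemma numbers (mutual cooperation 3, exploiting a cooperator 5, mutual defection 1, being exploited 0), and the field is one DefectBot plus ten TrollBots.

```python
# A minimal sketch of the DefectBot/TrollBot playing field described above,
# with ordinary Prisoner's Dilemma payoffs assumed for illustration.

PAYOFF = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}

def total_payoff(move_vs_defectbot, move_vs_trollbots, n_trollbots=10):
    """One DefectBot (always defects) plus n TrollBots, who cooperate with
    me iff I cooperated with the DefectBot."""
    score = PAYOFF[(move_vs_defectbot, "D")]
    trollbot_move = "C" if move_vs_defectbot == "C" else "D"
    return score + n_trollbots * PAYOFF[(move_vs_trollbots, trollbot_move)]

print(total_payoff("D", "D"))  # never cooperate with DefectBot: 1 + 10*1 = 11
print(total_payoff("C", "D"))  # cooperate with DefectBot, defect vs TrollBots: 0 + 10*5 = 50
```

Under these assumed numbers, the "never cooperate with DefectBot" policy scores 11 while the "insane" policy scores 50: the optimal move is a fact about the field, not about the strategy alone.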
Take this lesson to heart: in certain games, there are strange playing fields where the optimal move looks completely irrational.
I’m here to convince you that life is one of those games, and that you occupy a strange playing field right now.
Here’s a toy example of a strange playing field, which illustrates the fact that even your terminal goals are not sacred:
Imagine that you are completely self-consistent and have a utility function. For the sake of the thought experiment, pretend that your terminal goals are distinct, exclusive, orthogonal, and clearly labeled. You value your goals being achieved, but you have no preferences about how they are achieved or what happens afterwards (unless the goal explicitly mentions the past/future, in which case achieving the goal puts limits on the past/future). You possess at least two terminal goals, one of which we will call A.
Omega descends from on high and makes you an offer. Omega will cause your terminal goal A to become achieved over a certain span of time, without any expenditure of resources. As a price of taking the offer, you must switch out terminal goal A for terminal goal B. Omega guarantees that B is orthogonal to A and all your other terminal goals. Omega further guarantees that you will achieve B using less time and resources than you would have spent on A. Any other concerns you have are addressed via similar guarantees.
Clearly, you should take the offer. One of your terminal goals will be achieved, and while you’ll be pursuing a new terminal goal that you (before the offer) don’t care about, you’ll come out ahead in terms of time and resources which can be spent achieving your other goals.
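Here is the accounting as a toy sketch, with made-up numbers: suppose achieving A yourself would have cost 10 units of time and resources out of a budget of 100, and the substitute goal B costs only 3 (per Omega's guarantee that B is cheaper). None of these numbers come from the thought experiment; they just make the dominance explicit.

```python
# Toy accounting for Omega's offer; the specific costs are assumptions.
budget = 100                   # total time/resources available across all goals
cost_of_A, cost_of_B = 10, 3   # assumed costs; Omega guarantees B is cheaper than A

refuse = {"A_achieved": True, "left_for_other_goals": budget - cost_of_A}  # 90
accept = {"A_achieved": True, "left_for_other_goals": budget - cost_of_B}  # 97

print(refuse)
print(accept)
```

Judged entirely by your pre-offer goals, accepting dominates: A gets achieved either way, and you keep 7 extra units for everything else you care about.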
So the optimal move, in this scenario, is to change your terminal goals.
There are times when the optimal move of a rational agent is to hack its own terminal goals.
You may find this counter-intuitive. It helps to remember that “optimality” depends as much upon the playing field as upon the strategy.
Next, I claim that such scenarios are not restricted to toy games where Omega messes with your head. Humans encounter similar situations on a day-to-day basis.
Humans often find themselves in a position where they should modify their terminal goals, and the reason is simple: our thoughts do not have direct control over our motivation.
Unfortunately for us, our “motivation circuits” can distinguish between terminal and instrumental goals. It is often easier to put in effort, experience inspiration, and work tirelessly when pursuing a terminal goal as opposed to an instrumental goal. It would be nice if this were not the case, but it’s a fact of our hardware: we do more of X when we want X for its own sake than when we force X upon ourselves.
Consider, for example, a young woman who wants to be a rockstar. She wants the fame, the money, and the lifestyle: these are her “terminal goals”. She lives in some strange world where rockstardom is wholly dependent upon merit (rather than social luck and network effects), and decides that in order to become a rockstar she has to produce really good music.
But here’s the problem: She’s a human. Her conscious decisions don’t directly affect her motivation.
In her case, it turns out that she can make better music when “Make Good Music” is a terminal goal as opposed to an instrumental goal.
When “Make Good Music” is an instrumental goal, she schedules practice time on a sitar and grinds out the hours. But she doesn’t really like it, so she cuts corners whenever akrasia comes knocking. She lacks inspiration and spends her spare hours dreaming of stardom. Her songs are shallow and trite.
When “Make Good Music” is a terminal goal, music pours forth, and she spends every spare hour playing her sitar: not because she knows that she “should” practice, but because you couldn’t pry her sitar from her cold dead fingers. She’s not “practicing”, she’s pouring out her soul, and no power in the ’verse can stop her. Her songs are emotional, deep, and moving.
It’s obvious that she should adopt a new terminal goal.
Ideally, we would be just as motivated to carry out instrumental goals as we are to carry out terminal goals. In reality, this is not the case. As a human, your motivation system does discriminate between the goals that you feel obligated to achieve and the goals that you pursue as ends unto themselves.
As such, it is sometimes in your best interest to modify your terminal goals.
Mind the terminology, here. When I speak of “terminal goals” I mean actions that feel like ends unto themselves. I am speaking of the stuff you wish you were doing when you’re doing boring stuff, the things you do in your free time just because they are fun, the actions you don’t need to justify.
This seems like the obvious meaning of “terminal goals” to me, but some of you may think of “terminal goals” as something more akin to self-endorsed, morally sound end-values in some consistent utility function. I’m not talking about those. I’m not even convinced I have any.
Both types of “terminal goal” are susceptible to strange playing fields in which the optimal move is to change your goals, but it is only the former type of goal — the actions that are simply fun, that need no justification — which I’m suggesting you tweak for instrumental reasons.
I’ve largely refrained from goal-hacking, personally. I bring it up for a few reasons:
It’s the easiest Dark Side technique to justify. It helps break people out of the mindset where they think optimal actions are the ones that look rational in a vacuum. Remember, optimality is a feature of the playing field. Sometimes cooperating with DefectBot is the best strategy!
Goal hacking segues nicely into the other Dark Side techniques which I use frequently, as you will see shortly.
I have met many people who would benefit from a solid bout of goal-hacking.
I’ve crossed paths with many a confused person who (without any explicit thought on their part) had really silly terminal goals. We’ve all met people who are acting as if “Acquire Money” is a terminal goal, never noticing that money is almost entirely instrumental in nature. When you ask them “but what would you do if money was no issue and you had a lot of time”, all you get is a blank stare.
Even the LessWrong Wiki entry on terminal values describes a college student for whom university is instrumental and getting a job is terminal. This seems like a clear-cut case of a Lost Purpose: a job seems clearly instrumental. And yet, we’ve all met people who act as if “Have a Job” is a terminal value, and who then seem aimless and undirected after finding employment.
These people could use some goal hacking. You can argue that Acquire Money and Have a Job aren’t “really” terminal goals, to which I counter that many people don’t know their ass from their elbow when it comes to their own goals. Goal hacking is an important part of becoming a rationalist and/or improving mental health.
Goal-hacking in the name of consistency isn’t really a Dark Side power. This power is only Dark when you use it like the musician in our example, when you adopt terminal goals for instrumental reasons. This form of goal hacking is less common, but can be very effective.
I recently had a personal conversation with Alexei, who is earning to give. He noted that he was not entirely satisfied with his day-to-day work, and mused that perhaps goal-hacking (making “Do Well at Work” an end unto itself) could make him more effective, generally happier, and more productive in the long run.
Goal-hacking can be a powerful technique, when correctly applied. Remember, you’re not in direct control of your motivation circuits. Sometimes, strange though it seems, the optimal action involves fooling yourself.
You don’t get good at programming by sitting down and forcing yourself to practice for three hours a day. I mean, I suppose you could get good at programming that way. But it’s much easier to get good at programming by loving programming, by being the type of person who spends every spare hour tinkering on a project. Because then it doesn’t feel like practice, it feels like fun.
This is the power that you can harness, if you’re willing to tamper with your terminal goals for instrumental reasons. As rationalists, we would prefer to dedicate to instrumental goals the same vigor that is reserved for terminal goals. Unfortunately, we find ourselves on a strange playing field where goals that feel justified in their own right win the lion’s share of our attention.
Given this strange playing field, goal-hacking can be optimal.
You don’t have to completely mangle your goal system. Our aspiring musician from earlier doesn’t need to destroy her “Become a Rockstar” goal in order to adopt the “Make Good Music” goal. If you can successfully convince yourself to believe that something instrumental is an end unto itself (i.e. terminal), while still believing that it is instrumental, then more power to you.
This is, of course, an instance of Intentional Compartmentalization.
Intentional Compartmentalization
As soon as you endorse modifying your own terminal goals, Intentional Compartmentalization starts looking like a pretty good idea. If Omega offers to achieve A at the price of dropping A and adopting B, the ideal move is to take the offer after finding a way to not actually care about B.
A consistent agent cannot do this, but I have good news for you: You’re a human. You’re not consistent. In fact, you’re great at being inconsistent!
You might expect it to be difficult to add a new terminal goal while still believing that it’s instrumental. You may also run into strange situations where holding an instrumental goal as terminal directly contradicts other terminal goals.
For example, our aspiring musician might find that she makes even better music if “Become a Rockstar” is not among her terminal goals.
This means she’s in trouble: She either has to drop “Become a Rockstar” and have a better chance at actually becoming a rockstar, or she has to settle for a decreased chance that she’ll become a rockstar.
Or, rather, she would have to settle for one of these choices — if she wasn’t human.
I have good news! Humans are really really good at being inconsistent, and you can leverage this to your advantage. Compartmentalize! Maintain goals that are “terminal” in one compartment, but which you know are “instrumental” in another, then simply never let those compartments touch!
This may sound completely crazy and irrational, but remember: you aren’t actually in control of your motivation system. You find yourself on a strange playing field, and the optimal move may in fact require mental contortions that make epistemic rationalists shudder.
Hopefully you never run into this particular problem (holding contradictory goals in “terminal” positions), but this illustrates that there are scenarios where compartmentalization works in your favor. Of course we’d prefer to have direct control of our motivation systems, but given that we don’t, compartmentalization is a huge asset.
Take a moment and let this sink in before moving on.
Once you realize that compartmentalization is OK, you are ready to practice my second Dark Side technique: Intentional Compartmentalization. It has many uses outside the realm of goal-hacking.
See, motivation is a fickle beast. And, as you’ll remember, your conscious choices are not directly attached to your motivation levels. You can’t just decide to be more motivated.
At least, not directly.
I’ve found that certain beliefs — beliefs which I know are wrong — can make me more productive. (On a related note, remember that religious organizations are generally more coordinated than rationalist groups.)
It turns out that, under these false beliefs, I can tap into motivational reserves that are otherwise unavailable. The only problem is, I know that these beliefs are downright false.
I’m just kidding, that’s not actually a problem. Compartmentalization to the rescue!
Here’s a couple example beliefs that I keep locked away in my mental compartments, bound up in chains. Every so often, when I need to be extra productive, I don my protective gear and enter these compartments. I never fully believe these things — not globally, at least — but I’m capable of attaining “local belief”, of acting as if I hold these beliefs. This, it turns out, is enough.
Nothing is Beyond My Grasp
We’ll start off with a tame belief, something that is soundly rooted in evidence outside of its little compartment.
I have a global belief, outside all my compartments, that nothing is beyond my grasp.
Others may understand things more easily or more quickly than I do. People smarter than me grok concepts with less effort. It may take me years to wrap my head around things that other people find trivial. However, there is no idea that a human has ever had that I cannot, in principle, grok.
I believe this with moderately high probability, just based on my own general intelligence and the fact that brains are so tightly clustered in mind-space. It may take me a hundred times the effort to understand something, but I can still understand it eventually. Even things that are beyond the grasp of a meager human mind, I will one day be able to grasp after I upgrade my brain. Even if there are limits imposed by reality, I could in principle overcome them if I had enough computing power. Given any finite idea, I could in theory become powerful enough to understand it.
This belief, itself, is not compartmentalized. What is compartmentalized is the certainty.
Inside the compartment, I believe that Nothing is Beyond My Grasp with 100% confidence. Note that this is ridiculous: there’s no such thing as 100% confidence. At least, not in my global beliefs. But inside the compartments, while we’re in la-la land, it helps to treat Nothing is Beyond My Grasp as raw, immutable fact.
You might think that it’s sufficient to believe Nothing is Beyond My Grasp with very high probability. If that’s the case, you haven’t been listening: I don’t actually believe Nothing is Beyond My Grasp with an extraordinarily high probability. I believe it with moderate probability, and then I have a compartment in which it’s a certainty.
It would be nice if I never needed to use the compartment, if I could face down technical problems, incomprehensible lingo, and the feeling of being really out of my depth with a relatively high confidence that I’m going to be able to make sense of it all. However, I’m not in direct control of my motivation. And it turns out that, through some quirk in my psychology, it’s easier to face down the oppressive feeling of being in way over my head if I have this rock-solid “belief” that Nothing is Beyond My Grasp.
This is what the compartments are good for: I don’t actually believe the things inside them, but I can still act as if I do. That ability allows me to face down challenges that would be difficult to face down otherwise.
This compartment was largely constructed with the help of The Phantom Tollbooth: it taught me that there are certain impossible tasks you can do if you think they’re possible. It’s not always enough to know that if I believe I can do a thing, then I have a higher probability of being able to do it. I get an extra boost from believing I can do anything.
You might be surprised about how much you can do when you have a mental compartment in which you are unstoppable.
My Willpower Does Not Deplete
Here’s another: My Willpower Does Not Deplete.
Ok, so my willpower actually does deplete. I’ve been writing about how it does, and discussing methods that I use to avoid depletion. Right now, I’m writing about how I’ve acknowledged the fact that my willpower does deplete.
But I have this compartment where it doesn’t.
Ego depletion is a funny thing. If you don’t believe in ego depletion, you suffer less ego depletion. This does not eliminate ego depletion.
Knowing this, I have a compartment in which My Willpower Does Not Deplete. I go there often, when I’m studying. It’s easy, I think, for one to begin to feel tired, and say “oh, this must be ego depletion, I can’t work anymore.” Whenever my brain tries to go there, I wheel this bad boy out of his cage. “Nope”, I respond, “My Willpower Does Not Deplete”.
Surprisingly, this often works. I won’t force myself to keep working, but I’m pretty good at preventing mental escape attempts via “phantom akrasia”. I don’t allow myself to invoke ego depletion or akrasia to stop being productive, because My Willpower Does Not Deplete. I have to actually be tired out, in a way that doesn’t trigger the My Willpower Does Not Deplete safeguards. This doesn’t let me keep going forever, but it prevents a lot of false alarms.
In my experience, the strong version (My Willpower Does Not Deplete) is much more effective than the weak version (My Willpower is Not Depleted Yet), even though it’s more wrong. This probably says something about my personality. Your mileage may vary. Keep in mind, though, that the effectiveness of your mental compartments may depend more on the motivational content than on degree of falsehood.
Anything is a Placebo
Placebos work even when you know they are placebos.
This is the sort of madness I’m talking about, when I say things like “you’re on a strange playing field”.
Knowing this, you can easily activate the placebo effect manually. Feeling sick? Here’s a freebie: drink more water. It will make you feel better.
No? It’s just a placebo, you say? Doesn’t matter. Tell yourself that water makes it better. Put that in a nice little compartment, save it for later. It doesn’t matter that you know what you’re doing: your brain is easily fooled.
Want to be more productive, be healthier, and exercise more effectively? Try using Anything is a Placebo! Pick something trivial and non-harmful and tell yourself that it helps you perform better. Put the belief in a compartment in which you act as if you believe the thing. Cognitive dissonance doesn’t matter! Your brain is great at ignoring cognitive dissonance. You can “know” you’re wrong in the global case, while “believing” you’re right locally.
For bonus points, try combining objectives. Are you constantly underhydrated? Try believing that drinking more water makes you more alert!
Brains are weird.
Truly, these are the Dark Arts of instrumental rationality. Epistemic rationalists recoil in horror as I advocate intentionally cultivating false beliefs. It goes without saying that you should use this technique with care. Remember to always audit your compartmentalized beliefs through the lens of your actual beliefs, and be very careful not to let incorrect beliefs leak out of their compartments.
If you think you can achieve similar benefits without “fooling yourself”, then by all means, do so. I haven’t been able to find effective alternatives. Brains have been honing compartmentalization techniques for eons, so I figure I might as well re-use the hardware.
It’s important to reiterate that these techniques are necessary because you’re not actually in control of your own motivation. Sometimes, incorrect beliefs make you more motivated. Intentionally cultivating incorrect beliefs is surely a path to the Dark Side: compartmentalization only mitigates the damage. If you make sure you segregate the bad beliefs and acknowledge them for what they are then you can get much of the benefit without paying the cost, but there is still a cost, and the currency is cognitive dissonance.
At this point, you should be mildly uncomfortable. After all, I’m advocating something which is completely epistemically irrational. We’re not done yet, though.
I have one more Dark Side technique, and it’s worse.
Willful Inconsistency
I use Intentional Compartmentalization to “locally believe” things that I don’t “globally believe”, in cases where the local belief makes me more productive. In this case, the beliefs in the compartments are things that I tell myself. They’re like mantras that I repeat in my head, at the System 2 level. System 1 is fragmented and compartmentalized, and happily obliges.
Willful Inconsistency is the grown-up, scary version of Intentional Compartmentalization. It involves convincing System 1 wholly and entirely of something that System 2 does not actually believe. There’s no compartmentalization and no fragmentation. There’s nowhere to shove the incorrect belief when you’re done with it. It’s taken over the intuition, and it’s always on. Willful Inconsistency is about having gut-level intuitive beliefs that you explicitly disavow.
Your intuitions run the show whenever you’re not paying attention, so if you’re willfully inconsistent then you’re going to actually act as if these incorrect beliefs are true in your day-to-day life, unless you forcibly override your default actions. Ego depletion and distraction make you vulnerable to yourself.
Use this technique with caution.
This may seem insane even to those of you who took the previous suggestions in stride. That you must sometimes alter your terminal goals is a feature of the playing field, not the agent. The fact that you are not in direct control of your motivation system readily implies that tricking yourself is useful, and compartmentalization is an obvious way to mitigate the damage.
But why would anyone ever try to convince themselves, deep down at the core, of something that they don’t actually believe?
The answer is simple: specialization.
To illustrate, let me explain how I use willful inconsistency.
I have invoked Willful Inconsistency on only two occasions, and they were similar in nature. Only one instance of Willful Inconsistency is currently active, and it works like this:
I have completely and totally convinced my intuitions that unfriendly AI is a problem. A big problem. System 1 operates under the assumption that UFAI will come to pass in the next twenty years with very high probability.
You can imagine how this is somewhat motivating.
On the conscious level, within System 2, I’m much less certain. I solidly believe that UFAI is a big problem, and that it’s the problem that I should be focusing my efforts on. However, my error bars are far wider and my expected timeframe is much broader. I acknowledge a decent probability of soft takeoff. I assign moderate probabilities to a number of other existential threats. I think there are a large number of unknown unknowns, and there’s a non-zero chance that the status quo continues until I die (and that I can’t later be brought back). All this I know.
But, right now, as I type this, my intuition is screaming at me that the above is all wrong, that my error bars are narrow, and that I don’t actually expect the status quo to continue for even thirty years.
This is just how I like things.
See, I am convinced that building a friendly AI is the most important problem for me to be working on, even though there is a very real chance that MIRI’s research won’t turn out to be crucial. Perhaps other existential risks will get to us first. Perhaps we’ll get brain uploads and Robin Hanson’s emulation economy. Perhaps it’s going to take far longer than expected to crack general intelligence. However, after much reflection I have concluded that despite the uncertainty, this is where I should focus my efforts.
The problem is, it’s hard to translate that decision down to System 1.
Consider a toy scenario, where there are ten problems in the world. Imagine that, in the face of uncertainty and diminishing returns from research effort, I have concluded that the world should allocate 30% of resources to problem A, 25% to problem B, 10% to problem C, and 5% to each of the remaining problems.
Because specialization leads to massive benefits, it’s much more effective to dedicate 30% of researchers to working on problem A rather than having all researchers dedicate 30% of their time to problem A. So presume that, in light of these conclusions, I decide to dedicate myself to problem A.
Here we have a problem: I’m supposed to specialize in problem A, but at the intuitive level problem A isn’t that big a deal. It’s only 30% of the problem space, after all, and it’s not really that much worse than problem B.
This would be no issue if I were in control of my own motivation system: I could put the blinders on and focus on problem A, crank the motivation knob to maximum, and trust everyone else to focus on the other problems and do their part.
But I’m not in control of my motivation system. If my intuitions know that there are a number of other similarly worthy problems that I’m ignoring, if they are distracted by other issues of similar scope, then I’m tempted to work on everything at once. This is bad, because output is maximized if we all specialize.
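One way to see the force of the specialization point is a toy model in which every researcher pays a fixed weekly overhead (ramp-up, context switching, staying current) for each problem they touch. The numbers below (100 researchers, 40-hour weeks, 3 hours of overhead per problem) are assumptions for illustration only.

```python
# Toy model: a fixed per-problem overhead makes "everyone splits their time"
# much less productive than "whole researchers are assigned per the allocation".

allocation = {"A": 0.30, "B": 0.25, "C": 0.10, **{p: 0.05 for p in "DEFGHIJ"}}
researchers, hours_per_week, overhead_per_problem = 100, 40, 3

def productive_hours(problems_touched):
    """Hours left for real work after paying overhead on each problem touched."""
    return hours_per_week - overhead_per_problem * problems_touched

# Every researcher splits their week across all ten problems per the allocation:
generalist_total = researchers * productive_hours(len(allocation))  # 100 * 10 = 1000

# Researchers are assigned whole to single problems, in proportion to the allocation:
specialist_total = researchers * productive_hours(1)                # 100 * 37 = 3700

print(generalist_total, specialist_total)
```

Under these assumptions the specialist arrangement preserves the same 30/25/10/5 split of effort while nearly quadrupling total productive hours, which is the sense in which output is maximized by specializing.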
Things get especially bad when problem A is highly uncertain and unlikely to affect people for decades if not centuries. It’s very hard to convince the monkey brain to care about far-future vagaries, even if I’ve rationally concluded that those are where I should dedicate my resources.
I find myself on a strange playing field, where the optimal move is to lie to System 1.
Allow me to make that more concrete:
I’m much more motivated to do FAI research when I’m intuitively convinced that we have a hard 15 year timer until UFAI.
Explicitly, I believe UFAI is one possibility among many and that the timeframe should be measured in decades rather than years. I’ve concluded that it is my most pressing concern, but I don’t actually believe we have a hard 15 year countdown.
That said, it’s hard to overstate how useful it is to have a gut-level feeling that there’s a short, hard timeline. This “knowledge” pushes the monkey brain to go all out, no holds barred. In other words, this is the method by which I convince myself to actually specialize.
This is how I convince myself to deploy every available resource, to attack the problem as if the stakes were incredibly high. Because the stakes are incredibly high, and I do need to deploy every available resource, even if we don’t have a hard 15 year timer.
In other words, Willful Inconsistency is the technique I use to force my intuition to feel as if the stakes are as high as I’ve calculated them to be, given that my monkey brain is bad at responding to uncertain vague future problems. Willful Inconsistency is my counter to Scope Insensitivity: my intuition has difficulty believing the results when I do the multiplication, so I lie to it until it acts with appropriate vigor.
This is the final secret weapon in my motivational arsenal.
I don’t personally recommend that you try this technique. It can have harsh side effects, including feelings of guilt, intense stress, and massive amounts of cognitive dissonance. I’m able to do this in large part because I’m in a very good headspace. I went into this with full knowledge of what I was doing, and I am confident that I can back out (and actually correct my intuitions) if the need arises.
That said, I’ve found that cultivating a gut-level feeling that what you’re doing must be done, and must be done quickly, is an extraordinarily good motivator. It’s such a strong motivator that I seldom explicitly acknowledge it. I don’t need to mentally invoke “we have to study or the world ends”. Rather, this knowledge lingers in the background. It’s not a mantra, it’s not something that I repeat and wear thin. Instead, it’s this gut-level drive that sits underneath it all, that makes me strive to go faster unless I explicitly try to slow down.
This monkey-brain tunnel vision, combined with a long habit of productivity, is what keeps me Moving Towards the Goal.
Those are my Dark Side techniques: Willful Inconsistency, Intentional Compartmentalization, and Terminal Goal Modification.
I expect that these techniques will be rather controversial. If I may be so bold, I recommend that discussion focus on goal-hacking and intentional compartmentalization. I acknowledge that willful inconsistency is unhealthy and I don’t generally recommend that others try it. By contrast, both goal-hacking and intentional compartmentalization are quite sane and, indeed, instrumentally rational.
These are certainly not techniques that I would recommend CFAR teach to newcomers, and I remind you that “it is dangerous to be half a rationalist”. You can royally screw yourself over if you’re still figuring out your beliefs as you attempt to compartmentalize false beliefs. I recommend only using them when you’re sure of what your goals are and confident about the borders between your actual beliefs and your intentionally false “beliefs”.
It may be surprising that changing terminal goals can be an optimal strategy, and that humans should consider adopting incorrect beliefs strategically. At the least, I encourage you to remember that there are no absolutely rational actions.
Modifying your own goals and cultivating false beliefs are useful because our minds are strange, hampered control systems. Your brain was optimized with no concern for truth, and optimal performance may require self-deception. I remind the uncomfortable that instrumental rationality is not about being the most consistent or the most correct, it’s about winning. There are games where the optimal move requires adopting false beliefs, and if you find yourself playing one of those games, then you should adopt false beliefs. Instrumental rationality and epistemic rationality can be pitted against each other.
We are fortunate, as humans, to be skilled at compartmentalization: this helps us work around our mental handicaps without sacrificing epistemic rationality. Of course, we’d rather not have the mental handicaps in the first place: but you have to work with what you’re given.
We are weird agents without full control of our own minds. We lack direct control over important aspects of ourselves. For that reason, it’s often necessary to take actions that may seem contradictory, crazy, or downright irrational.
Just remember this, before you condemn these techniques: optimality is as much an aspect of the playing field as of the strategy, and humans occupy a strange playing field indeed.
An example from real life: DefectBot = God, TrollBots = your religious neighbors. God does not reward you for your prayers, but your neighbors may punish you socially for lack of trying. You defect against your neighbors by secretly being a member of an atheist community, and generally by not punishing other nonbelievers.
I wonder what techniques we could use to make the compartmentalization stronger, yet easy to turn off when it’s no longer needed. Clear boundaries. A possible solution would be to use the different set of beliefs only while wearing a silly hat. Not literally silly, because I might want to use it in public without handicapping myself. But some environmental reminder. An amulet, perhaps?
When I was being raised an Orthodox Jew, there were several talismans that served essentially this purpose (though of course my rabbis would not have phrased it that way).
she who wears the magic bracelet of future-self delegation http://i.imgur.com/5Bfq4we.png prefers to do as she is ordered
Amusingly enough, the example of TrollBot that came to mind was the God expounded on in many parts of the New Testament, who will punish you iff you do not unconditionally cooperate with others, including your oppressors.
A lucky rabbit’s foot?
I probably couldn’t stop noticing that it was in fact one unlucky rabbit.
A rabbit’s footprint, maybe.
Well yeah, you’re retroactively stealing his luck.
Not recommended with a Rabbi’s foot, either.
One major example of a situation where you’ll want to hack your terminal goals is if you’re single and want to get into a relationship: you’re far more likely to succeed if you genuinely enjoy the company of members-of-your-preferred-sex even when you don’t think that it will lead to anything.
Agreed. Also helpful is if the parts of you with close access to e.g. your posture and voice tone have an unshakable belief in your dominance within the tribe, and your irresistible sex appeal. In fact social interaction in general is the best example of somewhere that dark side rationality is helpful.
This is the best article on LessWrong in some time. I think it should at least be considered for entry into the sequences. It raises some extremely important challenges to the general ethos around here.
Hmmm. I suspect it depends on the circumstances. If you are a (male) pirate and you want a pirate girlfriend, but the only females in your immediate surroundings are ninjas, you will not find their constant discussions of shurikens and stealthy assassinations enjoyable, when you only want to talk about muskets and ship-boardings.
I think that’s exactly the situation the parent was talking about; it’s easier to self-modify to take an interest in shuriken and stealthy assassinations than to convince others to do the converse, or to form a relationship when you have no common interesting conversations.
I have to disagree here. It is very hard to self-modify.
The difficulty of self-modification depends on the part being modified, the degree of modification, and one’s attitudes towards the change. Some self-modifications are easy, others are impossible.
All right, let’s get back to the real world. We were talking about romantic relations.
It is unlikely that a person who likes classical music and computer science will be able to self-modify into a person who likes heavy metal and stealing cars.
Can’t speak for stealing cars, but there’s more overlap between classical and metal fans than you might think; there exists a subgenre of neoclassical heavy metal, even.
Also, since cars are now quite integrated with computers this person might have lots of fun stealing them. And if ze watches Breaking Bad there’s a whole lot of inspiration there for intellectuals looking to turn to a life of blue-collar crime.
Maybe I should be steel-manning Locaha’s argument but my point is I don’t think the limits of this sort of self-mod are well understood, so it’s premature to declare which mods are or aren’t “real world”.
I think the problem here is one of motivation to self-modify. For example, it’s one thing to want to self-modify to like spicy food (possible, but potentially unpleasant) or become bisexual (possible for some, probably, but not for others), but other self-modifications are less desirable—for example, I wouldn’t want to be more interested in “normal people” even if it would increase the number of people with whom I’d have relationships.
I meant “interest” in the sense of “enjoy interaction with” or “enjoy a relationship with”.
I’d expect most of the gains from becoming more interested in “normal people” to come from the side effect of improving your emotional rapport with such people, not limited to those you might be interested in dating, not from the direct effect of increasing your supply of potential relationships.
Then go somewhere else. Duh. :-)
This makes it sound trivial. Would you consider it trivial for someone ship-wrecked on a Ninja Island with a peg-leg and various other piraty injuries?
You can’t. You need money to repair your ship, and only ninjas hire in this economy...
Then, unfortunately, you must compartmentalise, wear a mask, do whatever makes shurikens endlessly fascinating for you until you (make money to) get your ship fixed. Then set sail, and cast away the mask.
You don’t actually have to talk with the ninja girls. There is no requirement to have the same hobbies as your co-workers.
It helped me very much to pursue “have (and enjoy) a date” as my goal instead of “find a relationship”.
If you don’t enjoy the company of members-of-your-preferred-sex, what do you want a relationship for (that you couldn’t also get from one-night stands or even prostitutes), anyway?
(Possibility of having children in the future?)
You don’t enjoy company of most members-of-your-preferred-sex, but are hopeful that there are people out there that you could spend your life with. The problem is that finding them is painful, because you have to spend time with people whose company you won’t enjoy during the search.
By hacking yourself to enjoy their company you make the search actually pleasant. Though hopefully your final criteria do not change.
“We understand how dangerous a mask can be. We all become what we pretend to be.” ― Patrick Rothfuss, The Name of the Wind
“No man, for any considerable period, can wear one face to himself and another to the multitude, without finally getting bewildered as to which may be the true.” ― Nathaniel Hawthorne, The Scarlet Letter
I don’t think that changing your preferences is the same thing as wearing a mask.
If you want to increase your enjoyment of the company of women you might start to do gratitude journaling where you write down everything enjoyable that happens while you are in the company with women.
You could also do another CBT intervention of finding mental distortions. You might find that you are frustrated in the company of women because they frequently do stuff you don’t consider rational*. Under the terms of The Feeling Good Handbook, that reflects that you suffer under the mental distortion of using “should statements”.
You have expectations of how they are supposed to behave. You could switch those expectations against a more realistic expectation. Women are complex systems and studying how the systems work is an interesting activity.
I think most of the people who speak about hacking enjoyment don’t think about what practical steps it entails.
Disclaimer: *I’m not saying that women are irrational. I’m saying that there are guys who would be more happy if they would let go of a few expectations of how other people should act and switch to become curious in using their intelligence to discover how other people act.
I recall in the white lies thread a discussion about women lying to men who ask them out. I remember when my friend was lied to by a girl. She said she liked someone else but didn’t; she wanted to let him down easy. He was quite upset about being lied to, and I thought he was unequivocally right.
Later I discovered and pondered the perspective of women: many are trained to avoid upsetting men, or fear possible retaliation for a blunt refusal, plus there’s guess culture. He wouldn’t have done something bad based on her direct refusal, and she wasn’t really from a social place where she had likely encountered violence because of such a refusal, and I know this because I went to school with her all the way from kindergarten. So I attribute it to social conditioning.
Although this is a less persuasive argument than safety, I decided that not having a problem with this action would benefit me in interacting with women, although personally I was never in that situation. What has making this change cost me? Nothing. But it, and many other updates, have allowed me the chance to be less bitter about women should I encounter these circumstances.
As far as your quotes go, yes, deciding to believe this resulted in it becoming a true belief over time. I looked at my terminal goal, and decided to pretend things that made it more likely. Sure those beliefs are now true beliefs of mine, but so what? Has that hurt me somehow?
You could argue the opposite: if you expose yourself indiscriminately to people who don’t share your values, they’ll have a better chance to change them. I think I operate under this assumption. Most people wear some kinds of masks in various situations, and I think some people who insist they shouldn’t just lack basic skills in deception and lie detection. I’m not implying people are more malicious than some people expect, I’m implying deception is generally thought of as a lesser evil than some people think.
If we talk about really hacking your preferences on some deep level, I agree with the danger of unintentionally becoming someone else.
Don’t Believe You’ll Self-Deceive
You seem to be assuming a model where one can only meet a potential mate selected at random from the population, and one’d need to spend a lot of time with her before being allowed to rule her out.
Hm, somewhat, yes. What do you believe?
I mean it’s not purely at random, of course, but surely you need to go out and meet a lot of people.
Well, not necessarily. ;-)
Social status
Someone could e.g. enjoy the company, but be so desperate for a relationship with intimacy that any non-intimate interaction with potentially interesting people would be a painful reminder of what they were missing out on. (By “intimacy”, I don’t mean just sex, but also things like trust and a permission to be open with your feelings, hold hands, kiss in a way that mainly denotes affection rather than desire, etc.)
I see.
I assumed you meant e.g. men who like to act misogynistically when among other men and find it effortful to not do so around women.
It probably helps to have both goals in some degree. I’ve actually made efforts to move my goals in the direction opposite to the one you suggest.
If you don’t enjoy the company of members of your preferred sex when you don’t think it would lead to anything, a relationship is probably not for you, anyway.
Anyone who knows me knows that I’m quite familiar with the dark arts. I’ve even used hypnosis to con Christians into Atheism a half dozen times. The tempting idea is that dark arts can be used for good—and that the ends justify the means. I’ve since changed my mind.
The thing is, even though I don’t advocate dark arts for persuasion let alone rationality, I almost entirely agree with the actions you advocate. I just disagree strongly with the frame through which you look at them.
For example, I am heavily into what you call “changing terminal goals”; however, I disagree that I’m changing terminal goals. If I recognize that pursuing instrumental goal A for the sake of “terminal” goal B is the best way to achieve goal B, I’ll self-modify in the way you describe. I’ll also do that thing you frame as “being inconsistent”, where I make sure to notice if chasing goal A is no longer the best way to achieve goal B, and if so I self-modify to stop chasing goal A. If you make sure to remember that step, goals are not sticky. You chase goal A “for its own sake” iff it is the best way to achieve goal B. That’s what instrumental goals are.
The way I see it, the difference in motivation comes not from “terminal vs instrumental”, but from how you’re focusing your attention. In what you call “instrumental” mode, you aren’t focusing solely on your instrumental goal. You’re trying to work on your instrumental goal while you keep glancing over at your terminal goal. That’s distracting, and of course it doesn’t work well. If it’s a long term goal of course you don’t see immediate improvements—and so of course you lose motivation. What you call “hacking my goals to be terminal” I call “realizing at a gut level that in order to get what I want, I need to focus on this instrumental goal without expecting immediate results on my terminal goal”
But there are also downsides to allowing yourself to “fool yourself”. In particular, through that frame, the thought is “it’s false, but so what? It’s useful!”. That stops curiosity dead when you should be asking the question “if it’s false, why is it so useful? Where’s the mutual information that allows it to function as a control system?” and “what true beliefs do better?”.
For example, your “nothing is beyond my grasp” belief. It’s empowering, sure. Just because you recognize that it isn’t technically true doesn’t mean you should deprive yourself of that empowerment—of course. However, lying isn’t necessary for that empowerment. The problem isn’t that you believe you’re defeat-able. The problem is that you fear failure. So instead of focusing on the task at hand, you keep glancing over at the possibility of failure when you should be keeping your eyes on the road. One of the big take home lessons from studying hypnotism is that It’s always about direction of attention. Strip away the frames and motivations and look at where the attention is.
My version of your empowering belief is (to try to crudely translate into words) “I want to succeed. I might not, and if I don’t, it will be truly disappointing. And that’s okay. And even though I might fail, I might not and that would be truly amazing. So I’m going to throw myself at it without looking back”. And my version is better. My version is more stable under assault.
My wrestling coach would spout the cliché “If you can’t believe you’ll win, you won’t!”. If I had bought into that, the moment reality slapped me in the face I’d have lost my grasp on the delusion and crumbled. Instead, I laughed at the idea. I went into matches already accepting defeat and focusing on winning anyway—and it allowed me to win a few matches that no one thought I could possibly win.
To each their own.
This may be true for small subgoals, but I feel it’s difficult for large goals. Consider learning to program. In my experience, it is much easier to become a good programmer if you actually love programming. Even if you successfully choose to focus on programming and manage not to be distracted by your “real” goals, the scheduler acts differently if you’ve decided to program versus if you love programming. The difference is in the details, like how you’ll mentally debug a project you’re working on while riding the bus, or scribble ideas in a notebook while in class, things that the scheduler wouldn’t even consider if you’ve shifted your focus but haven’t actually made programming an end unto itself.
If you can achieve the same level of commitment merely by shifting your focus, more power to you. In my experience, there is an extra boost I get from a task being an end in its own right.
That said, as I mentioned in the post, I seldom use terminal-goal-modification myself. Part of the point of that section was to remind people that even if they don’t personally like goal-hacking, there are games in which the optimal move involves changing your actual terminal goals, for any definition of “terminal goals”.
I can see how this may ring true for you, but it does not ring true for me. “Nothing is Beyond My Grasp” has very little to do with fear of failure. I’m glad your particular mantra works for you, but I don’t think it would help me tap into the reserves available in the compartment. In day-to-day life, I have a non-compartmentalized belief that is similar to your mantra, but it’s not related to the compartmentalized belief.
“Nothing is Beyond My Grasp” is useful because it puts me in a frame of mind that is otherwise difficult to access, in which I have additional reserves of motivation. It’s a bit hard to describe in words, but… it’s sort of like context-switching, it’s sort of like a credible precommitment, it’s sort of like stubbornness, and it’s sort of like a word of encouragement from a close friend. It feels kind of like those things. (Describing discontinuous unfounded mental states is hard.)
It’s not that I’m suffering from a “fear of failure” or a “lack of focus”, it’s that there’s a set of parameters under which the monkey brain has a particular flavor of performance, and those parameters were baked in long before humans realized how big the world is and how nothing is a certainty. The compartment is sort of like a way to QuickLoad that mental state.
If you can get into that headspace without fooling yourself, more power to you. Personally, I access it via a mental compartment with false beliefs.
That’s the part I agree with.
I’m not a full blown programmer, but I have loved programming to the point of working on it for long stretches and losing sleep because I was too drawn to it to let my mind rest. I still call that kind of thing (and even more serious love for programming) “instrumental”
It’s hard to describe in a single comment, but it’s not the same as just “Hmmm… You make a good point. I guess I should focus on the instrumental goal”. It’s not a conscious decision to willpower some focus. I’m talking about the same process you are when you “switch an instrumental goal to terminal”. It has all the same “qualia” associated with it.
It’s just that I don’t agree with calling “that thing I do because I love doing it” a “terminal goal”
When I look back at the things I have loved doing, they all contributed to more human-general terminal goals. I didn’t always realize this at the time (and at the time I very well might have described it as “terminal”), but in retrospect it all adds up. And when something that I loved doing stopped contributing to the actual terminal goal, I would lose interest. The only difference is that now I’m more aware of whats going on so it’s easier to notice which things I ought to be interested in “for their own sake”.
I don’t think it’s always obvious when something has to do with “fear of failure”. Introspection illusion and all. I’m actually talking about one of the more subtle and harder-to-find flavors of “fear of failure”.
Yes, it’s absolutely about the frame of mind that it puts you in, and it’s difficult to describe in words. The words aren’t doing the heavy lifting, and the feelings just don’t translate that well to english. I don’t even have a “mantra”—I have a mindset that is hard to convey but those words seem to point in roughly the right direction. It definitely includes ferocious stubbornness and feels like I’m being cheered on.
And I certainly don’t expect my words to get you there right off the bat.
And if that’s how you know how to do it, then by all means keep making use of it.
I just think you should open your mind to alternatives in the meantime. Your argument seems to be “we are godshatter, therefore no clean solution exists, therefore we should not bother looking for a clean solution when thinking about how to use our brain”. My response is “yeah kinda, I can see why you’d suspect this, but there’s no way you should be certain enough of that to not keep your eyes open for a clean solution” and “BTW, I claim to have a clean solution, and while I can’t hand it to you on a silver platter, I can wave my hands in the general direction. The clean solution comes with goodies like better ‘hacks’, more frequent and better aimed use, and an idea of the direction of progress”.
And I have to emphasize that I do mean “open your mind to that alternative” not “take that alternative” because it’s not possible to pick up a whole new worldview over a LW comment. I’m not looking for a “Oh! He’s totally right!”. I’m looking for a “Hmmm.....”
Here’s an excellent essay from another LWer on the same sort of perspective about how So8res-”terminal”/jimmy-”instrumental” goals are chosen instrumentally as part of finding your niche.
That’s fine. I think we generally agree and are debating terminology. The phrase seems rather dichotomous. I acknowledged the dichotomy in the post and tried to make my intended meaning explicit (see the “Mind the terminology” section), but I’m not too surprised that there’s a communicational gap here.
I think you’re reading more into my post than is there.
I disapprove of this interpretation of my argument. If the aesthetics of my solutions do not appeal to you, then by all means, search for “cleaner” solutions. I don’t know where you got the impression that I’m closed to your suggestions.
First of all, I’m sorry if anything I said came off as confrontational. I like the post, think it makes important points, and I upvoted it. :)
I agree that we generally agree and are debating terminology. I also agree that you are aware of what you mean when you say “terminal”.
I also think that the differing terminology and associated framing leads to important higher order differences. Do you disagree?
Guilty as charged. My disagreement is only with 1) (what appeared to me to be) the underlying frame and 2) specific actionable differences that result from the different frames
Since the pieces of the underlying frame that I disagree with are unspoken and you haven’t taken a stance on the specific differences, I can’t make my point without reading more into the post than is explicitly there. What is your take on the “do the same thing without Dark” approach?
I don’t actually think you’re closed to my suggestions. No accusations of irrationality here—sorry if it came across otherwise!
I just meant that I think it is very valuable and easily worth a spot on your radar
No problem. I’m also aiming for a non-confrontational tone, that’s sometimes difficult in text.
I don’t know. I haven’t pinpointed the higher order differences that you’re trying to articulate.
I do stand by my point that regardless of your definition of “terminal goal”, I can construct a game in which the optimal move is to change them. I readily admit that under certain definitions of “terminal goal” such games are uncommon.
If it’s the branding that’s annoying you, see this comment—it seems my idea of what qualifies as “dark arts” may differ from the consensus.
I’m not entirely sure what you mean by getting the same effects without the “darkness”. I am quite confident that there are mental states you can only access via first-order self-deception, and that it is instrumentally rational to do so. Michael Blume provides another crisp example of this. I am skeptical that there are ways to attain these gains without self-deception.
Agreed.
Although you do explicitly define “dark arts” differently, that doesn’t really change my issues with the branding. I hope the next part of the comment will explain why (well, that and the objections other people have raised)
That link goes to your previous comment instead of the Michael Blume example. Perhaps you mean his Othello role?
I don’t think he did anything sketchy there. Since the explicit goal is to pretend to be someone he’s not in a well-defined context, this is a fairly perverse game which makes it nice and easy to cleanly compartmentalize. In fact, everything said in character could be prefaced with “Iago would say” and it wouldn’t even be lying. I’d put this one in the “not really lying because every part of him knows what he’s doing” category. There isn’t a resisting “but it’s not real!” because that’s kinda the point. While it’s obviously an actual situation he was in, I think most cases aren’t this easy.
The other application he mentioned (acting confident when picking up an attractive woman) is more representative of the typical case and trickier to do right. Say you read a couple posts on LW about how it’s okay to deceive the parts of your monkey brain that are getting in your way—and confidence with women is explicitly mentioned as a good time to do it. So you self-deceive into thinking that you’re super attractive and whatnot, without thinking too much about the risks.
Now, what if “confidence” isn’t your only problem? If you were lacking social intelligence/skills before, you’re still lacking them when you’re playing “confident man”—only now you’re going to ignore the rational uncertainty over how a certain social move will be received. This means you do things that are socially miscalibrated and end up being the creepy guy. And since “I’m creeping this girl out” is incongruent with “I’m the attractive guy that all women want”, you either keep plowing ahead or dismiss the rejection as “her loss!”. Either way your behavior is not good, and furthermore you’re giving up the chance to analyze your feedback and actually develop your social skills.
And of course, that would be stupid. People like MBlume know better than to disappear down this rabbit hole. But plenty of people actually do fall down that hole (hence the stink around “PUA”)
It doesn’t have to be that blatant though. Even if you know to snap out of it and analyze the feedback when you get a “back off, creep”, there are going to be more subtle signs that you don’t pick up on because you’re playing confident—heck, there are plenty of subtle signs that people miss just because they’re subtle. I’ve seen a therapist miss these signs badly and go on to advertise the demo on YouTube as a successful provocative therapy session—and this is a guy who trains people in provocative therapy! I don’t want to make it any harder for myself to notice when I’m screwing up.
To give a real-life example that actually happened to me/someone I know, I taught self-hypnosis to a friend and she ended up spraining her ankle. Since she doesn’t have the heuristic to be very, very cautious with dark arts, she used self-hypnosis to numb the pain. I consider that to be equivalent to compartmentalizing the belief “My ankle isn’t sprained!” because the end state is the same. Once it didn’t hurt anymore, she brilliantly decided to keep running on it… aaaand she ended up regretting that decision.
Since I do have the heuristic to be very, very hesitant to use dark arts, when I sprained my foot… okay, to be honest, I kept running on it too because I’m a stubborn idiot, but I did it despite the pain, and if it had hurt more I would have stopped. When I decided to do something about the pain I was in, I wanted to take the “clean” and “not dark” approach, so I did my thing where I (again, to give crude and insufficient English pointers) “listen to what the pain has to say”. It completely got rid of the suffering (I could still feel the pain sensations, but they weren’t bothersome in the least and didn’t demand attention. Quite trippy, actually).
But the method I used comes with some caveats. The pain said “Are you sure you weren’t doing something you shouldn’t have been?”, and after thinking about it I was able to decide that I wasn’t. The pain wanted to make sure I took care of myself, and once I agreed to that, there was no more reason to suffer. It wouldn’t have worked if I had tried to avoid realizing that I shouldn’t have been taking that risk in the first place. It would cease to work the minute I try running on it again. These are nice features :)
The basic idea behind the cleaner way is that all your fears and motivations and the like are the result of nonverbal implicit beliefs. These implicit beliefs may or may not agree with your explicit beliefs, and you may or may not be aware of them. (Empirically, they often have useful information that your explicit beliefs don’t, btw). So what you do is to find out where your implicit beliefs are causing issues, what the beliefs actually say, and if they’re right or not. If they’re right, figure out what you want to do about it. If they’re wrong, change them. This is basically coherence therapy
If you were to take a clean approach in the “confidence with women” situation, you’d probably find that some of the things you were too afraid to do are things you probably shouldn’t be doing, while others are easily worth the risk. Fear in the former category feels right—like a fear of picking a fight with Mike Tyson—you just don’t do it and everything is cool. In the latter category it’ll turn to excitement (which you can change cleanly if it’s an issue). Since you’re aware that it might not go well and you’ve accepted that possibility, you don’t have to fear it. Awareness without fear allows you to look hard for things you’re doing wrong without coming off as “not confident”.
The other downside of the dark approach is that if you have incomplete compartmentalization (which can be good to avoid the first problem), you can have this nagging “but I’m lying to myself!” thought which can be distracting. And if reality smacks you in the face, you’re forced to drop your lie and you’re stuck with the maladaptive behaviors you were trying to avoid. When done cleanly you’re already prepared for things to go poorly so you can respond effectively.
Fixed, thanks.
I must admit that I’m somewhat confused, here. I make no claims that the described practices are safe, and in fact I make a number of explicit disclaimers stating that they are not safe. It is dangerous to be half a rationalist, and I readily admit that these tools will bite you hard if you misuse them. This, in fact, is something that I assumed was captured by the “Dark Arts” label. I continue to be baffled by how some people complain about the label, others complain about the danger, and still others complain about both at once.
I completely agree that you shouldn’t go around compartmentalizing at every opportunity, and that you should have a deep understanding of the problem at hand before doing any So8res!DarkArts. Prefer other methods, where possible.
I get the impression that my mental model of when self-deception is optimal differs from your own. I don’t have time to try to converge these models right now, but suffice it to say that your arguments are not addressing the divergence point.
Regardless, I think we can both agree that self-deception is optimal sometimes, under controlled scenarios in which the agent has a strong understanding of the situation. I think we also agree that such things are dangerous and should be approached with care. All this, I tried to capture with the “Dark Arts” label—I am sorry if that did not make it across the communication gap.
I don’t mean to imply that we disagree or that you didn’t put a big enough disclaimer.
I was trying to highlight the differences between what happens when you allow yourself to use the “sometimes Dark Arts is the way to go” frame and what happens under the “instead of using Dark Arts, I will study them until I can separate the active ingredient from the Dark” frame, and one of the big differences is the dangers of Dark Arts.
:) “active ingredients aren’t dark”+”inactive ingredients are”
Fair enough
I’ll agree with that in a weak sense, but not in stronger senses.
I’ve never recognised a more effective psychonaut than you. You’ve probably seen further than I, so I’d appreciate your opinion on a hypo I’ve been nursing.
You see the way pain reacts to your thoughts. If you respect its qualia and find a way to embrace them, that big semi-cognisant iceberg of You, the Subconscious, will take notice, and it will get out of your way, afford you a little more self-control, a little less carrot and stick, a little less confusion, and bring you a little closer to some rarely attained level of adulthood.
I suspect that every part of the subconscious can be made to yield in the same way. I think introspective gains are self-accelerating: you don’t just get insights and articulations, you get general introspection skills. I seem to have lost hold of it for now, but I once had what seemed to be an ability to take any vague emotional percept and unravel it into an effective semantic ordinance. It was awesome. I wish I’d been more opportunistic with it.
I get the impression you don’t share my enthusiasm for the prospect of developing a culture supportive of deep subconscious integration, or illumination, or whatever you want to call it. What have you seen? Found a hard developmental limit? Or, this is fairly cryptic, do tell me if this makes no sense, but are you hostile to the idea of letting your shadow take you by the hand and ferry you over the is-ought divide? I suspect that the place it would take you is not so bad. I think any alternative you might claim to have is bound to turn out to be nothing but a twisted reflection of its territories.
Without involving Omega-like agents? In a realistic setting?
Please write your own article. This is worthy content, but thousand-word comments are an awful medium.
I am not sure about that. In any case some ‘hack’ is required. Merely knowing the right thing to do is not sufficient.
More broadly, the LW community has not really come to terms with the deep emotionality of the human brain. Yes we pay lip service to cognitive biases etc, but the assumption is more or less that if we gain rationality knowledge and skills we will be more rational.
This assumes our rational brain is in charge. In reality we are a fairly intelligent cognitive brain (average IQ on LW high 130s), with various well known limitations, tightly coupled to an emotional four-year-old. And that four-year-old can make us not see or notice what it doesn’t want us to know. It can make us forget things when it wants. It can control what we pay attention to and what we ignore. It can direct us to come up with elegant rationalisations for whatever it wants us to do.
Take something as simple as weight loss. The solution for most people [not everyone] is actually very simple—eat fewer calories of nutrient-rich food and exercise moderately. Track your weight and adjust the amount of food and exercise accordingly. Many really smart people struggle to execute this.
So you try to lose weight and what happens? You suddenly realize that you “forgot” your diet and ate a whole lot of food. Or you “realize” that when you try to lose weight you cannot concentrate. Also how irritable you get. Etc. My weight loss breakthrough occurred when I realized that a lot of the “problems” that happened when on weight loss diets were manufactured, i.e. manipulation by my unconscious so it would get its food. As far as it knew, I was in a famine and needed urgently to get some food, as much as possible actually. I needed to come to an accommodation with my unconscious and get it on board.
My commitment to it is a) I will stop dieting when I get to a healthy weight, b) I will eat a nutrient-rich diet, and c) I will reward myself for losing weight in various ways, such as taking nootropics that I cannot take when very overweight due to blood pressure side effects, etc. So I think we are in it together now.
I think we as LWers need to take on board the whole issue of our tightly coupled unconscious and deal with it in a systematic way. The top post and above posts here are a start. I hope there will be a lot more to come.
It is interesting reading this whole thread how many of the gimmicks and tricks that we are talking about have a very long history. I wonder how much apparent irrationality is actually one of these gimmicks. They are irrational when looked at on their own, but in the context of the makeup of our brains they may make perfect sense.
Only tangentially related: do you know of anyone applying hypnotism to helping people recover from traumatic brain damage?
When I was recovering from my stroke, the basic lesson that it’s all about attention was absolutely critical (as was the lesson that failure is OK, as you mention) and I developed a lot of little useful techniques for focusing my attention accordingly, but (embarrassingly) it had never occurred to me until this moment that we actually have an established technology for doing that.
I remember that Erickson applied some insights from his health problems to hypnotic therapy, but I don’t know more details.
I couldn’t give you any names, though I do see that kind of thing mentioned from time to time in the hypnosis communities. I don’t know anything about recovering from brain damage myself, but I did find this on Google Scholar, and there might be some more interesting stuff out there.
Have you looked into what the meditation people have to say?
Yup, I did quite a lot of meditation-related techniques and it helped enormously.
And thanks for the article.
This is one of my favorite things about a certain brand of writing: it’s meta, or somehow self-aware or self-similar, without rubbing the fact in your face. Italo Calvino’s Six Memos for the Next Millennium are also like this (and are the only lit-crit-like thing I’ve ever enjoyed reading).
Heh, I liked using the placebo effect for morning coffee at my office (they had decaf and regular). I didn’t really want to acquire a caffeine dependency, so I’d pour a bit of caffeinated coffee into my cup (not a consistent amount), and then fill it the rest of the way with decaf. Then, as I walked back to my desk, I’d think to myself, “Wow, there’s a mysterious amount of caffeine in this cup! I might wind up with a lot of energy!”
Worked pretty well.
You really should call this “the Prisoner’s Dilemma with shared source code,” because the only strategies for the True Prisoner’s Dilemma are CooperateBot and DefectBot, of which it is obviously better to be DefectBot.
Overall, I found most of this post… aggravating? When you separated out local and global beliefs, I was pleased, and I wished you had done the same with ‘optimality.’ When you’re playing the game of “submit source code to a many-way PD tournament, where you know that there’s a DefectBot and many TrollBots,” then the optimal move is “submit code which cooperates with DefectBot,” even though you end up losing points to DefectBot. That doesn’t seem contentious.
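For concreteness, here is a toy version of that tournament. This is a minimal sketch: the payoff numbers (T, R, P, S) = (5, 3, 1, 0) and the count of ten TrollBots are illustrative assumptions, not anything specified in the post or this thread.

```python
# Toy many-way PD "tournament" with one DefectBot and many TrollBots.
# Assumed payoffs: both cooperate 3, both defect 1, sucker 0, temptation 5.

PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def score(my_move, their_move):
    """Points I earn from one pairing."""
    return PAYOFF[(my_move, their_move)]

def tournament(coop_with_defectbot, n_trollbots=10):
    """Total score for a submission described only by whether it cooperates
    with DefectBot; it defects against the TrollBots either way."""
    total = 0
    # Pairing against DefectBot, which always defects.
    my_move = 'C' if coop_with_defectbot else 'D'
    total += score(my_move, 'D')
    # Pairings against TrollBots: they cooperate with me iff I cooperate
    # with DefectBot; I defect against them regardless.
    troll_move = 'C' if coop_with_defectbot else 'D'
    total += n_trollbots * score('D', troll_move)
    return total

print(tournament(coop_with_defectbot=True))   # 0 + 10 * 5 = 50
print(tournament(coop_with_defectbot=False))  # 1 + 10 * 1 = 11
```

With ten TrollBots on the field, the submission that cooperates with DefectBot scores 50 to the all-defector's 11; the points lost to DefectBot are more than repaid elsewhere.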
But it seems to me that anything you can do with dark arts, I can do with light arts. The underlying insight there is “understand which game you are playing”- the person who balks at cooperating with DefectBot has not scoped the game correctly. Similarly, the person who says “my goal is to become a rockstar” or “my goal is to become a writer” is playing the game where they talk about becoming a rockstar or a writer, not the game where they actually become a rockstar or a writer.
One reason to say “I will not focus on being a rockstar; I will focus on making good music” is the acknowledgment “I will do my part, and the universe may or may not do its part. I can only control me, not the universe.” I don’t see the necessity of any dark side epistemology here.
But maybe it helps you to view it from a dark perspective; if so, keep on viewing it that way.
Thanks for the feedback!
I disagree. The Prisoner’s Dilemma does not specify that you are blind as to the nature of your opponent. “Visible source code” is a device to allow bots an analog of the many character analysis tools available when humans play against humans.
If you think you’re playing against Omega, or if you use TDT and you think you’re playing against someone else who uses TDT, then you should cooperate. I don’t think an inability to reason about your opponent makes the game more “True”.
This is one of the underlying insights, but another is “your monkey brain may be programmed to act optimally under strange parameters”. Someone else linked a post by MBlume which makes a similar point (in, perhaps, a less aggravating manner).
It may be that you gain access to certain resources only when you believe things you epistemically shouldn’t. In such cases, cultivating false beliefs (preferably compartmentalized) can be very useful.
I apologize for the aggravation. My aim was to be provocative and perhaps uncomfortable, but not aggravating.
The transparent version of the Prisoner’s Dilemma, and the more complicated ‘shared source code’ version that shows up on LW, are generally considered variants of the basic PD.
In contrast to games where you can say things like “I cooperate if they cooperate, and I defect if they defect,” in the basic game you either say “I cooperate” or “I defect.” Now, you might know some things about them, and they might know some things about you, but there’s no causal connection between your action and their action, like there is if they’re informed of your action, they’re informed of your source code, or they have the ability to perceive the future.
“Aggravating” may have been too strong a word; “disappointed” might have been better, that I saw content I mostly agreed with presented in a way I mostly disagreed with, with the extra implication that the presentation was possibly more important than the content.
To me, a “vanilla” Prisoner’s Dilemma involves actual human prisoners who may reason about their partners. I don’t mean to imply that I think “standard” PD involves credible pre-commitments nor perfect knowledge of the opponent. While I agree that in standard PD there’s no causal connection between actions, there can be logical connections between actions that make for interesting strategies (eg if you expect them to use TDT).
On this point, I’m inclined to think that we agree and are debating terminology.
That’s even worse! :-)
I readily admit that my presentation is tailored to my personality, and I understand how others may find it grating.
That said, a secondary goal of this post was to instill doubt in concepts that look sacred (terminal goals, epistemic rationality) and encourage people to consider that even these may be sacrificed for instrumental gains.
It seems you already grasp the tradeoffs between epistemic and instrumental rationality, that you can consistently reach mental states that are elusive to naive epistemically rational agents, and that you’ve come to these conclusions by different means than I did. By my analysis, there are many others who need a push before they are willing to even consider “terminal goals” and “false beliefs” as strategic tools. This post caters more to them.
I’d be very interested to hear more about how you’ve achieved similar results with different techniques!
Although strictly speaking, if you’re playing against Omega or another TDT user, you’re not really playing a true PD any more. (Why? Because mutual cooperation reveals preferences that are inconsistent with the preferences implied by a true PD payoff table.)
Reading the ensuing disagreement, this seems like a good occasion to ask whether this is a policy suggestion, and if so what it is. I don’t think So8res disagrees about any theorems (e.g. about dominant strategies) over formalisms of game theory/PD, so it seems like the scope of the disagreement is (at most) pretty much how one should use the phrase ‘Prisoner’s Dilemma’, and there are more direct ways of arguing that point, e.g. pointing to ways in which using the term (‘Prisoner’s Dilemma’), originally coined for the formal PD, for things like playing against various bots causes thought to systematically go astray, causes confusion, etc.
Pretty much. Cashing out my disagreement as a policy recommendation: don’t call a situation a true PD if that situation’s feasible set doesn’t include (C, D) & (D, C). Otherwise one might deduce that cooperation is the rational outcome for the one-shot, vanilla PD. It isn’t, even if believing it is puts one in good company.
As I understand it, Hofstadter’s advocacy of cooperation was limited to games with some sense of source-code sharing. Basically, both agents were able to assume their co-players had an identical method of deciding on the optimal move, and that that method was optimal. That assumption allows a rather bizarre little proof that cooperation is the result said method arrives at.
And think about it: how could a mathematician actually advocate cooperation in a pure, zero-knowledge, vanilla PD? That just doesn’t make any sense as a model of an intelligent human being’s opinions.
Agreed. But here is what I think Hofstadter was saying: The assumption that is used can be weaker than the assumption that the two players have an identical method. Rather, it just needs to be that they are both “smart”. And this is almost as strong a result as the true zero knowledge scenario, because most agents will do their best to be smart.
Why is he saying that “smart” agents will cooperate? Because they know that the other agent is the same as them in that respect. (In being smart, and also in knowing what being smart means.)
Now, there are some obvious holes in this, but it does hold a certain grain of truth, and is a fairly powerful result in any case. (TDT is, in a sense, a generalization of exactly this idea.)
Have you seen this explored in mathematical language? Cause it’s all so weird that there’s no way I can agree with Hofstadter to that extent. As yet, I don’t really know what “smart” means.
Yeah, I agree, it is weird. And I think that Hofstadter is wrong: With such a vague definition of being “smart”, his conjecture fails to hold. (This is what you were saying: It’s rather vague and undefined.)
That said, TDT is an attempt to put a similar idea on firmer ground. In that sense, the TDT paper is the exploration in mathematical language of this idea that you are asking for. It isn’t Hofstadterian superrationality, but it is inspired by it, and TDT puts these amorphous concepts that Hofstadter never bothered solidifying into a concrete form.
What ygert said. So-called superrationality has a grain of truth but there are obvious holes in it (at least as originally described by Hofstadter).
Sadly, even intelligent human beings have been known to believe incorrect things for bad reasons.
More to the point, I’m not accusing Hofstadter of advocating cooperation in a zero knowledge PD. I’m accusing him of advocating cooperation in a one-shot PD where both players are known to be rational. In this scenario, too, both players defect.
Hofstadter can deny this only by playing games(!) with the word “rational”. He first defines it to mean that a rational player gets the same answer as another rational player, so he can eliminate (C, D) & (D, C), and then and only then does he decide that it also means players don’t choose a dominated strategy, which eliminates (D, D). But this is silly; the avoids-dominated-strategies definition renders the gets-the-same-answer-as-another-rational-player definition superfluous (in this specific case). Suppose it had never occurred to us to use the former definition of “rational”, and we simply applied the latter definition. We’d immediately notice that neither player cooperates, because cooperation is strictly dominated according to the true PD payoff matrix, and we’d immediately eliminate all outcomes but (D, D). Hofstadter dodges this conclusion by using a gimmick to avoid consistently applying the requirement that rational players don’t leave free utility on the table.
I disagree. Mutual cooperation need not require preferences inconsistent with the payoff table: it could well be that I would prefer to defect, but I have reason to believe that my opponent’s move will be the same as my own (not via any causal mechanism but via a logical relation, eg if I know they use TDT).
But the true PD payoff table directly implies, by elimination of dominated strategies, that both players prefer D to C. If a player instead plays C, the alleged payoff table can’t have represented the actual utility the players assigned to different outcomes.
Playing against a TDT user or Omega or a copy of oneself doesn’t so much refute this conclusion as circumvent it. When both players know their opponent’s move must match their own, half of the payoff table’s entries are effectively deleted because the (C, D) & (D, C) outcomes become unreachable, which means the table’s no longer a true PD payoff table (which has four entries). And notice that with only the (C, C) & (D, D) entries left in the table, the (C, C) entry strictly dominates (D, D) for both players; both players now really & truly prefer to cooperate, and there is no lingering preference to defect.
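As a minimal sketch of that last point, here are the row player's achievable payoffs under the two feasible sets. The payoff numbers (T, R, P, S) = (5, 3, 1, 0) are the usual illustrative ones and are my assumption, not something fixed by the thread.

```python
# Row player's payoffs in a PD, assumed as: mutual cooperation 3,
# mutual defection 1, sucker 0, temptation 5.
ROW_PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def achievable_row_payoffs(feasible_outcomes):
    """Map each row move to the payoffs it can reach, given which
    (row, col) outcomes the situation leaves on the table."""
    reachable = {}
    for (row, col) in feasible_outcomes:
        reachable.setdefault(row, []).append(ROW_PAYOFF[(row, col)])
    return reachable

full_pd  = [('C', 'C'), ('C', 'D'), ('D', 'C'), ('D', 'D')]
mirrored = [('C', 'C'), ('D', 'D')]  # opponent's move must match mine

print(achievable_row_payoffs(full_pd))
# {'C': [3, 0], 'D': [5, 1]} -> D does better against either response (dominance)
print(achievable_row_payoffs(mirrored))
# {'C': [3], 'D': [1]}       -> only matched outcomes remain; C is simply better
```

Nothing about decision theory is computed here; the sketch only makes visible how deleting the (C, D) and (D, C) entries flips which move comes out ahead.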
TDT players achieving mutual cooperation are playing the same game as causal players in Nash equilibrium.
I’m not sure how you think that the game is different when the players are using non-causal decision theories.
I don’t think it’s using a non-causal decision theory that makes the difference. I could play myself in front of a mirror — so that my so-called opponent’s move is directly caused by mine, and I can think things through in purely causal terms — and my point about the payoff table having only two entries would still apply.
What makes the difference is whether non-game-theoretic considerations circumscribe the feasible set of possible outcomes before the players try to optimize within the feasible set. If I know nothing about my opponent, my feasible set has four outcomes. If my opponent is my mirror image (or a fellow TDT user, or Omega), I know my feasible set has two outcomes, because (C, D) & (D, C) are blocked a priori by the setup of the situation. If two human game theorists face off, they also end up ruling out (C, D) & (D, C), but only in the process of whittling the original feasible set of four possibilities down to the Nash equilibrium.
OK, I upvoted it before reading, and now that I have read it, I wish there were a karma transfer feature, so I could upvote it a dozen times more :) Besides the excellent content, it is exemplarily written (engaging multi-level state-explain-summarize style, with quality examples throughout).
By the way, speaking of karma transfer, here is one specification of such a feature: anyone with, say, 1000+ karma should be able to specify the number of upvotes to give, up to 10% of their total karma (diluted 10x for Main posts, since currently each Main upvote gives 10 karma points to OP). The minimum and transfer thresholds are there to prevent misuse of the feature with sock puppets.
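A rough sketch of one reading of that rule, in case the arithmetic is unclear. The 1000-karma floor, the 10% cap, and the 10x Main multiplier come from the comment above; the function name and the exact way the cap is applied are my own guesses.

```python
MIN_KARMA = 1000      # floor: must have 1000+ karma to use karma transfer
MAX_FRACTION = 0.10   # cap: spend at most 10% of your total karma on one post
MAIN_MULTIPLIER = 10  # each Main upvote currently gives the OP 10 karma points

def allowed_extra_upvotes(voter_karma, is_main_post):
    """How many extra upvotes a voter may pile onto one post,
    under this (guessed) reading of the proposed rule."""
    if voter_karma < MIN_KARMA:
        return 0
    karma_budget = int(MAX_FRACTION * voter_karma)
    per_upvote = MAIN_MULTIPLIER if is_main_post else 1
    return karma_budget // per_upvote

print(allowed_extra_upvotes(10_000, is_main_post=True))   # 1000 budget / 10 each = 100
print(allowed_extra_upvotes(10_000, is_main_post=False))  # 1000
print(allowed_extra_upvotes(500, is_main_post=False))     # 0: below the 1000 floor
```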
Now, back to the subject at hand. What you call compartmentalization and what jimmy calls attention shifting I imagine in terms of the abstract data type “stack”: in your current context you create an instance of yourself with a desired set of goals, then push your meta-self onto the stack and run the new instance. It is, of course, essential that the instance you create actually pops the stack and yields control at the right time (and does not go on creating and running more instances, until you get a stack overflow and require medical attention to snap back to reality). Maybe it’s an alarm that goes off internally or externally after a certain time (a standard feature in hypnosis), and/or after a certain goal is achieved (e.g. 10 pages of a novel).
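Making the stack metaphor concrete, purely as a toy illustration: the class names, the two-hour alarm, and the example goals below are all invented, and none of this is a claim about how minds actually work.

```python
# Toy "stack of selves": push the meta-self, run a purpose-built instance,
# and make sure it pops back at the right time.

class SelfInstance:
    def __init__(self, name, goals, alarm_seconds=None):
        self.name = name
        self.goals = goals                  # what this instance exists to do
        self.alarm_seconds = alarm_seconds  # the internal/external "pop" alarm

stack = []  # shelved selves waiting to regain control

def push(current, new_instance):
    """Shelve the current self and hand control to the new instance."""
    stack.append(current)
    return new_instance

def pop():
    """Yield control back to whoever created this instance."""
    if not stack:
        raise RuntimeError("stack underflow: no meta-self left to return to")
    return stack.pop()

meta = SelfInstance("meta-self", {"overall goals"})
novelist = SelfInstance("unstoppable novelist", {"write 10 pages"},
                        alarm_seconds=2 * 60 * 60)

active = push(meta, novelist)
# ... the alarm fires, or the 10 pages exist ...
active = pop()
assert active is meta  # control returns; no runaway instances, no stack overflow
```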
Also, as others pointed out, your ideas might be a better sell if they are not packaged as Dark Arts, which they are not, but rather as being meta-rational. For example, “local irrationality may be globally rational”, or, somewhat more mathematically, “the first approximation to rationality need not appear rational”, or even more nerdy “the rationality function is nonlinear, its power series has a finite convergence radius”, or something else, depending on the audience. Then again, calling it Dark Arts has a lot of shock value in this forum, which attracts interest.
To clarify, what he calls compartmentalization I call compartmentalization. I’d just recommend doing something else, which, if forced to name, I’d call something like “having genuine instrumental goals instead of telling yourself you have instrumental goals”.
When I say “attention shifting” I’m talking about the “mental bit banging” level thing. When you give yourself the (perhaps compartmentalized) belief that “this water will make me feel better because it has homeopathic morphine in it”, that leads to “so when I drink this water, I will feel better”, which leads to anticipating feeling better, which leads to pointing your attention to good feelings to the exclusion of bad things.
However, you can get to the same place by deciding “I’m going to drink this water and feel good”—or when you get more practiced at it, just doing the “feeling good” by directing attention to goodness without all the justifications.
(That bit of) my point is that knowing where the attention is screens off how it got there in terms of its effects, so you might as well get there through a way that does not have bad side effects.
LW already feels uncomfortably polarized with a clique of ridiculously high karma users at the top. I don’t think giving additional power to the high-karma users is a good idea for the long term.
Huh. Never noticed that. A clique? What an interesting perspective. How much karma do you mean? Or is it some subset of high karma users? For example, I happen to have just over 10k karma, does it make me a clique member? What about TheOtherDave, or Nancy? How do you tell if someone is in this clique? How does someone in the clique tell if she is?
Presumably you joined a while ago, when there weren’t so many intimidating high-karma users around
Yes on all counts. You’re clearly the cool kids here.
You see them talk like they know each other. You see them using specialized terms without giving any context because everybody knows that stuff already. You see their enormous, impossible karma totals and wonder if they’ve hacked the system somehow.
Dunno. It probably looks completely different from the other side. I’m just saying that’s what it feels like (and this is bad for attracting new members), not that’s what it’s really like.
(nods) True enough.
That said, you’re absolutely right that it looks completely different from “the other side.”
What does it look like?
Well, for example, I’m as aware of the differences between me and shminux as I ever was. From my perspective (and from theirs, I suspect), we aren’t nearly as homogenous as we apparently are from lmm’s perspective.
For the record, I have also noticed this subset of LW users—I tend to think of them as “Big Names”—and:
You could ask the same of any clique;
It seems like these high-profile members are actually more diverse in their opinions than mere “regulars”.
Of course, this is just my vague impression. And I doubt it’s unique to LessWrong, or particularly worrying; it’s just, y’know, some people are more active members of the community or however you want to phrase it.
(I’ve noticed similar “core” groups on other websites, it’s probably either universal or a hallucination I project onto everything.)
Relevant link: The Tyranny of Structurelessness (which is mostly talking about real-life political groups, but still, much of it is relevant):
I always assumed I’d get a black cloak and a silver mask in the mail once I break 10K and take the Mark of Bayes. Isn’t that what happens?
We aren’t supposed to talk about it.
What if going beyond 1 vote cost karma—so you’d actually need to spend, not just apply, that karma?
I can’t see people using it to the point where their karma-flow went negative, so I don’t think it really helps. It’s a less bad idea, but not I think a good one.
Great post!
This reminds me of the fact that, like many others, on many tasks I tend to work best right before the deadline, while doing essentially nothing in the time before that deadline. Of course, for some tasks this is actually the rational way of approaching things (if they can be accomplished sufficiently well in that time just before the deadline). But then there are also tasks for which it would be very useful if I could temporarily turn on a belief saying that this is urgent.
Interestingly, there are also tasks that I can only accomplish well if there isn’t any major time pressure, and a looming deadline prevents me from getting anything done. For those cases, it would be useful if I could just turn off the belief saying that I don’t have much time left.
This is partially the reason why I think that there are people that are incredibly productive and efficient in a structured work environment (like having a boss), and then become less motivated and productive after quitting their jobs and working for their own start-up.
Expectations from people who depend on you and demand things from you are very motivating, much like a deadline. Start-ups are hard because every day you have to get up and do something for yourself, and suddenly you aren’t working as many hours as you were when you were grinding it out in the corporate world.
It’s not the belief per se, just the emotion. It would be convenient if the emotion could be changed without changing the belief (to something false). Then again, self-motivation does involve sometimes changing beliefs by changing reality (e.g. Beeminder) -- maybe it isn’t much of a stretch to change the beliefs in some structured way.
This was awesome, but the last bit terrified me. I don’t know what that means exactly, but I think it means I shouldn’t do it. I’m definitely going to try the placebo thing. Unfortunately, the others don’t seem… operationalized? enough to really try implementing—what is it that you do to create a compartment in which you are unstoppable? How do you convince that part of yourself/yourself in that mode of that?
Yeah, that’s a good question. I haven’t given any advice as to how to set up a mental compartment, and I doubt it’s the sort of advice you’re going to find around these parts :-)
Setting up a mental compartment is easier than it looks.
First, pick the idea that you want to “believe” in the compartment.
Second, look for justifications for the idea and evidence for the idea. This should be easy, because your brain is very good at justifying things. It doesn’t matter if the evidence is weak, just pour it in there: don’t treat it as weak probabilistic evidence, treat it as “tiny facts”.
It’s very important that, during this process, you ignore all counter-evidence. Pick and choose what you listen to. If you’ve been a rationalist for a while, this may sound difficult, but it’s actually easy. Your brain is very good at reading counter-evidence and disregarding it offhand if it doesn’t agree with what you “know”. Fuel that confirmation bias.
Proceed to regulate information intake into the compartment. If you’re trying to build up “Nothing is Beyond My Grasp”, then every time that you succeed at something, feed that pride and success into the compartment. Every time you fail, though, simply remind yourself that you knew it was a compartment, and this isn’t too surprising, and don’t let the compartment update.
Before long, you’ll have this discontinuous belief that’s completely out of sync with reality.
That sounds plausible.
It is not clear how to do this.
Hmm. This is the machinery of compartmentalization at play—you can get evidence against belief X and update on it while still maintaining belief X in full force. I’m not sure I can articulate how it’s done. It’s just… the beliefs live in separate magisteria, you see.
This is hard to explain in part because preventing the compartment from updating is not an act of will, it’s an act of omission. You build this wall between your belief systems, and then updates no longer propagate across the wall unless you force them to. To prevent compartment updates, you simply don’t force the compartment to update.
This may sound difficult, but I find that in practice it’s so easy it’s scary. This may differ from person to person, and I wouldn’t be surprised if many people find my methods difficult.
I wouldn’t be surprised if not everybody could do this at all. Could be typical mind fallacy to assume such. Human minds differ quite a lot. Just think about:
visual vs. metaphorical ‘imagination’
weight on abstract/symbolic vs. on concrete/fuzzy reasoning
eidetic memory or not
synaesthesia or not
Upvoted before reading past the summary for sheer bravery.
Please don’t upvote before reading past the summary
It’s ok, votes are reversible.
I guess it’s a lot more forgivable if you’re precommitted to reading past the summary, but it still feels icky to me. For one thing, if you upvote based on an initial impression and then read the full post, it’ll probably bias your later evaluation of the post in the direction of your initial vote, because cognitive dissonance.
Upvoted before reading past the summary, but not really for bravery—more for sheer fun. Advocating “wrong” viewpoints, and coming up with counterintuitive solutions that nevertheless work, and in fact work better than conventional wisdom, is one of the best feelings I know.
This article is awesome! I’ve been doing this kind of stuff for years with regards to motivation, attitudes, and even religious belief. I’ve used the terminology of “virtualisation” to talk about my thought-processes/thought-rituals in carefully defined compartments that give me access to emotions, attitudes, skills, etc. I would otherwise find difficult. I even have a mental framework I call “metaphor ascendence” to convert false beliefs into virtualised compartments so that they can be carefully dismantled without loss of existing utility. It’s been nearly impossible to explain to other people how I do and think about this, though often you can show them how to do it without explaining. And for me the major in-road was totally a realisation that there exist tasks which are only possible if you believe they are—guess I’ll have to check out The Phantom Tollbooth (I’ve never read it).
This might be a bit of a personal question (feel free to pm or ignore), but have you by any chance done this with religious beliefs? I felt like I got a hint of that between the lines and it would be amazing to find someone else who does this. I’ve come across so many people in my life who threw away a lot of utility when they left religion, never realising how much of it they could keep or convert without sacrificing their integrity. One friend even teasingly calls me the “atheist Jesus” because of how much utility I pumped back into his life just by leveraging his personal religious past. Religion has been under strong selective pressure for a long time, and has accumulated a crapload of algorithmic optimisations that can easily get tossed by their apostates just because they’re described in terms of false beliefs. My line is always, “I would never exterminate a nuisance species without first sequencing its DNA.” You just have to remember that asking the organism about its own DNA is a silly strategy.
Anyways, I could go on for a long time about this, but this article has given me the language to set up a new series I’ve been trying to rework for Less Wrong, along the lines of this, so I better get cracking. But the buzz of finding someone like-minded is an awesome bonus. Thank you so much for posting.
p.s. I have to agree with various other commenters that I wouldn’t use the “dark arts” description myself—mind optimisation is at the heart of legit rationality. But I see how it definitely makes for useful marketing language, so I won’t give you too much of a hard time for it.
To address your other question: I was raised religious, and I learned about compartmentalization by self-observation (my religion was compartmentalized for a couple years before I noticed what I was doing). That said, since becoming an atheist I have never held a compartmentalized religious belief for motivational purposes or otherwise.
Ah well, I had to ask. I know religion is usually the “other team” for us, so I hope I didn’t push any buttons by asking—definitely not my intention.
To address your postscript: “Dark Arts” was not supposed to mean “bad” or “irrational”, it was supposed to mean “counter-intuitive, surface-level irrational, perhaps costly, but worth the price”.
Strategically manipulating terminal goals and intentionally cultivating false beliefs (with cognitive dissonance as the price) seem to fall pretty squarely in this category. I’m honestly not sure what else people were expecting. Perhaps you could give me an idea of things that squarely qualify as “dark arts” under your definition?
(At a guess, I suppose heavily leveraging taboo tradeoffs and consequentialism may seem “darker” to the layman.)
How about extending the metaphor and calling these techniques “Rituals” (they require a sacrifice, and even though it’s not as “permanent” as in HPMOR, it’s usually dangerous), reserving “Dark” for the arguably-immoral stuff?
In my understanding “Dark Arts” mean, basically, using deceit. In the social context that implies manipulating others for your own purposes. I don’t think “Dark Arts” is a useful term in the context of self-motivation.
Huh. Personally, I feel that the need for comments such as this one strongly indicates “Dark” subject matter. It’s interesting to get a different perspective. Thanks!
Techniques that have negative side effects for other people. Economics models that recommend that people should defect more because it’s in their self interest are “dark”.
Excellent post. I’m glad to find some more stuff on LessWrong that’s directly applicable to real life and the things I’m doing right now.
Overall, an excellent post. It brought up some very clever ideas that I had never thought of or previously encountered.
I do, however, think that your colloquial use of the phrase “terminal value” is likely to confuse and/or irritate a lot of the serious analytic-philosophy crowd here; it might be wise to use some other word or other phrase for your meaning, which seems to be closer to “How an idealized[1] utility-maximizing agent would represent its (literal) Terminal Values internally”. Perhaps a “Goal-in-itself”? A “motivationally core goal”?
Not perfectly idealized, as your post points out
Thanks!
Early on, before “Mind the terminology, here”, the ambiguity is intentional. That your terminal goals are not sacred is true both for “utility function end-values” and “actions that are ends unto themselves”.
Later on in the post, I am explicit about my usage and consistent with the LessWrong Wiki entry on Terminal Values (which is also dichotomous). Still, I understand that some people are likely to get annoyed: this is difficult to avoid when there is a phrase that means too many things.
For now, I think that the usage is explicit enough in the post, but I do appreciate your input.
This post makes a very similar point.
A terminology sometimes used for what you call “System 1′s beliefs” and “System 2′s beliefs” is “aliefs” and “beliefs” respectively.
Another small example. I have a clock near the end of my bed. It runs 15 minutes fast. Not by accident: it’s been reset many times and then set back to 15 minutes fast. I know it’s fast; we even call it the “rocket clock”. None of this knowledge diminishes its effectiveness at getting me out of bed sooner, and at making me feel more guilty for staying up late. Works very well.
Glad to discover I can now rationalise it as entirely rational behaviour and simply the dark side (where “dark side” only serves to increase perceived awesomeness anyway).
EDIT: I think the effects were significantly worse than this and caused a ton of burnout and emotional trauma. Turns out thinking the world will end with 100% probability if you don’t save it, plus having heroic responsibility, can be a little bit tough sometimes...
I worry most people will ignore the warnings around willful inconsistency, so let me self-report that I did this and it was a bad idea. Central problem: It’s hard to rationally update off new evidence when your system 1 is utterly convinced of something. And I think this screwed with my epistemics around Shard Theory while making communication with people about x-risk much harder, since I’d often typical mind and skip straight to the paperclipper—the extreme scenario I was (and still am to some extent) trying to avoid as my main case.
When my rationality level is higher and my takes have solidified some more I might try this again, but right now it’s counterproductive. System 2 rationality is hard when you have to constantly correct for false System 1 beliefs!
Just read a paper suggesting that “depleting” willpower is a mechanism that gradually prioritizes “want to do” goals over “should/have to do” goals. I’m guessing that “terminal goal hacking” could be seen as a way to shift goals from the “should/have to do” category to the “want to do” category, thus providing a massive boost in one’s ability to actually do them.
Ah! So that’s what I’ve been doing wrong. When I tried to go to the gym regularly with the goal of getting stronger/bigger/having more energy, the actual process of exercising was merely instrumental to me so I couldn’t motivate myself to do it consistently. Two of my friends who are more successful at exercising than me have confirmed that for them exercising is both instrumental and a goal in and of itself.
But while I’m down with the idea of hacking terminal goals, I have no idea how to do that. Whereas compartmentalizing is easy (just ignore evidence against the position you want to believe), goal hacking sounds very difficult. Any suggestions/resources for learning how to do this?
Yes, I have a couple suggestions.
The first suggestion is to try to start thinking of yourself as a fitness buff and actually do things to make yourself start thinking this way. So, for example, when you are waiting for the morning train, you could glance at fitness magazines instead of what you usually look at. Or you could occasionally buy an inexpensive fitness-related tchotchke, for example a wrist gripper to keep at your desk at work. Make an effort to look at other peoples’ bodies and note who appears more fit.
So on a cold dark morning when you are deciding whether or not to go the gym, the reasoning should be “I am a fitness buff and fitness buffs go to the gym every day” as opposed to “I want to be stronger/fitter so I should go to the gym.”
The second suggestion is not exactly a goal substitution but it’s a pretty similar concept—what I would call a tie-in. Tape a blank piece of paper to the wall of your office or bedroom. After each workout, draw another tick mark on the paper. For nerdy people, there seems to be a lot of satisfaction in accumulating points. So you are basically tying the goal of getting strong/fit to the goal of accumulating points.
If you think of “fitness buffs” as being a special clique, then noticing other potential members of the clique and feeling like you belong to something bigger than yourself can reinforce your “fitness buff” identity. However, this technique can be very counterproductive. Women in the US and elsewhere are often socialized to compete with each other on the basis of appearance already, so noticing fitter women is already something that we do with not necessarily very positive results.
Actually, I try to keep appearance out my mind when I exercise because I’ve had issues with body dysmorphic disorder. Instead, I identify as someone who’s into, say, urban cycling. Now I notice when other people are carrying a bike helmet with them when they’re in a shop, for instance. I feel like a part of a group, and this feeling of identity encourages me to keep biking.
By the way, the idea that you can tell how fit or healthy someone is just by looking at them isn’t correct. Some thin, healthy-looking people don’t exercise, and some people who are overweight are actually quite healthy according to other measures of fitness, so I’d shy away from using appearance as a proxy for fitness.
That’s not exactly what I was saying, although your point here is correct too I suspect. My point is that fitness buffs have a tendency to look at other peoples’ bodies just as car buffs have a tendency to look at other peoples’ cars and gun buffs have a tendency to look at other peoples’ guns.
Well, that’s a different issue. But I do think there’s more of a problem with people putting too little mental energy into fitness than with people putting in too much.
You can make a decent guess about a person’s level of fitness based on their appearance. To be sure it will not be perfect. Besides which, the point is not to be a perfect judge of fitness—the point is to find ways to tweak one’s goals in order to have sufficient motivation to go to the gym regularly.
Thanks! The scariness of self-modification didn’t really sink in until I read your comment. I think I can generalize that advice to the other things I’d like to do as well.
What works for me is that I like going to the gym just to go to the gym. More specifically, I love the feeling of being able to lift lots of weight and I love the feeling of being sore the next day. This isn’t necessarily changing an instrumental goal into a terminal goal, but it does sort of reduce the distance between the original terminal goal and the instrumental goal. The original terminal goal (getting fit) has been moved closer in “goalspace” to the instrumental goal (lifting at the gym) by transforming into the terminal goal (get sore).
So in general it might be easier to think in terms of shifting the goalposts between terminal and instrumental goals. Picking a terminal goal that’s “closer” in goalspace to your original instrumental goal might make it more manageable.
The nice thing about hacking instrumental goals into terminal goals is that while they’re still instrumental you can easily change them.
In your case: You have the TG of becoming fit (BF), and you previously decided on the IG of going to the gym (GG). You’re asking about how to turn GG into a TG, which seems hard.
But notice that you picked GG as an instrument towards attaining BF before thinking about Terminal Goal Hacking (TGH), which suggests it’s not optimal for attaining BF via TGH. The better strategy would be to first ask yourself if another IG would work better for the purpose. For example, you might want to try lots of different sports, especially those that you instinctively find cool, or, if you’re lucky, that you’re good at, which means that you might actually adopt them as TGs more-or-less without trying.
(This is what happened to me, although in my case it was accidental. I tried bouldering and it stuck, even though no other sport I’ve tried in the previous 30 years did.)
Part of the trick is to find sports (or other I/TG candidates) that are convenient (close to work or home, not requiring more participants than you have easy access to) and fun to the point that when you get tired you force yourself to continue because you want to play some more, not because of how buff you want to get. In the sport case try everything, including variations, not just what’s popular or well known, you might be surprised.
(In my case, I don’t much like climbing tall walls—I get tired, bored and frustrated and want to give up when they’re too hard. One might expect that bouldering would be the same (it’s basically the same thing except with much shorter but harder walls), but the effect in my case was completely different: if a problem is too hard I get more motivated to figure out how to climb it. The point is not to try bouldering, but to try variations of sports. E.g., don’t just try tennis and give up; try doubles and singles, try squash, try ping-pong, try real tennis, try badminton, one of those might work.)
I had some good results here with the strategy of recovering from a stroke, but I don’t especially recommend it.
Did you ever do this? Or are you still running on some top-down overwritten intuitive models?
If you did back out, what was that like? Did you do anything in particular, or did this effect fade over time?
I don’t think that this is quite right actually.
If the psychological link between them is strong in the right way, the instrumental goal will feel as appealing as the terminal goal (because succeeding at the instrumental goal feels like making progress on the terminal goal). Techniques like this one work.
I think the phenomenon that is being described here is not actually about instrumental goals vs. terminal goals. It’s about time horizons and cost-benefit. We’re more motivated to do things when the reward is immediate than when the reward is far off in time. The number of instrumental steps between our current action and the reward is only relevant in that it correlates with the time horizon.
What great timing! I’ve just started investigating the occult and chaos magick (with a ‘k’) just to see if it works.
I’d love to see a top-level post about this when you’re done!
Another nail hit squarely on the head. Your concept of a strange playing field has helped crystallize an insight I’ve been grappling with for a while—a strategy can be locally rational even if it is in some important sense globally irrational. I’ve had several other insights which are specific instances of this and which I only just realized are part of a more general phenomenon. I believe it can be rational to temporarily suspend judgement in the pursuit of certain kinds of mystical experiences (and have done this with some small success), and I believe that it can be rational to think of yourself as a causally efficacious agent even when you know that humans are embedded in a stream of causality which makes the concept of free will nonsensical.
It seems impossible to choose whether to think of ourselves as having free will, unless we have already implicitly assumed that we have free will. More generally the entire pursuit of acting more rational is built on the implicit premise that we have the ability to choose how to act and what to believe.
That just means that you lack the ability to think in non-free-will mental frameworks. I don’t think a Calvinist who thinks that God is in control of everything has a problem with the word “choose” or with thinking of himself as making choices.
Not completely. It’s built on the idea that we have an influence on what we choose and what we believe.
If you take a belief like: “There’s no ego depletion.” it’s not straightforward to acquire that belief. All the studies that tell me that having that belief means not having ego depletion are not enough to help me acquire that belief while I’m suffering from ego depletion in the present.
For my own part, I didn’t find it too difficult to reconceptualize my understanding of what a “choice” was when dealing with the knowledge that I will predictably fail to choose certain options which nevertheless feel like choices I’m free to make.
The experience of choosing to do something is part of the experience of doing certain things, just like the experience of feeling like I could have chosen something different is. These feelings have no particular relationship to the rest of the world, any more than my feeling scared of something necessarily means that thing is frightening in any objective sense.
My point is that we can’t help but think of ourselves as having free will, whatever the ontological reality of free will actually is.
I really liked the introduction—really well done. (shminux seems to agree!)
Some constructive criticisms:
‘There are playing fields where you should cooperate with DefectBot, even though that looks completely insane from a naïve viewpoint. Optimality is a feature of the playing field, not a feature of the strategy.’ - I like your main point made with TrollBot, but this last sentence doesn’t seem like a good way of summing up the lesson. What the lesson seems to be in my eyes is: strategies’ being optimal or not is playing-field relative. So you could say that optimality is a relation holding between strategies and playing fields.
Later on you say ‘It helps to remember that “optimality” is as much a feature of the playing field as of the strategy.’ - but, my criticism above aside, this seems inconsistent with the last sentence of the previous quote (here you say optimality is equally a feature of two things, whereas before you said it was not a feature of the strategy)! Here you seem to be leaning more toward my proposed relational gloss.
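To make the relational reading concrete, here’s a minimal sketch (Python; the payoff numbers and the way I’ve encoded DefectBot and TrollBot are illustrative assumptions, not anything taken from the post). The same policy scores very differently depending on who populates the field:

```python
# Scoring a (policy, playing field) pair rather than a policy alone.
PAYOFF = {  # (my_move, their_move) -> my payoff, standard PD ordering
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def defect_bot(_my_policy):
    return "D"  # always defects

def troll_bot(my_policy):
    # Cooperates with me iff my policy cooperates with DefectBot.
    return "C" if my_policy("DefectBot") == "C" else "D"

def total_payoff(my_policy, field):
    """Score a policy against a whole playing field (a list of opponents)."""
    total = 0
    for name, bot in field:
        my_move = my_policy(name)
        their_move = bot(my_policy)
        total += PAYOFF[(my_move, their_move)]
    return total

# Policy A: "never cooperate with DefectBot".
policy_a = lambda name: "D"
# Policy B: cooperate with DefectBot, defect against TrollBot.
policy_b = lambda name: "C" if name == "DefectBot" else "D"

field = [("DefectBot", defect_bot)] + [("TrollBot", troll_bot)] * 10
print(total_payoff(policy_a, field))  # 11: everyone defects against you
print(total_payoff(policy_b, field))  # 50: you "waste" one game to win ten
```

The fact that the scoring function has to take the pair is exactly the relational gloss above.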
Another suggestion. The Omega argument comes right after you say you’re going to show that we occupy a strange playing field right now. This tends to make the reader prepare to object ‘But that’s not very realistic!’. Maybe you like that sort of tension and release thing, but my vote would be to first make it clear what you’re doing there—i.e., not right away arguing about the real world, but taking a certain step toward that.
One final suggestion. You write ‘Knowing this, I have a compartment in which my willpower doesn’t deplete’, and something relevantly similar just earlier. Now this is obviously not literally what you mean—rather, it’s something like, you have a compartment housing the belief that your willpower doesn’t deplete. Obviously, you get a certain literary effect by putting it the way you do. Now, I realize reasonable people may think I’m just being overly pedantic here, but I suspect that’s wrong, and that in this sort of discussion, we should habitually help ourselves to such easily-had extra precision. Since things get confusing so quickly in this area, and we’re liable to slip up all over the place, apparently minor infelicities could make a real difference by sapping resources which are about to be taxed to the full.
Thanks, I’ve edited the post to incorporate these suggestions.
I’m incapable of comprehending this mental state. If I didn’t see so much evidence all around me that there are lots and lots of people who seem to just want money as an end in itself, just for the sake of having it, I don’t think I could believe that anyone could possibly seriously think that way. Maybe it’s because I’ve never had much money, but I have tons of ideas for what I’d do if money was no object. Most of them are things I already want to do (and have probably wanted to do for a long time), but can’t do because I don’t have the money. Which… to me implies that people who want money as a terminal goal have either never had another goal, or have never had another goal that they haven’t been able to achieve for lack of money. The latter should be a good thing, at least for some people, but just brings me back to the question of what the hell they want money for. If you don’t have any other goals, or you only have goals that don’t require money to achieve, where is this “acquire money” goal even coming from?
See what I mean? I really just can’t wrap my head around it.
I think I was able to make outstanding progress last year in improving my rationality and starting to work on real problems, mostly because of megalomaniac beliefs that were somewhat compartmentalised but that I was able to feel at a gut level each time I had to start working.
Lately, as a result of that progress, I’ve started slowing down, because I came to terms with these megalomaniac beliefs and realised at a gut level that they weren’t accurate. A huge chunk of my drive faded, and my predictions about my goals updated on what I felt I could achieve with the drive I was actually feeling, even though I knew that destroying these beliefs was a sign I had really improved and was learning how hard it actually is to do world-changing stuff...
I’ll definitely give this a trial run, trying to chain down those beliefs and pull them out as fuel when I need to.
I think the distinction (and disjunction) between instrumental and terminal goals is an oversimplification, at least when applied to motivation (as you’ve demonstrated). My current understanding of goal-setting is that instrumental goals can also be terminal in the sense that one enjoys or is in the habit of doing them.
To take the rock star example: it’s not a lie to enjoy practicing or to enjoy making music, but it’s still true that getting good at music is also instrumental to the goal of becoming a rock star. I might describe making music as both instrumental and terminal, and becoming a rock star as terminal (or, instead of terminal, directly instrumental to happiness / satisfaction / utility).
I suppose you’re not necessarily saying that the dark arts are irrational, since the optimal choice depends on the playing field. So I guess I agree, only that I still think lying to oneself is unnecessary; you just need more qualifying statements, like “I do indeed enjoy this and want to do this, and it so happens that it helps me achieve some other goal”.
Perhaps another way of describing the pitfall of avoiding the dark arts is confusing rationality with Straw Vulcan Rationality and failing to allow instrumental goals to feel meaningful.
I suppose this doesn’t really work for something that is purely instrumental. Also, it doesn’t totally address your example of willful inconsistency, where it’d be very tempting to make excuses to work less hard. Or maybe not?
I’m still resistant to the idea that motivation is an illogical playing field, and think that you can be rational (that is, you can always recall the true belief when necessary) if you just use specific enough terminology. Like, differentiating oughts from statements of uncertainty, in your example of willful inconsistency—you ought to work really hard on AI alignment because of expected value. I guess here is where my resistance to motivation being illogical breaks down.
Great content, but this post is significantly too long. Honestly, each of your main points seems worthy of a post of its own!
I’d say that all points are too long by themselves, so if you split the post into several they will still be too long.
I propose that we reappropriate the white/black/grey hat terminology from the Linux community, and refer to black/white/grey cloak rationality. Someday perhaps we’ll have red cloak rationalists.
To summarize: belief in things that are not actually true may have a beneficial impact on your day-to-day life?
You don’t really require any level of rationality skill to arrive at that conclusion, but the writeup is quite interesting.
Just don’t fall in the trap of thinking I am going to swallow this placebo and feel better, because I know that even though placebo does not work… crap. Let’s start from the beginning....
The major points are:
1. You can’t judge the sanity of a strategy in a vacuum.
2. Your mind may allocate resources differently to “terminal” and “instrumental” goals; leverage this.
3. Certain false beliefs may improve productivity; consider cultivating them.
4. Use compartmentalization to mitigate the negative effects of (3).
5. You already are a flawed system; expect that the optimal strategies will sometimes look weird.
6. Certain quirks that look like flaws (such as compartmentalization) can be leveraged to your advantage.
The insane thing is that you can think “I am going to swallow this placebo and feel better, even though I know it’s a placebo” and it still works. Like I said, brains are weird.
Huh? What are you talking about? Placebo does work (somewhat, in some cases). Placebo even works when you know it’s a placebo. Even if you don’t use these techniques. There’s studies and everything.
The brain is absurdly easy to fool when parts of it are complicit. Tell yourself “I will do twenty push-ups and then stop”, put your everything in the twenty knowing you’ll stop after, and after you reach twenty just keep going for a few more until your arms burn. This will work reliably and repeatably. Your brain simply does not notice that you’re systematically lying to it.
He did call it a trap. (And I have a feeling that that’s something that has happened to me quite often, though I can’t think of any good particular example I’d be willing to share.)
Isn’t the feeling of pain when you exercise too much There For A Reason?
(OK, it might be there for a reason that applied in the environment of evolutionary adaptedness but no longer applies today. But as a general heuristic I’m wary of refusing to listen at what my body is telling me unless I’m reasonably sure what I’m doing.)
The model I’m familiar with is that muscular soreness comes from microscopic trauma to the muscle fibers being exercised, and that the same trauma ultimately leads to increased strength (as muscles adapt to prevent it). That’s clearly a simplified model, though: for example, later exercise can help ease soreness (though it hurts more to start with), while the opposite would be true if it was a pure function of trauma.
With my sketchy evopsych hat on, I might speculate that it helps prevent possible damage from overexertion, but that people in the EEA would rapidly habituate to their normal (albeit very high by our standards) levels of activity. Now that we’re relatively very sedentary, such a signal, at least at its ancestral levels of sensitivity, might end up being counterproductive from a health perspective.
From the “reliably”, “repeatably” and “systematically” in FeepingCreature’s comment, I guessed they weren’t talking about a previously sedentary person just starting to exercise. (I think I was going to clarify that I was mainly talking about the middle and long term, but forgot to.) I do expect it to be normal for my muscles to hurt when I’ve just started exercising after weeks of inactivity.
The “feeling of pain when you exercise” is just lactic acid.
Essentially, your body has multiple pathways to generate energy. The “normal” one uses oxygen, but when you spend energy too fast and your lungs and blood cannot supply and transport enough, the cells switch to another pathway. It’s less efficient but it doesn’t need oxygen. Unfortunately a byproduct of this metabolic pathway is lactic acid and it is precisely its accumulation that you feel as the muscle “burn” during exercise.
This appears to have been discredited.
(Note however that DOMS is only one component of the muscle soreness following exercise; but lactic acid doesn’t seem to have been implicated in the others either, at least as far as five minutes of Google can tell me. ATP and H+ ions may be involved in the acute version.)
I’m not talking about delayed onset, I’m talking about pain when you exercise (not after).
Recall the original statement that you were answering: “Isn’t the feeling of pain when you exercise too much There For A Reason?”
I’m finding that somewhat harder to research (it doesn’t seem to have received as much medical attention, possibly because it’s such a short-term effect), but the papers I have turned up are equivocal at best. It might be one of several metabolites involved; it doesn’t seem likely to be the sole culprit.
The ‘lactic acid causes burning’ idea is mostly just hearsay. There is no evidence to support it, and a lot of evidence that lactic acid actually helps buffer the pH and keep acidosis from occurring: http://www.ncbi.nlm.nih.gov/pubmed/15308499
This seems controversial; see e.g. http://ajpregu.physiology.org/content/289/3/R902
P.S. Thanks for the link. That was an entertaining paper to read with authors going on “No, stop saying this, dumbasses, it doesn’t work like that” rants every few paragraphs :-) I am now willing to agree that saying “lactic acid causes acidosis” is technically incorrect. In practice, however, it remains the case that glycolysis leads to both acidosis and increase in the lactic acid concentration so there doesn’t seem to be a need to change anything in how people train and exercise.
That article you linked is a personal letter to the editor of that journal, and as such is not an indicator of controversy. The authors of the original study defended their viewpoints: http://ajpregu.physiology.org/content/289/3/R904
Obviously there was some debate but the defenders of lactic-acid acidosis never actually published results refuting that study, and indeed other studies that came after it all supported the hypothesis that lactic acid buildup does not contribute to muscle fatigue. For instance, this study carried out by freshmen, of all people: http://www.ncbi.nlm.nih.gov/pubmed/19948679
Interesting. The question, though, isn’t whether the lactic acid causes muscle fatigue but whether it causes the characteristic “burning” sensation in the muscles.
Seems like I need to read up on this...
The term ‘fatigue’ seems to be inconsistently used to sometimes refer to DOMS and sometimes to the, as you call it, ‘burning’ sensation in the muscles. However, in both studies I linked, it’s being used to refer to the fatigue during and immediately after exercise, not DOMS.
Well, I am sure there are other factors as well but I haven’t seen it disputed that lactic acid is a major contributor to the “burning” feeling during the exercise. Wikipedia goes into biochemical details.
I agree that lactic acid is much less relevant to post-exercise pain and soreness.
Well there’s potential value in coming up with a model of the underlying principles at work. It’s like the difference between observing that stuff falls when you drop it and coming up with a theory of gravity.
How do you know compartmentalisation is more mentally efficient than holding dissonant mental representations of willpower?
First, appreciation: I love that calculated modification of self. These, and similar techniques, can be very useful if put to use in the right way. I recognize myself here and there. You did well to abstract it all out this clearly.
Second, a note: You’ve described your techniques from the perspective of how they deviate from epistemic rationality—“Changing your Terminal Goals”, “Intentional Compartmentalization”, “Willful Inconsistency”. I would’ve been more inclined to describe them from the perspective of their central effect, e.g. something in the style of: “Subgoal ascension”, “Channeling”, “Embodying”. Perhaps not as marketable to the LessWrong crowd. Multiple perspectives could be used as well.
Third, a question: How did you create that gut feeling of urgency?
As a comment about changing an instrumental value to a terminal value, I’m just going to copy and paste from a recent thread, as it seems equally relevant here.
The recent thread about Willpower as a Resource identified the fundamental issue.
There are tasks we ought to do, and tasks we want to do, and the latter don’t suffer from willpower limitations. Find tasks that serve your purpose and that you want to do. Then do your best to remind yourself that you want the end, and therefore you want the means. Attitude is everything.
When I was in college, about the only way I could avoid completing assignments just before (and occasionally after) the deadline was to declare that assignments were due within hours of being assigned.
The second and third sections were great. But is the idea of ‘terminal goal hacking’ actually controversial? Without the fancy lingo, it says that it’s okay to learn how to genuinely enjoy new activities, and to turn that skill toward activities that don’t seem all that fun now but are useful in the long term. This seems like a common idea in discourse about motivation. I’d be surprised if most people here didn’t already agree with it.
This made the first section boring to me and I was about to conclude that it’s yet another post restating obvious things in needlessly complicated terms and walk away. Fortunately, I kept on reading and got to the fun parts, but it was close.
I’m glad you found the first section obvious—I do, too, but I’ve gotten some pushback in this very community when posting the idea in an open thread a few months back. I think there are many people for whom the idea is non-obvious. (Also, I probably presented it better this time around.)
Note that the first point applies both to “things your brain thinks are generally enjoyable” and “end-values in your utility function”, though games in which you should change the latter can get pretty contrived.
I’m glad you stuck around for the later sections!
FWIW I found that section fairly obvious and found the disclaimers slightly surprising.
Yes, good post.
A related technique might be called the “tie-in.” So for example, the civil rights advocate who wants to quit smoking might resolve to donate $100 to the KKK if he smokes another cigarette, so that the goal of quitting smoking gets attached to the goal of not betraying one’s passionately held core beliefs.
In fact, one could say that most motivational techniques rely on some form of goal-tweaking.
I think that at least part of the benefit from telling yourself “I don’t get ego depletion” is from telling yourself “I don’t accept ego depletion as an excuse to stop working”.
If you model motivation as trying to get long-term goals accomplished while listening to a short-term excuse-to-stop-working generator, it matches up pretty well. I did a short test just now with telling myself “I don’t accept being tired as an excuse to stop rasping out a hole”, and I noticed at least two attempts to use just that excuse that I’d normally take.
If I’m hungry, eating feels like an end unto itself. Once I’m full, not eating feels like an end unto itself. If I’m bored doing a task, doing something else feels like an end unto itself, until I’ve worked on a new task enough to be bored. All my values seem to need justification to an extent, and labeling any of them as terminal values seems to cause significant cognitive dissonance at some point in time.
I haven’t finished this post yet, but the first section (on “hacking terminal values”) seems to be best illuminated by a paragraph in another, older post (by Yvain) I happened to be reading in another tab.
To wit:
… seems to be endorsing changing our “wanting” criteria to better accord with our “approving” criteria, as defined here. Interestingly, this is arguably a clearer version of the situations discussed lower down in that same article.
Nice post. However, it might be better to characterize the first two classes as beliefs which are true because of the belief, rather than as false beliefs (which is important so as not to unconsciously weaken our attachment to truth). For example, in your case of believing that water will help you feel better, the reason you believe it is that it is actually true by virtue of the belief; similarly, when the would-be rock star enjoys making music for its own sake, the belief that making music is fun is now true.
In the standard, one-shot, non-cooperative Prisoner’s Dilemma, “Defect” is the optimal strategy, regardless of what the other player does.
Incorrect. If the other player is very good at reading your intentions, and (with high probability) defects if and only if you defect, then you should cooperate.
Then this is not the standard Prisoner’s Dilemma. What you are describing is a sequential game where the other player effectively chooses their action after seeing your choice.
And in fact, the other player that you are describing is not even a player. It is just a deterministic environment.
Not necessarily. If you think that they’re good at reading your intentions, then you can achieve mutual cooperation even in a simultaneous choice game.
For another example, if you use TDT and you think that your opponent uses TDT, then you will cooperate—even if you’re playing a one-shot, no-binding-contracts, imperfect-knowledge sequential PD.
And even if you demand that players have no knowledge of each other (rather than imperfect knowledge), which seems unreasonable to me, then the result still depends upon your priors for the type of agent that you’re likely to face.
If your priors say there’s a strong chance of facing TDT agents who think there’s a strong chance of facing TDT agents who think etc. ad infinum, then you’ll cooperate even without seeing your opponent at all.
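For concreteness, here’s a minimal sketch of that prior-dependence (Python; the payoffs are the usual illustrative numbers, and the “mirror” opponent, which simply plays whatever you play, is a deliberately crude stand-in for an agent that predicts you well, not a definition of TDT):

```python
# Expected payoff of each move given a prior p_mirror of facing a "mirror"
# (an opponent whose move always matches yours); everyone else is assumed
# to be an unconditional defector. Both assumptions are simplifications.

PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def expected_payoff(my_move, p_mirror):
    vs_mirror = PAYOFF[(my_move, my_move)]   # the mirror copies my move
    vs_defector = PAYOFF[(my_move, "D")]     # everyone else just defects
    return p_mirror * vs_mirror + (1 - p_mirror) * vs_defector

for p in (0.0, 0.3, 0.6, 0.9):
    print(p, expected_payoff("C", p), expected_payoff("D", p))
# With these numbers, cooperating pulls ahead once p_mirror exceeds 1/3.
```

Nothing here settles whether such priors are reasonable (that’s the rest of this thread); it only shows where they enter the calculation.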
How?
AFAIK, TDT was never defined precisely. Assuming that TDT is supposed to do something equivalent to Hofstadter’s superrationality, it has the problem that it is vulnerable to free riders.
If you have a strong prior that you are facing a TDT agent, then you incentivize other players not to use TDT, in order to exploit you by defecting; hence you should lower your prior and play defect, which still incentivizes other players to defect. In the end you get the good old Nash equilibrium.
Timeless Decision Theory paper.
Not particularly. I recommend reading the paper.
My other comment might have come across as unnecessarily hostile. For the sake of a productive discussion, can you please point out (or provide some more specific reference for) how TDT can possibly succeed at achieving mutual cooperation in the standard, one-shot, no-communication prisoner’s dilemma? Because, in general, that doesn’t seem to be possible.
I mean, if you are certain (or highly confident) that you are playing against a mental clone (a stronger condition than just using the same decision theory), then you can safely cooperate.
But, in most scenarios, it’s foolish to have a prior like that: other agents that aren’t mental clones of yours will shamelessly exploit you. Even if you start with a large population of mental clones playing anonymous PD against each other, if there are mutations (random or designed), then as soon as defectors appear they will start exploiting the clones that blindly cooperate, at least until the clones have updated their beliefs and switched to defecting, which yields the Nash equilibrium.
Blindly trusting that you are playing against mental clones is a very unstable strategy.
In the program-swap version of the one-shot prisoner’s dilemma, strategies like the CliqueBots (and generalizations) can achieve stable mutual cooperation, because they can actually check, for each game, whether the other party is a mental clone (there is the problem of coordinating on a single clique, but once the clique is chosen, there is no incentive to deviate from it).
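A minimal sketch of that CliqueBot idea (Python, run as a script so inspect.getsource can read the function bodies; the “submit a program, see the other program’s source” interface and the use of exact textual equality are simplifying assumptions for illustration):

```python
import inspect

def clique_bot(opponent_source: str) -> str:
    # Cooperate only with exact copies of myself; defect against everything else.
    my_source = inspect.getsource(clique_bot)
    return "C" if opponent_source == my_source else "D"

def defect_bot(opponent_source: str) -> str:
    return "D"  # ignores the opponent's source entirely

def play(bot_a, bot_b):
    # Each program gets to inspect the other's source before choosing a move.
    return bot_a(inspect.getsource(bot_b)), bot_b(inspect.getsource(bot_a))

print(play(clique_bot, clique_bot))  # ('C', 'C'): mutual cooperation inside the clique
print(play(clique_bot, defect_bot))  # ('D', 'D'): outsiders can't exploit it
```

The source check is exactly what the no-communication game lacks: with nothing to condition cooperation on, the argument above collapses back to defection.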
But achieving mutual cooperation in the standard one-shot PD seems impossible without making unrealistic assumptions about the other players. I don’t think that even Yudkowsky or other MIRI people argued that TDT can achieve that.
That’s the infamous 120-page draft (or has it been updated?)
I had started reading it some time ago, found some basic errors, lots of rambling and decided it wasn’t worth my time.
Anyway, I found a good critique of superrationality: http://www.science20.com/hammock_physicist/whats_wrong_superrationality-100813
The linked article says something like: superrationality is not necessary. All you need is to realize that in real life there are no single-shot PDs, and therefore you should use the optimal strategy for the iterated PD, which cooperates on the first move.
That’s simply refusing to deal with the original question and answering something different instead.
No, it says that one-shot PDs are rare, and that when one actually happens, defecting is indeed the correct choice, even if it is counterintuitive, because we are much more accustomed to scenarios that resemble the iterated PD.
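For what it’s worth, here’s a tiny iterated-PD sketch (Python; the payoff numbers, the 100-round horizon, and the choice of tit-for-tat as the reciprocating strategy are all illustrative assumptions) showing the sense in which initial cooperation pays in the repeated game while defection still dominates a single round:

```python
# Payoffs from my point of view: (my_move, their_move) -> my score.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def iterate(strategy_a, strategy_b, rounds=100):
    """Score strategy_a against strategy_b over repeated rounds."""
    history_a, history_b, score_a = [], [], 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        score_a += PAYOFF[(move_a, move_b)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a

# Strategies take (own_history, opponent_history) and return a move.
tit_for_tat = lambda mine, theirs: "C" if not theirs else theirs[-1]
always_defect = lambda mine, theirs: "D"

print(iterate(tit_for_tat, tit_for_tat))    # 300: mutual cooperation every round
print(iterate(always_defect, tit_for_tat))  # 104: one exploitation, then mutual defection
print(iterate(tit_for_tat, always_defect))  # 99: loses only the opening round
print(iterate(always_defect, always_defect, rounds=1),
      iterate(tit_for_tat, always_defect, rounds=1))  # 1 vs 0: one shot, defect wins
```

In a genuinely single round the cooperator just gets exploited, which is the article’s point about truly one-shot situations.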
Great post. The points strike me as rather obvious, but now that there’s an articulately-written LessWrong post about them I can say them out loud without embarrassment, even among fellow rationalists.
That said, the post could probably be shorter. But not too much shorter! Too short wouldn’t look as respectable.
This highlights all the difficulties in even making sense of a notion of rationality (I believe ‘believing truth’ is well defined, but that there is no relation one can define between OBSERVATIONS and BELIEFS that corresponds to our intuitive notion of rationality).
In particular, your definition of ‘rational’ seems to be something about satisfying the most goals, or some other act-based notion of rationality (not merely the attempt to believe the most truths). However, this creates several natural questions. First, if you would change your goals given sufficient time and sufficiently clear thinking, does it still count as rational in your sense to achieve them? If so, then you end up with the strange result that you probably SHOULDN’T spend too much time thinking about your goals or otherwise trying to improve your rationality. After all, it is reasonably likely that your goals would change if subjected to sufficient consideration, and if you do end up changing those goals you now almost certainly won’t achieve the original goals (which are what it is rational to achieve), while contemplation and attempts to improve the clarity of your thinking probably don’t offer enough practical benefit to make it, on net, more likely that you will achieve your original goals. This result seems ridiculous and in deep conflict with the idea of rationality.
Alternatively, suppose the mere fact that you would change your goals with enough clear-eyed reflection means that rational action is the action most likely to achieve the goals you would adopt with enough reflection, rather than the goals you would adopt without it. This too leads to absurdities.
Suppose (as I was until recently) I’m a mathematician, and I’m committed to solving some rather minor but interesting problem in my field. I don’t consciously realise that I adopted that goal because it is the most impressive thing in my field that I haven’t rejected as infeasible, but that correctly describes my actual dispositions, i.e., if I discover that some other, far more impressive result is something I can prove, then I will switch over to wanting to do that. Now, almost certainly there is at least one open problem in my field that is considered quite hard but actually has some short clever proof, but since I currently don’t know what it is, every problem considered quite hard is something I am inclined to think is impractical.
However, since rationality is defined as those acts which increase the likelihood that I will achieve the goal I would have had IF I had spent arbitrarily long clearheadedly contemplating my goals, and given enough time I could consider every short proof, it follows that I ACT RATIONALLY WHENEVER MY ACTIONS MAKE ME MORE LIKELY TO SOLVE THE HARD MATH PROBLEM IN MY FIELD THAT HAPPENS TO HAVE AN OVERLOOKED SHORT PROOF, EVEN THOUGH I HAVE NO REASON TO PURSUE THAT PROBLEM CURRENTLY. In other words, I end up being rational just when I do something that is intuitively deeply irrational, i.e., for no discernible reason I happen to ignore all the evidence that suggests the problem is hard and happen to switch to working on it.
This isn’t merely an issue for act rationality as discussed here, but also for belief rationality. Intuitively, belief rationality is something that should help me believe true things. Now ask whether it is more rational to believe, as all the current evidence suggests, that the one apparently hard but actually fairly easy math problem is hard, or easy. If rationality is really about coming to more true beliefs, then it is ALWAYS MORE RATIONAL TO RANDOMLY BELIEVE THAT AN EASY MATH PROBLEM IS EASY (OR A PROVABLE MATHEMATICAL STATEMENT IS TRUE) THAN TO BELIEVE WHATEVER THE EVIDENCE SAYS ABOUT THE PROBLEM. Yet this is in deep conflict with our intuition that rationality should be about behaving in some principled way with respect to the evidence, and not making blind leaps of faith.
Ultimately, the problem comes down to the lack of any principled notion of what counts as a rule for decision making. There is no principled way to distinguish the rule that says ‘Believe what the experts in the field and other evidence tell you about the truth of unresolved mathematical statements’ from the rule ‘Believe what the experts in the field and other evidence tell you about the truth of unresolved mathematical statements, except for statement p, which you should believe with complete confidence.’ Since the second rule always yields more true beliefs than the first, it should be more rational to accept it.
This result is clearly incompatible with our intuitive notion of rationality so we are forced to admit the notion itself is flawed.
Note that you can’t avoid this problem by insisting that we have reasons for adopting one belief over another or anything like this. After all, consider someone whose basic belief-formation mechanism didn’t cause them to accept A if A & B was asserted. They are worse off than us in the same way that we were worse off than the person who randomly accepted ZFC → FLT (Fermat’s Last Theorem is true if set theory is true) before Wiles provided any proof. There is a brute mathematical fact that each is inclined to infer without further evidence and that always serves to help them reach true beliefs.
Are you aware of the distinction between epistemic rationality and instrumental rationality? Although “seeking truth” and “achieving goals” can be put at odds, that’s no excuse to throw them both out the window.
I don’t think that the article is saying you should completely abandon terminal goals and truth-seeking altogether. It sounds to me like it’s saying that while in the vast majority of situations it is better to seek truth and not change terminal goals, there are particular circumstances where changing them is the right thing to do. For instance, suppose you accidentally saw more than 50% of the exam answers of a friend who got 100% on a short but important final exam, and you did not have the option of taking a different exam, delaying it, or ever retaking it. Would you intentionally fail the exam? Or would you compartmentalize your knowledge of the answers so that you cannot access it during the exam, and take the exam the way you would have if you hadn’t seen your friend’s answers? In this scenario you would probably have to either change the terminal goal the exam is instrumental to, or intentionally hide your knowledge of the answers from yourself and avoid seeking the truth about them in your own mind, at least until the exam is over.
Also, I’m really scared of using these techniques because I have been conditioned not to trust myself at all if I lie to myself. Does it count as compartmentalization to ignore everything I just read here and pretend to myself that I should definitely never lie to myself intentionally, at least until I feel ready to do so without losing a large portion of my sanity and intellectual autonomy? I’m pretty sure already that the answer is yes.
However, I’m kind of new at actively thinking about and asking my own questions about my own thoughts and beliefs. I do not feel like I have observed enough examples of the quality of my own reasoning to completely counteract the (most likely false) belief that I should not be intellectually autonomous, because relying on my own reasoning ability is more likely to hurt others and myself than to help.
For most of my life I have been conditioned to believe that, and it has only been very recently that I have started making progress towards eliminating that belief from my mind, rather than compartmentalizing it. I’m worried that using compartmentalization intentionally could significantly interfere with my progress in that regard.
I’m only just managing to hold this problem off right now, and that task is taking more energy and concentration than I think is realistic to be able to allocate to it on a regular basis.
If I tell myself that I don’t need to be honest with myself about myself and my thoughts 100% of the time, and that what matters is that I’m honest with myself about myself and my thoughts most of the time and only dishonest with myself when it’s necessary, then it’s probably going to be disproportionately difficult to trust myself when I test my own honesty with myself and find out that I’m being honest.
Any advice please? I’m rather inexperienced with this level of honest cognitive self-analysis (if that’s what it’s called) and I think I might be somewhat out of my league with this problem. Thanks!
From what you’ve told me, I strongly recommend not using any of the techniques I mentioned until you’re much more confident in your mental control.
It seems that we come from very different mental backgrounds (I was encouraged to be intellectually autonomous from a young age), so you should definitely take my suggestions with caution, as it’s likely they won’t work for people with your background.
It sounds to me like you’re in the early stages of taking control over your beliefs, and while it seems you’re on the right track, it doesn’t sound to me like my techniques would be helpful at this juncture.
So I should continue giving my very best effort to be completely honest with myself, and just hope I don’t ever find myself in a catch-22 scenario like the one I just described before I’m ready. Admitting that lying to myself COULD be my best option in particular kinds of situations is not the same as actually being in such a situation and having to take that option. Whew! I was freaking out a bit, worrying that I would have to compartmentalize the information in your article in order to avoid using the techniques in it. Now I realize that was kind of silly of me.
Thanks for your help!