I like the overall framing, which goes from intervening on the minutiae up to long-term, big-picture interventions, and which correctly notes that optimising for truth at each level does not look the same, and that such strategies can even be in conflict.
I want to think more concretely about what short-term and long-term interventions look like, so I’ll try to categorise a bunch of recent ideas on LessWrong, by looking back at all the curated posts and picking ones I think I can fit into this system. I want to do this to see if I’m getting the right overall picture from Geoff’s post, so I’m gonna do this in a pretty fast and loose way, and I assign about a 35% probability that a lot of these posts are severely misplaced.
I think there are two main axes here: one is the period of time over which you observe and then make the intervention, and the other is whether you’re looking at an individual or a group. I’ll start just with individuals.
I think that thought regulation evaluates whether particular thoughts are acceptable. This feels to me like the most rigorous type of analysis. Eliezer’s Local Validity as Key to Sanity and Civilization is about making sure each step of reasoning follows from the previous, so that you don’t wander into false conclusions from true premises. Abram’s post Mistakes with conservation of expected evidence is an example of taking the basic rules of reasoning and showing when particular thoughts are improper. This isn’t a broad heuristic, it’s a law, and comes with a lot of rigour. These are posts about moving from thought A to thought B, and whether thought B is allowed given thought A.
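To make the “law” concrete, the identity Abram’s post is named after is (as I understand it) just:

$$P(H) \;=\; P(E)\,P(H\mid E) \;+\; P(\neg E)\,P(H\mid \neg E)$$

i.e. your current credence is already the expectation of your post-evidence credence, so if observing E would move you up, failing to observe E must move you down; a thought that updates in the same direction whatever the evidence says is improper.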
If I frame train of thought regulation as being about taking short walks that aren’t all definitely locally valid steps, while making sure that you end up in a place that is true, I think this often looks like ‘wearing hats’ or ‘red teaming’ or ‘doing perspective taking’: you try out a frame of thinking that isn’t your best guess at the truth but that captures something you’ve not been thinking about, and it ends up producing a concrete hypothesis to test or a piece of evidence you’d missed, which you still find valuable after you take the frame off.
Some examples of this include alkjash’s Babbling then Pruning, which is about generating many thoughts that don’t meet your high standards and then pruning them down to only the good ones, and my recommendation to Hold On To The Curiosity, which can involve saying statements that are not accurate according to your all-things-considered view while you search for the thing you’ve noticed. Habryka’s post Models of Moderation tries on a lot of different perspectives in short succession, none of which seem straightforwardly true to him but all of which capture some important aspect of the problem, where the next step is finding solutions that score highly on lots of different perspectives at once. Scott’s If It’s Worth Doing, It’s Worth Doing With Made-Up Statistics involves building a false-ish model that makes a true point, which has some similarity. This category maybe also includes Jessicata’s Writing Children’s Picture Books, which is a frame for thinking about a subject for a while.
A different post that naturally fits in here is Abram’s Track-Back Meditation, where you just practice noticing your trains of thought. Eliezer’s writing on humility also covers making sure you check that your train of thought was actually accurate.
The OP says the next level is about rules. If I think of it as basically being about trains of thought in the plural rather than individual trains, I’ll say the next level is about multi-train-of-thought regulation. I think a central example here would be Anna’s “Flinching away from truth” is often about *protecting* the epistemology. This post feels like it’s saying that you will often have mildly broken trains of thought, and that trying to fix them at the level of never letting yourself believe a single false thought, or never letting a train of thought conclude in a false place, will be bad, because sometimes the reason you’re doing that is to make sure the most important, big-picture thoughts are true. As long as you notice when you seem to be avoiding true thoughts, and look into what implicit buckets you’re making, you’ll be able to think the important true thoughts without breaking things in the meantime by trying to fix everything locally in a way that messes up the bigger picture.
I think Paul’s post Argument, Intuition and Recursion also fits into this category. I’d need to read it again carefully to be sure, but I recall it primarily being about how to ensure you’re moving in the true direction in the long run when you often can’t get at the ground truth in reasonable amounts of time (if you cannot check whether each of your trains of thought terminated somewhere actually true), and how to learn to trust alternative sources of information and ideas.
Plausibly much of Brienne’s writing about noticing (at her blog Agenty Duck) fits in here as well, which is about increasing your long-term ability to bring important parts of your experience into your trains of thought. It’s not about any one train of thought ending right or wrong, but improving them more generally.
That said, this section was the hardest for me to find posts on (I feel like there’s loads for the others), which is interesting, and perhaps suggests we’re neglecting this facet of rationality on LessWrong.
Then we move on to individual holistic regulation, which feels to me like it is about stepping into a very complex system, trying to understand it, and recommending a high-level change to its trajectory. This isn’t about getting particular thoughts or trains of thought right; it’s about asking where the whole system is and how all the parts work. Kaj’s post Building up to an Internal Family Systems model feels like it assumes you’ll never get perfect thoughts all of the time, but that you can build a self-model that will help you notice the main culprits of bad outcomes and address those head-on from time to time. Ray’s Strategies of Personal Growth works on this level too. Zvi’s post Slack is about noticing whether you have the sort of environment that allows you the space to complete the important trains of thought, and, if not, that you should do something about it. There isn’t currently a notion of perfect slack and there’s no formula for it (yet), but it’s a really useful high-level heuristic.
---
Looking at it this way, I notice the posts I listed started on the more rigorous end and became less rigorous as I went along. I wonder if this suggests that when you understand something very deeply, you can simply label individual thoughts as good or bad, but when you have a much weaker grasp you can only notice the pattern with massive amounts of data, and even then only vaguely. I’ve often said that I’d like to see the notion of Slack formalised, and that I bet it would be really valuable, but for now we’ll have to stick to Zvi’s excellent poetry.
---
Anyhow, Geoff: even though I’d guess you haven’t read most of the linked posts, I’m curious to know your sense of whether the above is doing a good job of capturing what you think of as the main axis of levels-of-intervention for individuals, or not. I’m also interested to hear from others if they feel like they would’ve put posts in very different categories, or if they want to offer more examples I didn’t include (of which there are many).
Plausibly much of Brienne’s writing about noticing (at her blog Agenty Duck) fits in here as well, which is about increasing your long-term ability to bring important parts of your experience into your trains of thought. It’s not about any one train of thought ending right or wrong, but improving them more generally.
Huh, the thing I get out of Brienne’s writing is actually “intervening on the level of direct thoughts”, more than any other rationality technique. ‘Noticing’ is the fundamental building block of all “intervene on direct thought” techniques.
Just riffing a bit on the same project you started :)
There’s integrity and accountability—integrity (Level 3) as following a certain decision theory and making it common knowledge that you do, such that others can reliably simulate you, and coordinate and make trades with you; and accountability as choosing who you want to do your individual holistic regulation (Level 4).
On another note, predictions and calibration training are often pitched as a kind of Level 1/2 intervention, but I’m more bullish on them as a Level 2 intervention with important Level 5 consequences.

It’s certainly often helpful to quantify your beliefs, and to form an all-things-considered opinion as an ensemble model of all the things you might trust. But to restrict your trains of thought to always follow an all-things-considered view, never veering off into resonating with a single model or world-view, is, as you point out, not that great. However, spreading the meme of being able to zoom out to an all-things-considered, quantitative opinion when necessary, and engaging with that level regularly enough to build a track record of being able to do that, seems like a core part of having a healthy Bayesian community, even if you actually use it quite infrequently compared to other modes of thinking (just as professional mathematicians riff on a post-rigorous level but can drop down to the rigorous level when need be). This is part of my current framing for the forecasting class I’m teaching at CFAR mainline workshops.
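To gesture at what that “zoom out to a quantitative, all-things-considered opinion” move could look like mechanically, here is a toy sketch of my own (the function names and numbers are made up, and this is not anything from the class): pool the probability estimates of the views you trust, and score your track record afterwards.

```python
import math

def to_log_odds(p):
    return math.log(p / (1 - p))

def from_log_odds(l):
    return 1 / (1 + math.exp(-l))

def all_things_considered(estimates, weights=None):
    # Weighted average of the individual views in log-odds space,
    # mapped back to a probability: a crude ensemble of the models you trust.
    if weights is None:
        weights = [1.0] * len(estimates)
    combined = sum(w * to_log_odds(p) for p, w in zip(estimates, weights)) / sum(weights)
    return from_log_odds(combined)

def brier_score(forecasts, outcomes):
    # Mean squared error of probabilistic forecasts against 0/1 outcomes;
    # lower is better, and a long record of these is the feedback loop of calibration training.
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Inside view says 0.9, the base rate says 0.5, a friend you trust says 0.7:
print(all_things_considered([0.9, 0.5, 0.7]))   # ≈ 0.73
print(brier_score([0.8, 0.3, 0.9], [1, 0, 1]))  # ≈ 0.047
```

The point isn’t the particular pooling rule; it’s that the all-things-considered level is something you can drop down to and be scored on, even if most of your actual thinking happens in single-model mode.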
There’s also a long list of other CFAR techniques one could analyse.
Eliezer’s and Abram’s posts are interesting Level 1 interventions, but look a lot like improvements to your slow, deliberate, conscious thinking processes, perhaps eventually becoming ingrained in your S1. I’d compare that with TAPs, which seem to intervene quite directly at Level 2 (and probably with backchaining effects to Level 1): “what thoughts do I want to follow from other thoughts?” [1]
This also seems to me to be the core of what makes CBT work, whereby you uncover unwanted trains (“Get invite to social event” → “Visualise public shame from making an embarrassing comment” → “Flinch away from invite”), and then intervene to change their trajectory.
This raises the question of whether there are any more direct interventions at Level 1: interventions determining which thoughts, in and of themselves, are even desirable or not. I interpret Selective reporting and Lines of retreat as analysing such interventions. I read the former (a bit extrapolated) as noting that if there are some unitary thoughts we cannot think, regardless of whether we actually believe them, this can cause large mistakes elsewhere in our belief system. The latter tries to tackle the problem when the blocker is motivational rather than social, by embedding the thoughts in conditionals and building a backup plan before considering whether it has to be used.
Then there’s goal factoring, which is closely related to separation of concerns: don’t take actions which confusedly optimise for several orthogonal goals at once; separate out your desires and optimise them separately. This probably has implications at Levels 1 through 4.
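A toy way to see the “optimise them separately” point (entirely made-up numbers, just my own illustration of the idea): an action chosen because it sort-of serves two goals at once can lose to a pair of actions each chosen for one goal.

```python
# Made-up scores for how well each action serves each goal.
actions = {
    "gym_with_friends": {"fitness": 5, "social": 5},
    "solo_training":    {"fitness": 9, "social": 0},
    "board_game_night": {"fitness": 0, "social": 9},
}

# Bundled: pick the single action with the best combined score.
bundled = max(actions, key=lambda a: sum(actions[a].values()))

# Factored: pick the best action for each goal separately.
factored = {goal: max(actions, key=lambda a: actions[a][goal])
            for goal in ("fitness", "social")}

print(bundled)   # 'gym_with_friends' (total score 10)
print(factored)  # {'fitness': 'solo_training', 'social': 'board_game_night'} (total 18)
```

Obviously real goal factoring is about surfacing the goals in the first place rather than the arithmetic, but the arithmetic shows why unbundling can be worth it.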
I could go on through the CFAR techniques and might at a later point, but that will do for now.
[1] This looks more like “epistemic TAPs”, or “internal TAPs”, which haven’t yet become a standard part of the mainline curriculum; there, TAPs are often more external, for things like “deciding to take the stairs instead of the elevator as soon as I come into the office and look at them”.
Nitpick. Mildly triggered by:

These are posts about moving from thought A to thought B, and whether thought B is allowed given thought A.
“Allowed” is of course a very social term, and one that sounds a lot like “will my teacher accept it if I make this inference?”
That’s different from the mathematical mindset of asking what happens if I make that inference, and whether that thing is interesting/elegant/useful. What does it capture to have those kinds of inference rules, and does it capture the kind of process I want to run or not?
Moreover, when it comes to Bayesian reasoning and its various generalisations, the correct inference is _inevitable_, not optional. There is one single credence which is correct to hold given your priors and the evidence you’ve observed. (Compare this to old-school rationality, in the vein of Popper and Feynman, which thought more in terms of you being “allowed” to hold a variety of beliefs as long as you hadn’t been refuted by experiment. I can’t find the reference post for this now, though.)
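To spell out the “inevitable” part with made-up numbers: if your prior on H is 0.3, and you think the evidence E is four times as likely under H as under not-H, Bayes’ theorem leaves exactly one place to land:

$$P(H\mid E) = \frac{P(E\mid H)\,P(H)}{P(E\mid H)\,P(H) + P(E\mid \neg H)\,P(\neg H)} = \frac{0.8 \times 0.3}{0.8 \times 0.3 + 0.2 \times 0.7} \approx 0.63$$

Any other credence, given those inputs, is simply a mistake; there’s no menu of permitted beliefs to pick from.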
Agreement.

(The reason I framed it in the style of “am I allowed this thought” / “will my teacher accept it if I make this inference?” is because that’s literally the frame used in the post ;P)
(I want to note that I’m quite interested in having a conversation about the above, both with Geoff but also with others who have thought a lot about rationality.)