Transcript since I find the above basically impossible to read (I have to go and do something else for a bit; will transcribe more when I’m done):
[note: I have not tried to e.g. turn underlining into italics etc.; this was enough effort as it was; nor does my spacing exactly match the original.]
----
Abram’s Machine-Learning model of the benefits of meditation
(a synthesis of Zizian “fusion” and Shinzen’s explanation of meditation, also inspired by some of the ideas in Kaj Sotala’s “My attempt to explain Looking and enlightenment in non-mysterious terms” … but this model is no substitute for those sources and does not summarize what they have to say)
note that I am not an experienced meditator; let that influence your judgement of the validity of what I have to say as it may.
(also heavily influenced by my CFAR level 2 workshop experience)
My immediate inspiration for postulating this model was noticing that after just a little meditation, tolerating cold or hot shower temperatures was much easier.
[picture: person in hot/cold shower]
I had previously been paying attention to what happens in my mind when I flinch away from too-hot or too-cold temperatures in the shower, as a way to pay attention to “thoughts which lead to action”.
There are several reasons why it might be interesting to pay attention to thoughts which lead to action.
1. “Where’s the steering wheel on this thing, anyway?” [picture: confusing car dashboard] If you’re experiencing “motivational issues”, then it stands to reason that it might be useful to keep an eye on which thoughts are leading to actions and which are not.
2. “Who [or what] is steering this thing?” [picture: car with various people in it] Far from being alone in a mysterious spacecraft, it is more like we are on a big road trip with lots of backseat driving and fighting for the wheel, if you buy the multiagent mind picture.
We often think as if we were unitary, and blame any failings of this picture on a somewhat mysterious limited resource called “willpower”. I’m not implying willpower models are wrong exactly; I’m unsure of what is going on. But bear with me on the multiagent picture...
I think there is a tendency to gravitate toward narratives where an overarching self with coherent goals drives everything—missing the extent to which we are driven by a variety of urges such as immediate comfort. So, I think it is interesting to watch oneself and look for what really drives actions. You don’t often eat because eating is necessary for continuing proper function of body & brain in order to use them for broader goals; you eat because food tastes good / you’re hungry / etc.
Well, maybe. You have to look for yourself. But, it seems easy to mistakenly rationalize goals as belonging to a coherent whole more so than is the case.
Why would we be biased to think we are alone in an alien spaceship which we only partly know how to steer, when in fact we are fighting for the wheel in a crowded road-trip?
[picture: same car as before, loudmouth backseat driver circled]
Well, maybe it is because the only way the loudmouth (that is to say, consciousness) gets any respect around here is by maintaining the illusion of control. More on that later.
3. A third reason to be interested in “thoughts which lead to action” is that it is an agentless notion of decision.
Normally we think of a decision as made by an atomic agent which could have done one of several things, chooses one, and does it. [picture: person labelled “agent” with “input” and “output” arrows, and “environment” outside] In reality, there is no solid boundary between an agent and its environment; no fixed interface with a well-defined set of actions which act across the interface.
[picture: brain, spinal cord, muscles, eyeballs, bones, arrows, with circles sketched in various places]
Instead, there are concentric rings where we might draw such a boundary. The brain? The nerves? The muscles? The skin?
With a more agentless notion of agency, you can easily look further out.
Does this person’s thought of political protest cause such a protest to happen? Does the protest lead to the change which it demands?
Anyway. That is quite enough on what I was thinking in the shower. [picture: recap of some of the pictures from above, in a thought bubble] The point is, after meditation, the thoughts leading to action were quite different, in a way which (temporarily) eliminated any resistance which I had to going under a hot or cold shower which I knew would not harm me but which would ordinarily be difficult to get myself to stand under.
(I normally can take cold-water showers by applying willpower; I’m talking about a shift in what I can do “easily”, without a feeling of effort.)
So. My model of this:
I’m going to be a little bit vague here, and say that we are doing something like some kind of reinforcement learning, and the algorithm we use includes a value table:
[picture: table, actions on x-axis, states on y-axis, cells of table are estimated values of taking actions in states]
A value isn’t just the learned estimate of the immediate reward which you get by taking an action in a state, but rather, the estimate of the eventual rewards, in total, from that action.
This makes the values difficult to estimate.
An estimate is improved by value iteration: passing current estimates of values back along state transitions to make values better-informed.
[picture: table like above, with arrows, saying “if (s1,a1)->(s2,a2) is a common transition, propagate backward along the link (s1,a1)<-(s2,a2)”]
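The backward-propagation step described above can be sketched as a tabular backup. This is a minimal illustration only; the state/action names, discount factor, and learning rate are made up for the example, not taken from the original.

```python
# Minimal sketch of propagating value estimates backward along a
# known transition (s1, a1) -> (s2, a2) in a tabular value table.
GAMMA = 0.9   # discount factor (illustrative)
ALPHA = 0.5   # learning rate (illustrative)

# Q[(state, action)] = estimated total eventual reward
Q = {("s1", "a1"): 0.0, ("s2", "a2"): 10.0}

def backup(Q, s1, a1, reward, s2, a2):
    """Move Q(s1, a1) toward reward + discounted Q(s2, a2)."""
    target = reward + GAMMA * Q[(s2, a2)]
    Q[(s1, a1)] += ALPHA * (target - Q[(s1, a1)])

backup(Q, "s1", "a1", 0.0, "s2", "a2")
print(Q[("s1", "a1")])  # value has flowed backward from (s2, a2)
```

One backup moves the earlier cell's estimate part of the way toward the later cell's discounted value; repeated backups along common transitions are what "make values better-informed."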
For large state & action sets, this can be too expensive: we don’t have time to propagate along all the possible (state,action) transitions.
So, we can use attention algorithms to focus selectively on what is highest-priority to propagate.
The goal of attention is to converge to good value estimates in the most important state,action pairs as efficiently as possible.
Now, something one might conceivably try is to train the attention algorithm based on reinforcement learning as well. One might even try to run it from the very same value table:
[picture: value table as before, actions partitioned into “thinking” actions that propagate values and “standard external actions”]
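The partition in the picture can be sketched as one table whose action set contains both external acts and "thinking" acts that trigger backups, all scored in the same value currency. A toy layout (action names are mine, purely illustrative):

```python
# One value table covering both kinds of actions.  "Thinking" actions
# choose which backup to perform; external actions act on the world.
THINKING = {"propagate_shower", "propagate_fridge"}
EXTERNAL = {"step_into_shower", "open_fridge"}

Q = {a: 0.0 for a in THINKING | EXTERNAL}

def choose(Q):
    """Greedy choice over the whole table: attention competes with
    external action for the same value currency."""
    return max(Q, key=Q.get)

Q["propagate_shower"] = 1.0  # e.g. high apparent value-of-information
print(choose(Q))
```

Because attentional acts are rewarded from the same table they help maintain, a pattern of attention can in principle inflate its own score, which is the opening for the pathology described next.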
The problem with this design is that it can allow for pathological self-reinforcing patterns of attention to emerge. I will provocatively call such self-reinforcing patterns “ego structures”. An ego structure doesn’t so much feed on real control as on the illusion of control.
[picture: loudmouth-representing-consciousness from before, saying “I told you so!”]
The ego structure gets its supply of value by directing attention to its apparent successes and away from its apparent failures, including focusing on interpretations of events which make it look like the ego had more control than it did during times of success, and less than it did in cases of failure.
[picture: car with loudmouth saying “I never said to go left!”]
Some of this will sound quite familiar to students of cognitive bias. One might normally explain these biases (confirmation bias, optimism bias, attribution bias) as arising from interpersonal incentives (like signalling games).
I would not discount the importance of those much, but the model here suggests that internal dynamics are also to blame. In my model, biases arise from wireheading effects. In the societal analogies mentioned earlier, we’re looking at regulatory capture and rent-seeking.
This is rather fuzzy as a concrete mathematical model because I haven’t specified any structure like an “interpretation”—but I suspect details could be filled in appropriately to make it work. (Specifically, model-based reinforcement needs to somehow be tied in.)
Anyway, where does meditation come in?
My model is that meditation entices an ego structure with the promise of increased focus (i.e., increased attentional control), which is actually delivered, but at the same time dissolves ego structures by training away any contortions of attention which prevent value iteration from spreading value through the table freely and converging to good estimates efficiently.
[picture: happy meditating person with value table including “updating” actions in his/her head]
How does it provide increased control while dissolving control structures?
Well, what are you training when you meditate? Overtly, you are training the ability to keep attention fixed on one thing. This is kind of a weird thing to try to get attention to do. The whole point of attention is to help propagate updates as efficiently as possible. Holding attention on one thing is like asking a computer to load the same data repeatedly. It doesn’t accomplish any computation. Why do it?
[picture: same meditation with an empty-set symbol instead of face + value table]
Well, it isn’t quite a no-operation. Often, the meditative focus is on something which you try to observe in clarity and detail, like the sensations of the body. This can be useful for other reasons.
For the sake of my model, though, think of it as the ego structure trying to keep the attention algorithm in what amounts to a no-op, repeatedly requesting attention in a place where the information has already propagated.
[picture: value table with lots of arrows between the same pair of cells]
The reason this accomplishes anything is that the ego is not in complete control. Shadows dance beneath the surface.
[picture: same value table with a bunch of other arrows sketched in too]
The ego is a set of patterns of attention. It has “attachments”—obligatory mental gymnastics which it has to keep up as part of the power struggle. Indeed, you could say it is only a set of attachments.
In CFAR terms, an attachment is a trigger-action pattern.
Examples:
“Oh, I mustn’t think that way” (rehearsing a negative association to make sure a specific attention pattern stays squished)
Getting up & getting food to distract yourself when you feel sad
Rehearsing all the reasons you’ll definitely succeed whenever a failure thought comes up
Meditation forces you to do nothing whenever these thoughts come up, because the only way to maintain attention at a high level is to develop what is called equanimity: any distracting thought is greeted and set aside in the same way, neither holding it up nor squishing it down. No rehearsal of why you must not think that way. No getting up to go to the fridge. No rehearsal of all the reasons why you will definitely succeed.
Constantly greeting distractions with equanimity and setting them aside fills the value table with zeros where attachments previously lived.
[picture: value table with a bunch of zeros in the left portion labelled “attentional acts”]
Note that these are not fake zeros. You are not rewriting your true values out of existence (though it may feel that way to the ego). You are merely experimenting with not responding to thoughts, and observing that nothing terrible happens.
Another way of thinking about this is un-training the halo effect (though I have not seen any experimental evidence supporting this interpretation). Normally, all entities of conscious experience are imbued with some degree of positive or negative feeling (according to cognitive bias research & experienced meditators alike), which we have flinch-like responses to (trigger-action patterns). Practicing non-response weakens the flinch, allowing more appropriate responses.
Putting zeros in the table can actually give the ego more control by eliminating some competition. However, in the long term, it destabilizes the power base.
You might think this makes sustained meditative practice impossible by removing the very motivational structures trying to meditate; and perhaps it sometimes works that way. Another possibility is that the ego is sublimated into a form which serves to sustain the meditative practice, the skills of mental focus / mindfulness which have been gained, and the practice of equanimity. This structure serves to ensure that propagation of value through the table remains unclogged by attachments in the future. Such a structure doesn’t need to play games to get credit for what it does, since it is actually useful.
Regardless, my advice is that you should absolutely not take this model as an invitation to try and dissolve your ego.
Perhaps take it as an invitation to develop better focus, and to practice equanimity in order to debias halo-effect related problems & make “ugh fields” slowly dissolve.
I have no particular indication that directly trying to “dissolve ego” is a safe or fruitful goal, however, and some reason to think that it is not. The indirect route to un-wireheading our cognitive strategies through a gently rising tide of sanity seems safest.
Speaking of the safety of the approach...
Why doesn’t “zeroing out” the value table destroy our values, again??
At sinceriously.fyi, Ziz talks about core vs structure.
Core is where your true values come from. However, core is not complex enough to interface with the world. Core must create structure to think and act on its behalf.
[picture: central golden circle with complex stuff radiating out from it]
“Structure” means habits of thinking and doing; models, procedures. Any structure is an approximation of how the values represented by the core play out in some arena of life.
So, in this model, all the various sub-agents in your mind arise from the core, as parts of the unfolding calculation of the policy maximizing the core’s values.
These can come into conflict only because they are approximations.
[picture: that car again, with core+radiating-stuff superimposed on it]
The model may sound strange at first, but it is a good description of what’s going on in the value-table model I described. (Or rather, the value-table model gives a concrete mechanism for the core/structure idea.)
The values in the table are approximations which drive an agent’s policy; a “structure” is a subset of the value table which acts as a coherent strategy in a subdomain.
Just removing this structure would be bad; but, it would not remove the core values which get propagated around the value table. Structure would re-emerge.
However, meditation does not truly remove any structure. It only weakens structure by practicing temporary disengagement with it. As I said before, meditation does not introduce any false training data; the normal learning mechanisms are updating on the simple observation of what happens when most of the usual structure is suppressed. This update creates an opportunity to do some “garbage collection” if certain structures prove unnecessary.
According to this model, all irrationality is coming from the approximation of value which is inherent in structure, and much of the irrationality there is coming from structures trying to grab credit via regulatory capture.
(“Regulatory capture” refers to getting undue favor from the government, often in the form of spending money lobbying in order to get legislation which is favorable to you; it is like wireheading the government.)
The reflective value-table model predicts that it is easy to get this kind of irrationality; maybe too easy. For example, addictions can be modeled as a mistaken (but self-reinforcing) attention structure like “But if I think about the hangover I’ll have tomorrow, I won’t want to drink!”
So long as the pattern successfully blocks value propagation, it can stick.
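Within the model, this self-reinforcing block can be sketched as a mask on propagation: so long as the backup from the "hangover" consequence is suppressed, the drinking action's value never updates downward. All state names and numbers here are illustrative, not from the original.

```python
GAMMA = 0.9  # discount factor (illustrative)

# Q-values; "drink" currently looks good because the hangover's
# negative value has never been propagated back to it.
Q = {("evening", "drink"): 5.0, ("morning", "hangover"): -20.0}

# The attention structure refuses to schedule this backup:
# "if I think about the hangover, I won't want to drink!"
blocked = {(("evening", "drink"), ("morning", "hangover"))}

def backup(Q, sa1, reward, sa2):
    if (sa1, sa2) in blocked:
        return  # the block: value never propagates along this link
    Q[sa1] = reward + GAMMA * Q[sa2]

backup(Q, ("evening", "drink"), 2.0, ("morning", "hangover"))
print(Q[("evening", "drink")])  # unchanged: still 5.0
```

If the block were lifted, one backup would drag the drink's value down toward 2.0 + 0.9 × (−20) = −16; the attention pattern persists precisely by preventing that update.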
(This should be compared with more well-studied models of such irrationality such as hyperbolic discounting.)
Control of attention is a computationally difficult task, but the premise of Buddhist meditation (particularly Zen) is that you have more to unlearn than to learn. In the model I’m presenting here, that’s because of wireheading by attentional structure.
However, there is some skill which must be learned. I said earlier that one must learn equanimity. Let’s go into what that means.
The goal is to form a solid place on which to stand for the purpose of self-evaluation: an attentional structure from which you can judge your other attentional structures impartially.
[picture: wisdom-seeker on mountaintop nonplussed at being told by lotus-sitting master that all that’s in the way of seeing himself is himself and he should simply stand aside]
If you react to your own thoughts too judgementally, you will learn to hide them from yourself. Better to simply try to see them clearly, and trust the learning algorithms of the brain to react appropriately. Value iteration will propagate everything appropriately if attention remains unblocked.
According to some Buddhist teachings, suffering is pain which is not experienced fully; pain with full mindfulness contains no suffering. This is claimed from experience. Why might this be true? What experience might make someone claim this?
Another idea about suffering is that it results from dwelling on a way that reality differs from how you want it to be which you can’t do anything about.
Remember, I’m speaking from within my machine-learning model here, which I don’t think captures everything. In particular, I don’t think the two statements above capture everything important about suffering.
Within the model, though, both statements make sense. We could say that suffering results from a bad attention structure which claims it is still necessary to focus on a thing even though no value-of-information is being derived from it. The only way this can persist is if the attention structure is refusing to look at some aspects of the situation (perhaps because they are too painful), creating a block to value iteration properly scoring the attentional structure’s worth.
For example, it could be refusal to face the ways in which your brilliant plan to end world hunger will succeed or fail due to things beyond your control. You operate under a model which says that you can solve every potential problem by thinking about it, so you suffer when this is not the case.
From a rationalist perspective, this may at first sound like a good thing, like the attitude you want. But it ruins the value-of-information calculations, ignores opportunity costs, and stops you from knowing when to give up.
To act with equanimity is to be able to see a plan as having a 1% chance of success and see it as your best bet anyway, if best bet it is—and in that frame of mind, to be able to devote your whole being toward that plan; and yet, to be able to drop it in a moment if sufficient evidence accumulates in favor of another way.
So, equanimity is closely tied to your ability to keep your judgements of value and your judgements of probability straight.
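In decision-theoretic terms, keeping those judgements straight is just computing expected value honestly: a 1% plan can still be the best bet. A toy computation (the plans and numbers are invented for illustration):

```python
# A plan with a 1% chance of success can still be the best bet if the
# payoff is large enough; equanimity means computing this honestly,
# without clinging to either the hope or the fear.
plans = {"moonshot": (0.01, 1000.0),   # (p_success, payoff)
         "safe_bet": (0.90, 5.0)}

best = max(plans, key=lambda k: plans[k][0] * plans[k][1])
print(best)  # "moonshot": expected value 10.0 beats 4.5
```

And the same honest computation tells you to drop the plan the moment another option's expected value overtakes it.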
Adopting more Buddhist terminology (perhaps somewhat abusively), we can call the opposite of equanimity “attachment”—to cling to certain value estimates (or certain beliefs) as if they were good in themselves.
To judge certain states of affairs unacceptable rather than make only relative judgements of better or worse: attachment! You rob yourself of the ability to make tradeoffs in difficult choices!
To cling to sunk costs: attachment! You rob your future for the sake of maintaining your past image of success!
To be unable to look at the possibility of failure and leave yourself a line of retreat: attachment! Attachment! Attachment!
To hunt down and destroy every shred of attachment in oneself—this, too, would be attachment. Unless our full self is already devoted to the task, this will teach some structure to hide itself.
Instead, equanimity must be learned gently, through nonjudgemental observation of one’s own mind, and trust that our native learning algorithm can find the right structure if we are just able to pay full attention.
(I say this not because no sect of Buddhism recommends the ruthless route—far from it—nor because I can derive the recommendation from my model; rather, this route seems least likely to lead to ill effects.)
So, at the five-second level, equanimity is just devoted attention to what is, free from immediate need to judge as positive or negative or to interpret within a pre-conceived story.
“Between stimulus and response there is a space. In that space is our power to choose our response. In our response lies our growth and our freedom.”—Viktor E. Frankl
There’s definitely a lot that is missing in this model, and incorrect. However, it does seem to get at something useful. Apply with care.
Transcript since I find the above basically impossible to read (I have to go and do something else for a bit; will transcribe more when I’m done):
[note: I have not tried to e.g. turn underlining into italics etc.; this was enough effort as it was; nor does my spacing exactly match the original.]
----
Abram’s Machine-Learning model of the benefits of meditation
(a synthesis of Zizian “fusion” and Shinzen’s explanation of meditation, also inspired by some of the ideas in Kaj Sotala’s “My attempt to explain Looking and enlightenment in non-mysterious terms” … but this model is no substitute for those sources and does not summarize what they have to say)
note that I am not an experienced meditator; let that influence your judgement of the validity of what I have to say as it may.
(also heavily influenced by my CFAR level 2 workshop experience)
My immediate inspiration for postulating this model was noticing that after just a little meditation, tolerating cold or hot shower temperatures was much easier.
[picture: person in hot/cold shower]
I had previously been paying attentin to what happens in my mind when I flinch away from too-hot or too-cold temperatures in the shower, as a way to pay attention to “thoughts which lead to action”.
There are several reasons why it might be interesting to pay attention to thoughts which lead to action.
1. “Where’s the steering wheel on this thing, anyway?” [picture: confusing car dashboard] If you’re experiencing “motivational issues”, then it stands to reason that it might be useful to keep an eye on which thoughts are leading to actions and which are not.
2. “Who [or what] is steering this thing?” [picture: car with various people in it] Far from being alone in a mysterious spacecraft, it is more like we are on a big road trip with lots of backseat driving and fighting for the wheel, if you buy the multiagent mind picture.
We often think as if we were unitary, and blame any failings of this picture on a somewhat mysterious limited resource called “willpower”. I’m not implying willpower models are wrong exactly; I’m unsure of what is going on. But bear with me on the multiagent picture...
I think there is a tendency to gravitate toward narratives where an overarching self with coherent goals drives everything—missing the extent to which we are driven by a variety of urges such as immediate comfort. So, I think it is interesting to watch oneself and look for what really drives actions. You don’t often eat because eating is necessary for continuing proper function of body & brain in order to use them for broader goals; you eat because food tastes good / you’re hungry / etc.
Well, maybe. You have to look for yourself. But, it seems easy to mistakenly rationalize goals as belonging to a coherent whole moreso than is the case.
Why would we be biased to think we are alone in an alien spaceship which we only partly know how to steer, whne in fact we are fighting for the wheel in a crowded road-trip?
[picture: same car as before, loudmouth backseat driver circled]
Well, maybe it is because the only way the loudmouth (that is to say, consciousness) gets any respect around here is by maintaining the illusion of control. More on that later.
3. A third reason to be interested in “thoughts which lead to action” is that it is an agentless notion of decision.
Normally we think of a decision made by an atomic agent which could have done one of several things; chooses one; and, does it. [picture: person labelled “agent” with “input” and “output” arrows, and “environment” outside] In reality, there is no solid boundary between an agent and its environment; no fixed interface with a well-defined set of actions which act across the interface.
[picture: brain, spinal cord, muscles, eyeballs, bones, arrows, with circles sketched in various places]
Instead, there are concentric rings where we might draw such a boundary. The brain? The nerves? The muscles? The skin?
With a more agentless notion of agency, you can easily look further out.
Does this person’s thought of political protest cause such a protest to happen? Does the protest lead to the change which it demands?
Anyway. That is quite enough on what I was thinking in the shower. [picture: recap of some of the pictures from above, in a thought bubble] The point is, after meditation, the thoughts leading to action were quite different, in a way which (temporarily) eliminated any resistance which I had to going under a hot or cold shower which I knew would not harm me but which would ordinarily be difficult to get myself to stand under.
(I normally can take cold-water showers by applying willpower; I’m talking about a shift in what I can do “easily”, without a feeling of effort.)
So. My model of this:
I’m going to be a little bit vague here, and say that we are doing something like some kind of reinforcement learning, and the algorithm we use includes a value table:
[picture: table, actions on x-axis, states on y-axis, cells of table are estimated values of taking actions in states]
A value isn’t just the learned estimate of the immediate reward which you get by taking an action in a state, but rather, the estimate of the eventual rewards, in total, from that action.
This makes the values difficult to estimate.
An estimate is improved by value iteration: passing current estimates of values back along state transitions to make values better-informed.
[picture: table like above, with arrows, saying “if (s1,a1)->(s2,a2) is a common transition, propagate backward along the link (s1,a1)<-(s2,a2)”]
For large state & action sets, this can be too expensive: we don’t have time to propagate along all the possible (state,action) transitions.
So, we can use attention algorithms to focus selectively on what is highest-priority to propagate.
The goal of attention is to converge to good value estimates in the most important state,action pairs as efficiently as possible.
Now, something one might conceivably try is to train the attention algorithm based on reinforcement learning as well. One might even try to run it from the very same value table:
[picture: value table as before, actions partitioned into “thinking” actions that propagate values and “standard external actions”]
“The problem with this design is that it can allow for pathological self-reinforcing patterns of attention to emerge. I will provocatively call such self-reinforcing patterns “ego structures”. An ego structure doesn’t so much feed on real control as on the illusion of control.
[picture: loudmouth-representing-consciousness from before, saying “I told you so!”]
The ego structure gets it supply of value by directing attention to its apparent successes and away from its apparent failures, including focusing on interpretations of events which make it look like the ego had more control than it did during times of success, and less than it did in cases of failure.
[picture: car with loudmouth saying “I never said to go left!”]
Some of this will sound quite familiar to students of cognitive bias. One might normally explain these biases (confirmation bias, optimism bias, attribution bias) as arising from interpersonal incentives (like signalling games).
I would not discount the importance of those much, but the model here suggests that internal dynamics are also to blame. In my model, biases arise from wireheading effects. In the societal analogies mentioned earlier, we’re looking at regulatory capture and rent-seeking.
This is rather fuzzy as a concrete mathematical model because I haven’t specified any structure like an “interpretation”—but I suspect details could be filled in appropriately to make it work. (Specifically, model-based reinforcement needs to somehow be tied in.)
Anyway, where does meditation come in?
My model is that meditation entices an ego structure with the promise of increased focus (i.e., increased attentional control), which is actually delivered, but at the same time dissolves ego structures by training away any contortions of attention which prevent value iteration from spreading value through the table freely and converging to good estimates efficiently.
[picture: happy meditating person with value table including “updating” actions in his/her head]
How does it provide increased control while dissolving control structures?
Well, what are you training when you meditate? Overtly, you are training the ability to keep attention fixed on one thing. This is kind of a weird thing to try to get attention to do. The whole point of attention is to help propagate updates as efficiently as possible. Holding attention on one thing is like asking a computer to load the same data repeatedly. It doesn’t accomplish any computation. Why do it?
[picture: same meditation with an empty-set symbol instead of face + value table]
Well, it isn’t quite a no-operation. Often, the meditative focus is on something which you try to observe in clarity and detail, like the sensations of the body. This can be useful for other reasons.
For the sake of my model, though, think of it as the ego structure trying to keep the attention algorithm in what amounts to a no-op, repeatedly requesting attention in a place where the information has already propagated.
[picture: value table with lots of arrows between the same pair of cells]
The reason this accomplishes anything is that the ego is not in complete control. Shadows dance beneath the surface.
[picture: same value table with a bunch of other arrows sketched in too]
The ego is a set of patterns of attention. It has “attachments”—obligatory mental gymnastics which it has to keep up as part of the power struggle. Indeed, you could say it is only a set of attachments.
In CFAR terms, an attachment is a trigger-action pattern.
Examples:
“Oh, I mustn’t think that way” (rehearsing a negative association to make sure a specific attention pattern stays squished)
Getting up & getting food to distract yourself when you feel sad
Rehearsing all the reasons you’ll definitely succeed whenever a failure though comes up
Meditation forces you to do nothing whenever these thoughts come up, because the only way to maintain attention at a high level is to develop what is called quanimity: any distracting thought is greeted and set aside in the same way, neither holding it up nor squishing it down. No rehearsal of why you must not think that way. No getting up to go to the fridge. No rehearsal of all the reasons why you will definitely succeed.
Constantly greeting distractions with equanimity and setting them aside fills the value table with zeros where attachments previously lived.
[picture: value table with a bunch of zeros in the left portion labelled “attentional acts”]
Note that these are not fake zeros. You are not rewriting your true values out of existence (though it may feel that way to the ego). You are merely experimenting with not responding to thoughts, and observing that nothing terrible happens.
Another way of thinking about this is un-training the halo effect (though I have not seen any experimental evidence supporting this interpretation). Normally, all entities of conscious experience are imbued with some degree of positive or negative feeling (according to cognitive bias research & experienced meditators alike), which we have flinch-like responses to (trigger-action patterns). Practicing non-response weakens the flinch, allowing more appropriate responses.
Putting zeros in the table can actually give the ego more control by eliminating some competition. However, in the long term, it destabilizes the power base.
You might think this makes sustained meditative practice impossible by removing the very motivational structures trying to meditate; and perhaps it sometimes works that way. Another possibility is that the ego is sublimated into a form which serves to sustain the meditative practice, the skills of mental focus / mindfulness which have been gained, and the practice of equanimity. This structure serves to ensure that propagation of value through the table remains unclogged by attachments in the future. Such a structure doesn’t need to play games to get credit for what it does, since it is actually useful.
Regardless, my advice is that you should absolutely not take this model as an invitation to try and dissolve your ego.
Perhaps take it as an invitation to develop better focus, and to practice equanimity in order to debias halo-effect related problems & make “ugh fields” slowly dissolve.
I have no particular indication that directly trying to “dissolve ego” is a safe or fruitful goal, however, and some reason to think that it is not. The indirect route to un-wireheading our cognitive strategies through a gently rising tide of sanity seems safest.
Speaking of the safety of the approach...
Why doesn’t “zeroing out” the value table destroy our values, again??
At sinceriously.fyi, Ziz talks about core vs structure.
Core is where your true values come from. However, core is not complex enough to interface with the world. Core must create structure to think and act on its behalf.
[picture: central golden circle with complex stuff radiating out from it]
“Structure” means habits of thinking and doing; models, procedures. Any structure is an approximation of how the values represented by the core play out in some arena of life.
So, in this model, all the various sub-agents in your mind arise from the core, as parts of the unfolding calculation of the policy maximizing the core’s values.
These can come into conflict only because they are approximations.
[picture: that car again, with core+radiating-stuff superimposed on it]
The model may sound strange at first, but it is a good description of what’s going on in the value-table model I described. (Or rather, the value-table model gives a concrete mechanism for the core/structure idea.)
The values in the table are approximations which drive an agent’s policy; a “structure” is a subset of the value table which acts as a coherent strategy in a subdomain.
Just removing this structure would be bad; but, it would not remove the core values which get propagated around the value table. Structure would re-emerge.
However, meditation does not truly remove any structure. It only weakens structure by practicing temporary disengagement with it. As I said before, meditation does not introduce any false training data; the normal learning mechanisms are updating on the simple observation of what happens when most of the usual structure is suppressed. This update creates an opportunity to do some “garbage collection” if certain structures prove unnecessary.
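The claim that structure would re-emerge from core can itself be illustrated with value iteration. In this sketch (mine, with made-up states and rewards), the "core" is the reward function and the "structure" is the derived value table: wiping the table does not touch the rewards, so re-running the same propagation rebuilds the same values.

```python
# Toy illustration: core = reward function, structure = derived values.
# Zeroing the value table leaves the rewards intact, so value iteration
# regenerates the same structure. States and numbers are illustrative.
reward = {"s0": 0.0, "s1": 0.0, "s2": 1.0}      # "core": where value comes from
next_state = {"s0": "s1", "s1": "s2", "s2": "s2"}
GAMMA = 0.9

def value_iteration(V, sweeps=50):
    for _ in range(sweeps):
        V = {s: reward[s] + GAMMA * V[next_state[s]] for s in V}
    return V

V = value_iteration({s: 0.0 for s in reward})    # structure, built up over time
V_wiped = value_iteration({s: 0.0 for s in V})   # table zeroed, rewards untouched
# The wiped table converges back to the same values: structure re-emerges.
```

The point of the sketch is only that the derived values are recoverable from the rewards; removing structure is disruptive but not the same as removing core.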
According to this model, all irrationality is coming from the approximation of value which is inherent in structure, and much of the irrationality there is coming from structures trying to grab credit via regulatory capture.
(“Regulatory capture” refers to getting undue favor from the government, often by spending money lobbying in order to get legislation which is favorable to you; it is like wireheading the government.)
The reflective value-table model predicts that it is easy to get this kind of irrationality; maybe too easy. For example, addictions can be modeled as a mistaken (but self-reinforcing) attention structure like “But if I think about the hangover I’ll have tomorrow, I won’t want to drink!”
So long as the pattern successfully blocks value propagation, it can stick.
(This should be compared with more well-studied models of such irrationality such as hyperbolic discounting.)
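For readers who want that comparison concrete: the standard hyperbolic form values a reward A at delay D as A / (1 + kD), and unlike exponential discounting it produces preference reversals as the sooner reward gets close, which is roughly the "sticking" behavior the model above predicts. A minimal sketch (parameters k and r are illustrative, not fitted to anything):

```python
# Hyperbolic vs. exponential discounting (standard textbook forms,
# separate from the essay's value-table model). Hyperbolic discounters
# reverse their preferences as a reward approaches, which looks like
# the self-reinforcing "drink now, regret later" pattern.
def hyperbolic(value, delay, k=1.0):
    return value / (1.0 + k * delay)

def exponential(value, delay, r=0.1):
    return value * (1.0 - r) ** delay

# Far in advance, the larger-later reward wins...
far_prefers_larger = hyperbolic(10, delay=10) > hyperbolic(6, delay=9)
# ...but just before the smaller-sooner reward, the preference flips.
near_prefers_smaller = hyperbolic(6, delay=0) > hyperbolic(10, delay=1)
# An exponential discounter never flips in this way.
exp_consistent = exponential(10, delay=1) > exponential(6, delay=0)
```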
Control of attention is a computationally difficult task, but the premise of Buddhist meditation (particularly Zen) is that you have more to unlearn than to learn. In the model I’m presenting here, that’s because of wireheading by attentional structure.
However, there is some skill which must be learned. I said earlier that one must learn equanimity. Let’s go into what that means.
The goal is to form a solid place on which to stand for the purpose of self-evaluation: an attentional structure from which you can judge your other attentional structures impartially.
[picture: wisdom-seeker on mountaintop nonplussed at being told by lotus-sitting master that all that’s in the way of seeing himself is himself and he should simply stand aside]
If you react to your own thoughts too judgementally, you will learn to hide them from yourself. Better to simply try to see them clearly, and trust the learning algorithms of the brain to react appropriately. Value iteration will propagate everything appropriately if attention remains unblocked.
According to some Buddhist teachings, suffering is pain which is not experienced fully; pain met with full mindfulness contains no suffering. This claim comes from experience. Why might it be true? What experience might lead someone to make it?
Another idea about suffering is that it results from dwelling on a way that reality differs from how you want it to be which you can’t do anything about.
Remember, I’m speaking from within my machine-learning model here, which I don’t think captures everything. In particular, I don’t think the two statements above capture everything important about suffering.
Within the model, though, both statements make sense. We could say that suffering results from a bad attention structure which claims it is still necessary to focus on a thing even though no value-of-information is being derived from it. The only way this can persist is if the attention structure is refusing to look at some aspects of the situation (perhaps because they are too painful), creating a block to value iteration properly scoring the attentional structure’s worth.
For example, it could be refusal to face the ways in which your brilliant plan to end world hunger will succeed or fail due to things beyond your control. You operate under a model which says that you can solve every potential problem by thinking about it, so you suffer when this is not the case.
From a rationalist perspective, this may at first sound like a good thing, like the attitude you want. But it ruins the value-of-information calculations, ignores opportunity costs, and stops you from knowing when to give up.
To act with equanimity is to be able to see a plan as having a 1% chance of success and see it as your best bet anyway, if best bet it is—and in that frame of mind, to be able to devote your whole being toward that plan; and yet, to be able to drop it in a moment if sufficient evidence accumulates in favor of another way.
So, equanimity is closely tied to your ability to keep your judgements of value and your judgements of probability straight.
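The "1% plan as best bet" attitude from two paragraphs back can be put in numbers. The plans and probabilities here are invented for illustration: expected value is just probability times payoff, so a long shot can dominate, and a sufficiently large update to the probability flips the decision without any drama about the values themselves.

```python
# Keeping value and probability straight: illustrative numbers only.
plans = {
    "moonshot": {"p_success": 0.01, "payoff": 10_000},
    "safe-bet": {"p_success": 0.95, "payoff": 50},
}

def expected_value(plan):
    return plan["p_success"] * plan["payoff"]

def best_bet(plans):
    return max(plans, key=lambda name: expected_value(plans[name]))

best_before = best_bet(plans)   # moonshot: EV 100 beats safe-bet: EV 47.5

plans["moonshot"]["p_success"] = 0.001   # strong new evidence arrives
best_after = best_bet(plans)    # EV drops to 10: drop the plan in a moment
```

Equanimity, on this reading, is what lets the probability update go through cleanly instead of being resisted to protect the old value estimate.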
Adopting more Buddhist terminology (perhaps somewhat abusively), we can call the opposite of equanimity “attachment”—to cling to certain value estimates (or certain beliefs) as if they were good in themselves.
To judge certain states of affairs unacceptable rather than make only relative judgements of better or worse: attachment! You rob yourself of the ability to make tradeoffs in difficult choices!
To cling to sunk costs: attachment! You rob your future for the sake of maintaining your past image of success!
To be unable to look at the possibility of failure and leave yourself a line of retreat: attachment! Attachment! Attachment!
To hunt down and destroy every shred of attachment in oneself—this, too, would be attachment. Unless our full self is already devoted to the task, this will teach some structure to hide itself.
Instead, equanimity must be learned gently, through nonjudgemental observation of one’s own mind, and trust that our native learning algorithm can find the right structure if we are just able to pay full attention.
(I say this not because no sect of Buddhism recommends the ruthless route—far from it—nor because I can derive the recommendation from my model; rather, this route seems least likely to lead to ill effects.)
So, at the five-second level, equanimity is just devoted attention to what is, free from immediate need to judge as positive or negative or to interpret within a pre-conceived story.
“Between stimulus and response there is a space. In that space is our power to choose our response. In our response lies our growth and our freedom.”—Viktor E. Frankl
There’s definitely a lot that is missing in this model, and incorrect. However, it does seem to get at something useful. Apply with care.
-- End --