This is a long and good post with a title and early framing advertising a shorter and better post that does not fully exist, but would be great if it did.
The actual post here is something more like “CFAR and the Quest to Change Core Beliefs While Staying Sane.”
The basic problem is that people by default have belief systems that allow them to operate normally in everyday life, and that protect them against weird beliefs and absurd actions, especially ones that would extract a lot of resources in ways that don’t clearly pay off. And they similarly protect those belief systems in order to protect that ability to operate in everyday life, and to protect their social relationships, and their ability to be happy and get out of bed and care about their friends and so on.
A bunch of these defenses are anti-epistemic, or can function that way in many contexts, and stand in the way of big changes in life (changing jobs, relationships, religions, friend groups, goals, and so on).
The hard problem CFAR is largely trying to solve in this telling, and that the Sequences try to solve as well, is to disable such systems enough to allow good things without also allowing bad things, or to find ways to cope with the subsequent bad things slash disruptions. When you free people to be shaken out of their default systems, they tend to go to various extremes that are unhealthy for them, like optimizing narrowly for one goal instead of many goals, or having trouble spending resources (including time) on themselves at all or being in the moment and living life. And That’s Terrible, because it doesn’t actually lead to better larger outcomes, in addition to making those people worse off themselves.
These are good things that need to be discussed more, but the title and introduction promise something I find even more interesting.
In that taxonomy, the key difference is that there are games one can play, things one can be optimizing for or responding to, incentives one can create, that lead to building more effective tools for modeling and understanding reality, and then changing it. One can cultivate an aesthetic sense that these are good, healthy, virtuous, wholesome, etc. Interacting with these systems is ‘good for you,’ and more people being in such modes more often leads to more good things, broadly construed (if I were writing a post I’d avoid using such loaded language, since it’s not useful, but it’s faster as a way to gesture at the thing).
Then there are reality-masking puzzles, where instead of creating better maps of the territory and enabling us to master the world, we learn to obscure our own maps of the world, obscure the maps of others, fool ourselves first in order to then fool others, and otherwise learn to perform symbolic actions and social manipulations to gain advantage or cause actions.
This is related to simulacra (level 1 puzzles versus level 2-4 puzzles), and it is related to moral mazes (if you start a small business buying and selling things you are reality-revealing, whereas if you are navigating corporate politics you are reality-masking, etc.). A key skill is knowing how to tell which is which, and how to chart paths through problem spaces that shift problems of one type into the other (e.g. finding ways to do reality-revealing marketing/sales/public-relations/politics/testing/teaching/etc to the extent possible). In particular, the question is: Are you causing optimization towards learning and figuring out how reality functions, or are you causing optimization towards faking that you understand or agree or are smart/agreeable/conscientious/willing-to-falsify? Are you optimizing for making things explicit, or for making things implicit? Etc.
So I’d love to see a post by Anna, or otherwise, that is entitled “Reality-Revealing and Reality-Masking Puzzles, No Really This Time” that takes this out of the CFAR/AI context entirely. But this still has a lot going on that’s good and seems well over the threshold for inclusion in such a collection.