I want to share my model of intelligence and research. You won’t agree with it at first glance. Or at the third. (My hope is that you will just give up and agree at the 20th.)
But that’s supposed to be good: it means the model is original and brave enough to make risky statements.
In this model, any difference in “intelligence levels”, or any difference between two minds in general, boils down to “commitment level”.
What is “commitment”?
On some level, “commitment” is just a word. You don’t need it to define the ideas I’m going to talk about. What matters much more is the three levels of commitment. Across many topics, three levels show up which follow the same pattern, the same outline:
Level 1. You explore a single possibility.
Level 2. You want to explore all possibilities. But you are paralyzed by the number of possibilities. At this level you are interested in qualities of possibilities. You classify possibilities and types of possibilities.
Level 3. You explore all possibilities through a single possibility. At this level you are interested in dynamics of moving through the possibility space. You classify implications of possibilities.
...
I’m going to give specific examples of the pattern above. This post is kind of repetitive, but it wasn’t AI-generated, I swear. Repetition is a part of commitment.
Why is commitment important?
My explanation won’t be clear before you read the post, but here it goes:
Commitment describes your values and the “level” of your intentionality.
Commitment describes your level of intelligence (in a particular topic). Compared to yourself (your potential) or other people.
Commitments are needed for communication. Without shared commitments it’s impossible for two people to find common ground.
Commitment describes the “true content” of an argument, an idea, a philosophy. Ultimately, any property of a mind boils down to “commitments”.
Basics
1. Commitment to exploration
I think there are three levels of commitment to exploration.
Level 1. You treat things as immediate means to an end.
Imagine two enemy cavemen teleported into a laboratory. They try to use whatever they find to beat each other, without studying or exploring what they’re using. So, they are just throwing microscopes and beakers at each other. They throw anti-matter guns at each other without even activating them.
Level 2. You explore things for the sake of it.
Think about mathematicians. They can explore math without any goal.
Level 3. You use particular goals to guide your exploration of things. Even though you would care about exploring them without any goal anyway. The exploration space is just too large, so you use particular goals to narrow it down.
Imagine a physicist who explores mathematics by considering imaginary universes and applying physical intuition to discover deep mathematical facts. Such a person uses a particular goal/bias to guide “pure exploration”. (Inspired by Edward Witten; see Michael Atiyah’s quote.)
More examples
In terms of exploring ideas, our culture is at level 1 (angry caveman). We understand ideas only as “ideas of getting something (immediately)” or “ideas of proving something (immediately)”. We are not interested in exploring ideas for their own sake. The only metrics we apply to ideas are “(immediate) usefulness” and “trueness”. Not “beauty”, “originality” and “importance”. People in general are at level 1. Philosophers are at level 1 or “1.5”. The rationality community is at level 1 too (sadly): rationalists still mostly care only about immediate usefulness and truth.
In terms of exploring argumentation and reasoning, our culture is at level 1. If you never thought “stupid arguments don’t exist”, then you are at level 1: you haven’t explored arguments and reasoning for the sake of it; you immediately jumped to assuming “The Only True Way To Reason” (be it your intuition, the scientific method, a particular ideology or Bayesian epistemology). You haven’t stepped outside of your perspective a single time. Almost everyone is at level 1. Eliezer Yudkowsky is at level 3, but in a much narrower field: Yudkowsky explored rationality with the specific goal/bias of AI safety. However, overall Eliezer is at level 1 too: he never studied human reasoning outside of what he thinks is “correct”.
I think this is kind of bad. We are at level 1 in the main departments of human intelligence and human culture. Two levels below our true potential.
2. Commitment to goals
I think there are three levels of commitment to goals.
Level 1. You have a specific selfish goal.
“I want to get a lot of money” or “I want to save my friends” or “I want to make a ton of paperclips”, for example.
Level 2. You have an abstract goal. But this goal doesn’t imply much interaction with the real world.
“I want to maximize everyone’s happiness” or “I want to prevent (X) disaster”, for example. This is a broad goal, but it doesn’t imply actually learning and caring about anyone’s desires (until the very end). Rationalists are at this level of commitment.
Level 3. You use particular goals to guide your abstract goals.
Some political activists are at this level of commitment. (But please, don’t bring CW topics here!)
3. Commitment to updating
“Commitment to updating” is the ability to re-start your exploration from square one. I think there are three levels to it.
Level 1. No updating. You never change ideas.
You just keep piling up your ideas into a single paradigm your entire life. You turn beautiful ideas into ugly ones so they fit with all your previous ideas.
Level 2. Updating. You change ideas.
When you encounter a new beautiful idea, you are ready to reformulate your previous knowledge around the new idea.
Level 3. Updating with “check points”. You change ideas, but you use old ideas to prime new ones.
When you explore an idea, you mark some “check points” which you reached with that idea. When you ditch the idea for a new one, you still keep in mind the check points you marked. And use them to explore the new idea faster.
Science
4.1 Commitment and theory-building
I think there are three levels of commitment in theory-building.
Level 1.
You build your theory using only “almost facts”. I.e. you come up with “trivial” theories which are almost indistinguishable from the things we already know.
Level 2.
You build your theory on speculations. You “fantasize” important properties of your idea (which are important only to you or your field).
Level 3.
You build your theory on speculations. But those speculations are important even outside of your field.
I think Eliezer Yudkowsky and LW did theory-building of the 3rd level. A bunch of LW ideas are philosophically important even if you disagree with Bayesian epistemology (Eliezer’s view on ethics and math, logical decision theories and some Alignment concepts).
4.2 Commitment to explaining a phenomenon
I think there are three levels of commitment in explaining a phenomenon.
Level 1.
You just want to predict the phenomenon. But many, many possible theories can predict the phenomenon, so you need something more.
Level 2.
You compare the phenomenon to other phenomena and focus on its qualities.
That’s where most theories go wrong: people become obsessed with their own fantasies about qualities of a phenomenon.
Level 3.
You focus on dynamics which connect this phenomenon to other phenomena. You focus on overlapping implications of different phenomena. 3rd level is needed for any important scientific breakthrough. For example:
Imagine you want to explain combustion (why/how things burn). On one hand, you already “know everything” about the phenomenon, so what do you even do? Level 1 doesn’t work. So, you try to think about qualities of burning, types of transformations, types of movement… but that won’t take you anywhere. Level 2 doesn’t work either. The right answer: you need to think not about qualities of transformations and movements, but about dynamics (conservation of mass, kinetic theory of gases) which connect different types of transformations and movements. Level 3 works.
Epistemology pt. 1
5. Commitment and epistemology
I think there are three levels of commitment in epistemology.
Level 1. You assume the primary reality of the physical world. (Physicism)
Take statements “2 + 2 = 4” and “God exists”. To judge those statements, a physicist is going to ask “Do those statements describe reality in a literal way? If yes, they are true.”
Level 2. You assume the primary reality of statements of some fundamental language. (Descriptivism)
To judge statements, a descriptivist is going to ask “Can those statements be expressed in the fundamental language? If yes, they are true.”
Level 3. You assume the primary reality of semantic connections between statements of languages. And the primary reality of some black boxes which create those connections. (Connectivism) You assume that something physical shapes the “language reality”.
To judge statements, a connectivist is going to ask “Do those statements describe an important semantic connection? If yes, they are true.”
...
Recap. Physicist: everything “physical” exists. Descriptivist: everything describable exists. Connectivist: everything important exists. Physicist can be too specific and descriptivist can be too generous. (This pattern of being “too specific” or “too generous” repeats for all commitment types.)
Thinking at the level of semantic connections should be natural to people (because they use natural language and… neural nets in their brains!). And yet this idea is extremely alien to people epistemology-wise.
Implications for rationality
In general, rationalists are “confused” between level 1 and level 2. I.e. they often treat level 2 very seriously, but aren’t fully committed to it.
Eliezer Yudkowsky is “confused” between level 1 and level 3. I.e. Eliezer has a lot of “level 3 ideas”, but doesn’t apply level 3 thinking to epistemology in general.
On one hand, Eliezer believes that “map is not the territory”. (level 1 idea)
On the other hand, Eliezer believes that math is an “objective” language shaped by the physical reality. (level 3 idea)
Similarly, Eliezer believes that human ethics are defined by some important “objective” semantic connections (which can evolve, but only to a degree). (level 3)
“Logical decision theories” treat logic as something created by connections between black boxes. (level 3)
When you do Security Mindset, you should make not only “correct”, but beautiful maps. Societal properties of your map matter more than your opinions. (level 3)
So, Eliezer has a bunch of ideas which can be interpreted as “some maps ARE the territory”.
6. Commitment and uncertainty
I think there are three levels of commitment in doubting one’s own reasoning.
Level 1.
You’re uncertain about superficial “correctness” of your reasoning. You worry that you missed a particular counterargument. Example: “I think humans are dumb. But maybe I missed a smart human or applied a wrong test?”
Level 2.
You un-systematically doubt your assumptions and definitions. Maybe even your inference rules a little bit (see “inference objection”). Example: “I think humans are dumb. But what is a “human”? What is “dumb”? What is “is”? And how can I be sure of anything at all?”
Level 3.
You doubt the semantic connections (e.g. inference rules) in your reasoning. You consider particular dynamics created by your definitions and assumptions. “My definitions and assumptions create this dynamic (not present in all people). Can this dynamic exploit me?”
Example: “I think humans are dumb. But can my definition of “intelligence” exploit me? Can my pessimism exploit me? Can this be an inconvenient way to think about the world? Can my opinion turn me into a fool even if I’m de facto correct?”
...
Level 3 is like “security mindset” applied to your own reasoning. LW rationality mostly teaches against it, suggesting that you always take your smallest opinions at face value as “the truest thing you know”. With some exceptions, such as “ethical injunctions”, “radical honesty”, “black swan bets” and “security mindset”.
Epistemology pt. 2
7. Commitment to understanding/empathy
I think there are three levels of commitment in understanding your opponent.
Level 1.
You can pass the Ideological Turing Test in a superficial way (you understand the structure of the opponent’s opinion).
Level 2. “Telepathy”.
You can “inhabit” the emotions/mindset of your opponent.
Level 3.
You can describe the opponent’s position as a weaker version/copy of your own position. And additionally you can clearly imagine how your position could turn out to be “the weaker version/copy” of the opponent’s position. You find a balance between telepathy and “my opinion is the only one which makes sense!”
8. Commitment to “resolving” problems
I think there are three levels of commitment in “resolving” problems.
Level 1.
You treat a problem as a puzzle to be solved by Your Favorite True Epistemology.
Level 2.
You treat a problem as a multi-layered puzzle which should be solved on different levels.
Level 3.
You don’t treat a problem as a self-contained puzzle. You treat it as a “symbol” in the multitude of important languages. You can solve it by changing its meaning (by changing/exploring the languages).
Take the Unexpected Hanging paradox, for example. I don’t treat this paradox as a chess puzzle: I don’t think it’s something that could be solved or even “made sense of” from the inside. You need outside context. Like, does it ask you to survive? Then you can simply expect the hanging every day and be safe. (Though—can you do this to your psychology?) Or does the paradox ask you to come up with formal reasoning rules to solve it? But you can make any absurd reasoning system—to make a meaningful system you need to answer “for what purposes is this system going to be needed, except this paradox?”. So, I think that “from the inside” there’s no ground truth (though it can exist “from the outside”). Without context there are a lot of simple but absurd or trivial solutions, like “ignore logic, think directly about outcomes” or “come up with some BS reasoning system”. Or say “Solomonoff induction solves all paradoxes: even if it doesn’t, it’s the best possible predictor of reality, so just ignore philosophers, lol”.
Alignment pt. 1
9.1 Commitment to morality
I think there are three levels of commitment in morality.
Level 1. Norms, desires.
You analyze norms of specific communities and desires of specific people. That’s quite easy: you are just learning facts.
Level 2. Ethics and meta-ethics.
You analyze similarities between different norms and desires. You get to pretty abstract and complicated values such as “having agency, autonomy, freedom; having an interesting life; having an ability to form connections with other people”. You are lost in contradictory implications, interpretations and generalizations of those values. You have a (meta-)ethical paralysis.
Level 3. “Abstract norms”.
You analyze similarities between implications of different norms and desires. You analyze dynamics created by specific norms. You realize that the most complicated values are easily derivable from the implications of the simplest norms. (Not without some bias, of course, but still.)
I think moral philosophers and Alignment researchers are seriously dropping the ball by ignoring the 3rd level. Acknowledging the 3rd level doesn’t immediately solve Alignment, but it can pretty much “solve” ethics (with a bit of effort).
9.2 Commitment to values
I think there are three levels of values.
Level 1. Inside values (“feeling good”).
You care only about things inside of your mind. For example, do you feel good or not?
Level 2. Real values.
You care about things in the real world. Even though you can’t care about them directly. But you make decisions to not delude yourself and not “simulate” your values.
Level 3. Semantic values.
You care about elements of some real system. And you care about proper dynamics of this system. For example, you care about things your friend cares about. But it’s also important to you that your friend is not brainwashed, not controlled by you. And you are ready that one day your friend may stop caring about anything. (Your value may “die” a natural death.)
3rd level is the level of “semantic values”. They are not “terminal values” in the usual sense. They can be temporal and history-dependent.
9.3 Commitment and research interest
So, you’re interested in ways in which an AI can go wrong. What specifically can you be interested in? I think there are three levels to it.
Level 1. In what ways some AI actions are bad?
You classify AI bugs into types. For example, you find “reward hacking” type of bugs.
Level 2. What qualities of AIs are good/bad?
You classify types of bugs into “qualities”. You find such potentially bad qualities as “AI doesn’t care about the real world” and “AI doesn’t let itself be fixed” (lack of corrigibility).
Level 3. What bad dynamics are created by bad actions of AI? What good dynamics are destroyed?
Assume AI turned humanity into paperclips. What’s actually bad about that, beyond the very first obvious answer? What good dynamics did this action destroy? (Some answers: it destroyed the feedback loop, the connection between the task and its causal origin (humanity), the value of paperclips relative to other values, the “economical” value of paperclips, the ability of paperclips to change their value.)
On the 3rd level you classify different dynamics. I think people completely ignore the 3rd level. In both Alignment and moral philosophy. 3rd level is the level of “semantic values”.
Alignment pt. 2
10. Commitment to Security Mindset
I think Security Mindset has three levels of commitment.
Level 1. Ordinary paranoia.
You have great imagination, you can imagine very creative attacks on your system. You patch those angles of attack.
Level 2. Security Mindset.
You study your own reasoning about the safety of the system. You check if your assumptions are right or wrong. Then, you try to delete as many assumptions as you can, even if they seem correct to you! You also delete anomalies of the system even if they seem harmless. You try to simplify your reasoning about the system, seemingly “for the sake of it”.
Level 3.
You design a system which would be safe even in a world with changing laws of physics and mathematics. Using some bias, of course (otherwise it’s impossible).
Humans, or at least idealized humans, are “level 3 safe”. All or almost all current approaches to Alignment don’t give you a “level 3 safe” AI.
11. Commitment to Alignment
I think there are three levels of commitment a (mis)aligned AI can have. Alternatively, those are the levels at which you can try to solve the Alignment problem.
Level 1.
AI has a fixed goal or a fixed method of finding a goal (which likely can’t be Aligned with humanity). It respects only its own agency. So, ultimately it does everything it wants.
Level 2.
AI knows that different ethics are possible and is completely uncertain about ethics. AI respects only other people’s agency. So, it doesn’t do anything at all (except preventing, a bit lazily, 100% certain destruction and oppression). Or it requires infinite permission:
Am I allowed to calculate “2 + 2”?
Am I allowed to calculate “2 + 2” even if it leads to a slight change of the world?
Am I allowed to calculate “2 + 2” even if it leads to a slight change of the world which you can’t fully comprehend even if I explain it to you?
...
Wait, am I allowed to ask those questions? I’m already manipulating you by boring you to death. I can’t even say anything.
Level 3.
AI can respect both its own agency and the agency of humanity. AI finds a way to treat its agency as the continuation of the agency of people. AI makes sure it doesn’t create any dynamic which couldn’t be reversed by people (unless there’s nothing else to do). So, AI can both act and be sensitive to people.
Implications for Alignment research
I think a fully safe system exists only on level 3. The safest system is the one which understands what “exploitation” means, so it never willingly exploits its rewards in any way. Humans are an example of such a system.
I think alignment researchers are “confused” between level 1 and level 3. They try to fix different “exploitation methods” (ways AI could exploit its rewards) instead of making the AI understand what “exploitation” means.
I also think this is the reason why alignment researchers don’t cooperate much, pushing in different directions.
Perception
12. Commitment to properties
Commitments exist even on the level of perception. There are three levels of properties to which your perception can react.
Level 1. Inherent properties.
You treat objects as having more or less inherent properties. “This person is inherently smart.”
Level 2. Meta-properties.
You treat any property as universal. “Anyone is smart under some definition of smartness.”
Level 3. Semantic properties.
You treat properties only as relatively attached to objects: different objects form a system (a “language”) where properties get distributed between them and differentiated. “Everyone is smart, but in a unique way. And those unique ways are important in the system.”
13.1 Commitment to experiences and knowledge
I think there are three levels of commitment to experiences.
Level 1.
You’re interested in particular experiences.
Level 2.
You want to explore all possible experiences.
Level 3.
You’re interested in real objects which produce your experiences (e.g. your friends): you’re interested in what knowledge “all possible experiences” could reveal about them. You want to know where physical/mathematical facts and experiences overlap.
13.2 Commitment to experience and morality
I think there are three levels of investigating the connection between experience and morality.
Level 1.
You study how experience causes us to do good or bad things.
Level 2.
You study all the different experiences “goodness” and “badness” causes in us.
Level 3.
You study dynamics created by experiences, which are related to morality. You study implications of experiences. For example: “loving a sentient being feels fundamentally different from eating a sandwich. food taste is something short and intense, but love can be eternal and calm. this difference helps to not treat other sentient beings as something disposable”
I think the existence of the 3rd level isn’t acknowledged much. And yet it could be important for alignment. Most versions of moral sentimentalism are 2nd level at best. Epistemic Sentimentalism can be 3rd level.
Final part
Specific commitments
You can ponder your commitment to specific things.
Are you committed to information?
Imagine you could learn anything (and forget it if you want). Would you be interested in learning different stuff more or less equally? You could learn something important (e.g. the most useful or the most abstract math), but you also could learn something completely useless—such as the life story of every ant who ever lived.
I know, this question is hard to make sense of: of course, anyone would like to learn everything/almost everything if there was no downside to it. But if you have a positive/negative commitment about the topic, then my question should make some sense anyway.
Are you committed to people?
Imagine you got extra two years to just talk to people. To usual people on the street or usual people on the Internet.
Would you be bored hanging out with them?
My answers: Maybe I was committed to information in general as a kid. Then I became committed to information related to people, produced by people, known by people.
My inspiration for writing this post
I encountered a bunch of people who are more committed to exploring ideas (and taking ideas seriously) than usual. More committed than most rationalists, for example.
But I felt those people lack something:
They are able to explore ideas, but don’t care about that anymore. They care only about their own clusters of idiosyncratic ideas.
They have very vague goals which are compatible with any specific actions.
They don’t care if their ideas could even in principle matter to people. They have “disconnected” from other people, from other people’s context (through some level of elitism).
When they acknowledge you as “one of them”, they don’t try to learn your ideas, share their ideas, argue with you, or ask for your help in solving a problem.
So, their commitment remains very low. And they are not “committed” to talking.
Conclusion
If you have a high level of commitment (3rd level) at least to something, then we should find a common language. You may even be like a sibling to me.
Thank you for reading this post. 🗿
Cognition
14.1 Studying patterns
I think there are three levels of commitment to patterns.
You study particular patterns.
You study all possible patterns: you study qualities of patterns.
You study implications of patterns. You study dynamics of patterns: how patterns get updated or destroyed when you learn new information.
14.2 Patterns and causality
I think there are three levels in the relationship between patterns and causality. I’m going to give examples about visual patterns.
Level 1.
You learn which patterns are impossible due to local causal processes.
For example: “I’m unlikely to see a big tower made of eggs standing on top of each other”. It’s just not a stable situation due to very familiar laws of physics.
Level 2.
You learn statistical patterns (correlations) which can have almost nothing to do with causality.
For example: “people like to wear grey shirts”.
Level 3.
You learn patterns which have a strong connection to other patterns and basic properties of images. You could say such patterns are created/prevented by “global” causal processes.
For example: “I’m unlikely to see a place fully filled with dogs. dogs are not people or birds or insects, they don’t create such crowds”. This is very abstract, connects to other patterns and basic properties of images.
Implications for Machine Learning
I think...
It’s likely that Machine Learning models don’t learn level 3 patterns as well as they could, as sharply as they could.
Machine Learning models should be 100% able to learn level 3 patterns. It shouldn’t require any specific data.
Learning/comparing level 3 patterns is interesting enough on its own. It could be its own area of research. But we don’t apply statistics/Machine Learning to try to mine those patterns. This may be a missed opportunity for humans.
I think researchers are making a blunder by not asking “what kinds of patterns exist? what patterns can be learned in principle?” (I’m not talking about the universal approximation theorem.)
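To make the distinction concrete, here’s a toy sketch (my own construction, not from any existing ML literature): a dataset where every level 2 statistic looks weak, while a level 3 “global” constraint holds absolutely. All names and numbers are invented for illustration.

```python
# Toy contrast between a level 2 pattern (pairwise correlations) and a level 3
# pattern (a global constraint on the whole "image"). My own construction.
import numpy as np

rng = np.random.default_rng(0)

def sample_image(size=64, max_count=20):
    # A random binary "image" obeying a global cap: at most `max_count` pixels on.
    # (Think: "no image is more than a third dogs".)
    img = (rng.random(size) < 0.5).astype(int)
    while img.sum() > max_count:  # enforce the global constraint
        img[rng.choice(np.flatnonzero(img))] = 0
    return img

data = np.array([sample_image() for _ in range(2000)])

# Level 2 view: each pairwise pixel correlation is individually tiny...
corr = np.corrcoef(data.T)
print("mean |pairwise correlation|:", np.abs(corr[np.triu_indices(64, k=1)]).mean())

# Level 3 view: ...but the global statistic "total count" has a hard ceiling.
print("max total count over the dataset:", data.sum(axis=1).max())  # <= 20, always
```

A model that only tracks pairwise statistics can badly underrate the hard global ceiling; that’s the kind of “sharpness” the claim above is about.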
15. Cognitive processes
Suppose you want to study different cognitive processes, skills, types of knowledge. There are three levels:
You study particular cognitive processes.
You study qualities of cognitive processes.
You study dynamics created by cognitive processes. How “actions” of different cognitive processes overlap.
I think you can describe different cognitive processes in terms of patterns they learn. For example:
Causal reasoning learns abstract configurations of abstract objects in the real world. So you can learn stuff like “this abstract rule applies to most objects in the world”.
Symbolic reasoning learns abstract configurations of abstract objects in your “concept space”. So you can learn stuff like “‘concept A contains concept B’ is an important pattern”.
Correlational reasoning learns specific configurations of specific objects.
Mathematical reasoning learns specific configurations of abstract objects. So you can build arbitrary structures with abstract building blocks.
Self-aware reasoning can transform abstract objects into specific objects. So you can think thoughts like, for example, “maybe I’m just a random person with random opinions”.
I think all of this could be formalized easily enough.
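As a gesture toward that formalization, here’s a minimal sketch (the field names and labels are mine, not an established scheme):

```python
# Tag each reasoning type by how specific its configurations and its objects
# are, and by the domain it operates over. A toy schema, nothing more.
from dataclasses import dataclass

@dataclass(frozen=True)
class ReasoningType:
    name: str
    configurations: str  # "specific" or "abstract"
    objects: str         # "specific" or "abstract"
    domain: str          # "real world" or "concept space"

PROCESSES = [
    ReasoningType("causal",        "abstract", "abstract", "real world"),
    ReasoningType("symbolic",      "abstract", "abstract", "concept space"),
    ReasoningType("correlational", "specific", "specific", "real world"),
    ReasoningType("mathematical",  "specific", "abstract", "concept space"),
]

# "Self-aware" reasoning would then be an operation rather than a row:
# it turns abstract objects into specific ones.
for p in PROCESSES:
    print(f"{p.name}: {p.configurations} configurations of {p.objects} objects ({p.domain})")
```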
Meta-level
Can you be committed to exploring commitment?
I think yes.
One thing you can do is to split topics into sub-topics and raise your commitment in every particular sub-topic. Vaguely similar to gradient descent. That’s what I’ve been doing in this post so far.
Another thing you can do is to apply recursion. You can split any topic into 3 levels of commitment. But then you can split the third level into 3 levels too. So, there’s potentially an infinity of levels of commitment. And there can be many particular techniques for exploiting this fact.
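Taken literally, the recursion looks like this (a toy sketch; the dotted labels are my own notation):

```python
# Every level 3 splits into three sub-levels again, so the tree of commitment
# levels is unbounded in depth.
def levels(depth: int, prefix: str = "3") -> list[str]:
    if depth == 0:
        return [prefix]
    out = []
    for sub in ("1", "2", "3"):
        path = f"{prefix}.{sub}"
        out += levels(depth - 1, path) if sub == "3" else [path]
    return out

print(levels(2))  # ['3.1', '3.2', '3.3.1', '3.3.2', '3.3.3']
```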
But the main thing is the three levels of “exploring ways to explore commitment”:
You study particular ways to raise commitment.
You study all possible ways to raise commitment.
You study all possible ways through a particular way. You study dynamics and implications which the ways create.
I don’t have enough information or experience for the 3rd level right now.
*A more “formal” version of the draft (it’s a work in progress):*
There are two interpretations of this post, weak and strong.
Weak interpretation:
I describe a framework about “three levels of exploration”. I use the framework to introduce some of my ideas. I hope that the framework will give more context to my ideas, making them more understandable. I simply want to find people who are interested in exploring ideas. Exploring just for the sake of exploring, or for a specific goal.
Strong interpretation:
I use the framework as a model of intelligence. I claim that any property of intelligence boils down to the “three levels of exploration”. Any talent, any skill. The model is supposed to be “self-evident” because of its simplicity, it’s not based on direct analysis of famous smart people.
Take the strong interpretation with a lot of grains of salt, of course, because I’m not an established thinker and I haven’t achieved anything intellectual. I just thought “hey, this is a funny little simple idea, what if all intelligence works like this?”, that’s all.
That said, I’ll need to make a couple of extraordinary claims “from inside the framework” (i.e. assuming it’s 100% correct and 100% useful). Just because that’s in the spirit of the idea. Just because it allows me to explore the idea to its logical conclusion. Definitely not because I’m a crazy man. You can treat the most outlandish claims as sci-fi ideas.
A formula of thinking?
Can you “reduce” thinking to a single formula? (Sounds like cringe and crackpottery!)
Can you show a single path of the best and fastest thinking?
Well, there’s an entire class of ideas which attempt to do this in different fields.
My idea is just another attempt at reduction. You don’t have to treat such attempts 100% seriously in order to find value in them. You don’t have to agree with them.
Three levels of exploration
Let’s introduce my framework.
In any topic, there are three levels of exploration:
You study a single X.
You study types of different X. Often I call those types “qualities” of X.
You study types of changes (D): in what ways different X change/get changed by a new thing Y. Y and D need to be important even outside of the (main) context of X.
The point is that at the 2nd level you study similarities between different X directly, but at the 3rd level you study similarities indirectly through new concepts Y and D. The letter “D” means “dynamics”.
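For readers who like types, here’s a rough sketch of the three levels (my own encoding; X, Y, D follow the notation above):

```python
# Level 1 holds a single X; level 2 surveys X directly; level 3 studies X
# indirectly, through changes (D) driven by a new thing (Y).
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

X = TypeVar("X")  # the thing being studied
Y = TypeVar("Y")  # a new thing, important even outside the context of X
D = TypeVar("D")  # a type of change, also important outside the context of X

@dataclass
class Level1(Generic[X]):
    instance: X  # you study a single X

@dataclass
class Level2(Generic[X]):
    examples: list[X]     # you survey many X...
    qualities: list[str]  # ...and classify their types/qualities directly

@dataclass
class Level3(Generic[X, Y, D]):
    dynamic: Callable[[X, Y], D]  # you study how Y changes X: the D's
```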
I claim that any property of intelligence can be boiled down to your “exploration level”. Any talent, any skill, and even more vague things such as “level of intentionality”. I claim that the best and the most plausible ideas come from the 3rd level. The 3rd level defines the absolute limit of currently conceivable ideas. So, it also indirectly defines the limit of possible/conceivable properties of reality.
You don’t need to trust those extraordinary claims. If the 3rd level simply sounds interesting enough to you and you’re ready to explore it, that’s good enough.
Three levels simplified
A vague description of the three levels:
You study objects.
You study qualities of objects.
You study changes of objects.
Or:
You study a particular thing.
You study everything.
You study abstract ways (D) in which the thing is changed by “everything”.
Or:
You study a particular thing.
You study everything.
You study everything through a particular thing.
So yeah, it’s a Hegelian dialectic rip-off. Down below are examples of applying my framework to different topics. You don’t need to read them all, of course.
Exploring debates
1. Argumentation
I think there are three levels of exploring arguments:
You judge arguments as right or wrong. Smart or stupid.
You study types of arguments. Without judgement.
You study types of changes (D): how arguments change/get changed by some new thing Y. (“dynamics” of arguments)
If you want to get a real insight into argumentation, you need to study how (D) arguments change/get changed by some new thing Y. D and Y need to be important even outside of the context of explicit argumentation.
For example, Y can be “concepts”. And D can be “connecting/separating” (a fundamental process which is important in a ton of contexts). You can study in what ways arguments connect and separate concepts.
A simplified political example: a capitalist can tend to separate concepts (“bad things are caused by mistakes and bad actors”), while a socialist can tend to connect concepts (“bad things are caused by systemic problems”). Conflict Vs. Mistake^(1) is just a very particular version of this dynamic. Different manipulations with concepts create different arguments and different points of view. You can study all such dynamics. You can trace arguments back to fundamental concept manipulations. It’s such a basic idea, and yet nobody has done it for arguments in general. (Aristotle did it 2400 years ago, but for formal logic.)
^(1. I don’t agree with Scott Alexander, by the way.)
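A toy rendering of the example above (my own construction): treat a worldview as a graph of concepts, and an argument as an operation that connects or separates two of them.

```python
# An argument either connects two concepts or separates them; a worldview is
# the resulting graph of links.
import itertools

class Worldview:
    def __init__(self):
        self.links: set[frozenset] = set()

    def connect(self, a: str, b: str):   # "these problems share a systemic cause"
        self.links.add(frozenset((a, b)))

    def separate(self, a: str, b: str):  # "these are unrelated mistakes"
        self.links.discard(frozenset((a, b)))

connecting, separating = Worldview(), Worldview()
for a, b in itertools.combinations(["poverty", "crime", "pollution"], 2):
    connecting.connect(a, b)  # the "connecting" tendency from the example above
print(len(connecting.links), len(separating.links))  # 3 0
```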
Arguments: conclusion
I think most of us are at level 1 in argumentation: we throw arguments at each other like angry cavemen, without studying what an “argument” is and/or what dynamics it creates. If you completely unironically think that “stupid arguments” exist, then you’re probably on the 1st level. Professional philosophers are at level 2 at best, but usually lower (they are surprisingly judgemental). At least they are somewhat forced to be tolerant of the most diverse types of arguments due to their profession.
On what level are you? Have you studied arguments without judgement?
2. Understanding/empathy
I think there are three levels in understanding your opponent:
You study a specific description (X) of your opponent’s opinion. You can pass the Ideological Turing Test in a superficial way. Like a parrot.
You study types of descriptions of your opponent’s opinion. (“Qualities” of your opponent’s opinion.) You can “inhabit” the emotions/mindset of your opponent.
You study types of changes (D): how the description of your opponent’s opinion changes/gets changed by some new thing Y. D and Y need to be important even outside of debates.
For example, Y can be “copies of the same thing” and D can be “transformations of copies into each other”. Such Y and D are important even outside of debates.
So, on the 3rd level you may be able to describe the opponent’s position as a weaker version/copy of your own position (Y) and clearly imagine how your position could turn out to be “the weaker version/copy” of the opponent’s views. You can imagine how the opponent’s opinion transforms into a truth and your opinion transforms into a falsehood (D).
Other interesting choices of Y and D are possible. For example, Y can be “complexity of the opinion [in a given context]”; D can be “choice of the context” and “increasing/decreasing of complexity”. You can run the opinion of your opponent through different contexts and see how much it reacts to/accommodates the complexity of the world.
Empathy: conclusion
I think people very rarely do the 3rd level of empathy.
Doing it systematically would lead to a new political/epistemological paradigm.
Exploring philosophy
1. Beliefs and ontology
I think there are three levels of studying the connection between beliefs and ontology:
You think you can see the truth of a belief directly. For example, you can say “all beliefs which describe reality in a literal way are true”. You get stuff like Naïve Realism. “Reality is real.”
You study types of beliefs and their qualities: the different ways beliefs can relate to reality.
You study types of changes (D): how beliefs change/get changed by some new thing Y.
What can D and Y be? Both things need to be important even outside of the context of explicit beliefs. A couple of versions:
Y can be “semantic connections”. D can be “connecting/separating [semantic connections]”. Both things are generally important, for example in linguistics, in studying semantic change. We get Berkeley’s idealism.
Y can be “probability mass” or some abstract “weight”. D can be “distribution of the mass/weight”. We get probabilism/Bayesianism.
Thinking at the level of semantic connections should be natural to people, because they use natural language and… neural nets in their brains! (Berkeley makes a similar argument: “hey, folks, this is just common sense!”) And yet this idea is extremely alien to people epistemology-wise and ontology-wise. I think the true potential of the 3rd level remains unexplored.
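As a minimal illustration of the probability-mass version (standard Bayes, toy numbers of my choosing): evidence doesn’t make a statement “true”, it redistributes mass between statements.

```python
# Belief as "distribution of probability mass": new evidence shifts the mass.
priors = {"hypothesis A": 0.5, "hypothesis B": 0.5}
likelihood = {"hypothesis A": 0.9, "hypothesis B": 0.3}  # P(evidence | h), made up

unnormalized = {h: priors[h] * likelihood[h] for h in priors}
total = sum(unnormalized.values())
posteriors = {h: p / total for h, p in unnormalized.items()}
print(posteriors)  # {'hypothesis A': 0.75, 'hypothesis B': 0.25}
```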
Beliefs: conclusion
I think most rationalists (Bayesians, LessWrong people) are “confused” between the 2nd level and the 1st level, even though they have some 3rd level tools.
Eliezer Yudkowsky is “confused” between the 1st level and the 3rd level: he likes level 1 ideas (e.g. “map is not the territory”), but has a bunch of level 3 ideas (“some maps are the territory”) about math, probability, ethics, decision theory, Security Mindset...
2. Ontology and reality
I think there are three levels of exploring the relationship between ontologies and reality:
You think that an ontology describes the essence of reality.
You study how different ontologies describe different aspects of reality.
You study types of changes (D): how ontologies change/get changed by some other concept Y. D and Y need to be important even outside of the topic of (pure) ontology.
Y can be “human minds” or simply “objects”. D can be “matching/not matching” or “creating a structure” (two very basic, but generally important processes). You get Kant’s “Copernican revolution” (reality needs to match your basic ontology, otherwise information won’t reach your mind: but there are different types of “matching” and transcendental idealism defines one of the most complicated ones) and Ontic Structural Realism (ontology is not about things, it’s about structures created by things) respectively.
On what level are you? Have you studied ontologies/epistemologies without judgement? What are the most interesting ontologies/epistemologies you can think of?
3. Philosophy overall
I think there are three levels of doing philosophy in general:
You try to directly prove an idea in philosophy using specific philosophical tools.
You study types of philosophical ideas.
You study types of changes (D): how philosophical ideas change/get changed by some other thing Y. D and Y need to be important even outside of (pure) philosophy.
For example, Y can be:
Semantic connections. (my weak philosophical attempts are here!)
Subjective experience (qualia).
I think people did a lot of 3rd level philosophy, but we haven’t fully committed to the 3rd level yet. We are used to treating philosophy as a closed system, even when we make significant steps outside of that paradigm.
Exploring ethics
1. Commitment to values
I think there are three levels of values:
Subjective values. You care only about things inside of your mind. For example, do you feel good or not?
Real values. You treat your values as particular objects in reality.
Semantic values. You care about types of changes (D): how your values change/get changed by reality (Y). Your value can be expressed as a combination of the three components: “a real thing + its meaning + changes”.
Example of a semantic value: you care about your friendship with someone. You will try to preserve the friendship. But in a limited way: you’re ready that one day the relationship may end naturally (your value may “die” a natural death). Semantic values are temporal and path-dependent. Semantic values are like games embedded in reality: you want to win the game without breaking the rules.
2. Ethics
I think there are three levels of analyzing ethics:
You analyze norms of specific communities and desires of specific people. That’s quite easy: you are just learning facts.
You analyze types of norms and desires. You are lost in contradictory implications, interpretations and generalizations of people’s values. You have a meta-ethical paralysis.
You study types of changes (D): how norms and desires change/get changed by some other thing Y. D and Y need to be important even outside of (purely) ethical context.
Ethics: tasks and games
For example, Y can be “tasks, games, activities” and D can be “breaking/creating symmetries”. You can study how norms and desires affect properties of particular activities.
Let’s imagine an Artificial Intelligence or a genie who fulfills our requests (it’s a “game” between us). We can analyze how bad actions of the genie can break important symmetries of the game. Let’s say we asked it to make us a cup of coffee:
If it killed us after making the coffee, we can’t continue the game. And we ended up with less than we had before. And we wouldn’t have made the request if we had known that was going to happen. And the game can’t be “reversed”: the players are dead.
If it put us under mind control, we can’t affect the game anymore (and it gained 100% control over the game). If it placed us into a delusion, then the state of the game can be arbitrarily affected (by dissolving the illusion). And it depends on perspective.
If it made us addicted to coffee, we can’t stop or change the game anymore. And the AI/genie drastically changed the nature of the game without our consent. It changed how the “coffee game” relates to all other games, skewed the “hierarchy of games”.
Those are all “symmetry breaks”. And such symmetry breaks are bad in most of the tasks.
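Here’s how the symmetry checks could look as a checklist (a toy sketch; the flags are my own, chosen to match the failures above):

```python
# Flag a fulfilled request if it destroys the symmetries of the "game":
# continuation, player control, reversibility, informed consent.
from dataclasses import dataclass

@dataclass
class Outcome:
    players_alive: bool        # can the game continue?
    players_in_control: bool   # can the players still affect the game?
    reversible: bool           # can the change be undone?
    informed_consent: bool     # would the players have agreed, knowing this?

def symmetry_breaks(o: Outcome) -> list[str]:
    checks = [
        (o.players_alive,      "the game cannot continue"),
        (o.players_in_control, "players lost control of the game"),
        (o.reversible,         "the change cannot be reversed"),
        (o.informed_consent,   "players would not have agreed"),
    ]
    return [msg for ok, msg in checks if not ok]

coffee_then_kill = Outcome(False, False, False, False)
print(symmetry_breaks(coffee_then_kill))
```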
Ethics: Categorical Imperative
With Categorical Imperative, Kant explored a different choice of Y and D. Now Y is “roles of people”, “society” and “concepts”; D is “universalization” and “becoming incoherent/coherent” and other things.
Ethics: Preferences
If Y is “preferences” and D is “averaging”, we get Preference utilitarianism. (Preferences are important even outside of ethics and “averaging” is important everywhere.) But this idea is too “low-level” to use in analysis of ethics.
However, if Y is “versions of an abstract preference” and D is “splitting a preference into versions” and “averaging”, then we get a high-level analog of preference utilitarianism. For example, you can take an abstract value such as Bodily autonomy and try to analyze the entirety of human ethics as an average of versions (specifications) of this abstract value.
Preference utilitarianism reduces ethics to an average of micro-values, the idea above reduces ethics to an average of a macro-value.
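The two reductions differ only in what gets averaged. A toy calculation (all values invented for illustration):

```python
# Preference utilitarianism: average over many micro-values (many people).
micro_values = {"alice: more parks": 0.7, "bob: more parks": 0.2}
print(sum(micro_values.values()) / len(micro_values))  # 0.45

# The high-level analog: average over versions of one abstract macro-value.
bodily_autonomy_versions = {
    "medical consent": 0.9,
    "freedom of movement": 0.8,
    "dress as you like": 0.7,
}
print(sum(bodily_autonomy_versions.values()) / len(bodily_autonomy_versions))  # 0.8
```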
Ethics: conclusion
So, what’s the point of the 3rd level of analyzing ethics? The point is to find objective sub-structures in ethics where you can apply deduction to exclude the most “obviously awful” and “maximally controversial and irreversible” actions. The point is to “derive” ethics from much more broad topics, such as “meaningful games” and “meaningful tasks” and “coherence of concepts”.
I think:
Moral philosophers and Alignment researchers are ignoring the 3rd level. People are severely underestimating how much they know about ethics.
Acknowledging the 3rd level doesn’t immediately solve Alignment, but it can “solve” ethics or the discourse around ethics. Empirically: just study properties of tasks and games and concepts!
Eliezer Yudkowsky has limited 3rd level understanding of meta-ethics (“Abstracted Idealized Dynamics”, “Morality as Fixed Computation”, “The Bedrock of Fairness”) but misses that he could make his idea more broad.
Particularism (in ethics and reasoning in general) could lead to the 3rd level understanding of ethics.
Exploring perception
1. Properties
There are three levels of looking at properties of objects:
Inherent properties. You treat objects as having more or less inherent properties. E.g. “this person is inherently smart”
Meta-properties. You treat any property as universal. E.g. “anyone is smart under some definition of smartness”
Semantic properties. You treat properties only as relatively attached to objects. You focus on types of changes (D): how properties and their interpretations change/get changed by some other thing Y. You “reduce” properties to D and Y. E.g. “anyone can be a genius or a fool under certain important conditions” or “everyone is smart, but in a unique and important way”
2. Commitment to experiences and knowledge
I think there are three levels of commitment to experiences:
You’re interested in particular experiences.
You want to explore all possible experiences.
You’re interested in types of changes (D): how your experience changes/gets changed by some other thing Y. D and Y need to be important even outside of experience.
So, on the 3rd level you care about interesting ways (D) in which experiences correspond to reality (Y).
3. Experience and morality
I think there are three levels of investigating the connection between experience and morality:
You study how experience causes us to do good or bad things.
You study all the different experiences “goodness” and “badness” causes in us.
You study types of changes (D): how your experience changes/gets changed by some other thing Y. D and Y need to be important even outside of experience. But related to morality anyway.
For example, Y can be “[basic] properties of concepts” and D can be “matches / mismatches [between concepts and actions towards them]”. You can study how experience affects properties of concepts which in turn bias actions. An example of such analysis: “loving a sentient being feels fundamentally different from eating a sandwich. food taste is something short and intense, but love can be eternal and calm. this difference helps to not treat other sentient beings as something disposable”
I think the existence of the 3rd level isn’t acknowledged much. Most versions of moral sentimentalism are 2nd level at best. Epistemic Sentimentalism can be 3rd level in the best case.
Exploring cognition
1. Patterns
I think there are three levels of [studying] patterns:
You study particular patterns (X). You treat patterns as objective configurations in reality.
You study all possible patterns. You treat patterns as subjective qualities of information, because most patterns are fake.
You study types of changes (D): how patterns change/get changed by some other thing Y. D and Y need to be important even outside of (explicit) pattern analysis. You treat a pattern as a combination of the three components: “X + Y + D”.
For example, Y can be “pieces of information” or “contexts”: you can study how patterns get discarded or redefined (D) when new information gets revealed/new contexts get considered.
You can study patterns which are “objective”, but exist only in a limited context. For example, think about your friend’s bright personality (personality = a pattern). It’s an “objective” pattern, and yet it exists only in a limited context: the pattern would dissolve if you compared your friend to all possible people. Or if you saw your friend in all possible situations they could end up in. Your friend’s personality has some basis in reality (X), has a limited domain of existence (Y) and the potential for change (D).
2. Patterns and causality
I think there are three levels in the relationship between patterns and causality. I’m going to give examples about visual patterns:
You learn which patterns are impossible due to local causal processes. For example: “I’m unlikely to see a big tower made of eggs standing on top of each other”. It’s just not a stable situation due to very familiar laws of physics.
You learn statistical patterns (correlations) which can have almost nothing to do with causality. For example: “people like to wear grey shirts”.
You learn types of changes (D): how patterns change/get changed by some other thing Y. D and Y need to be important even outside of (explicit) pattern analysis. And related to causality.
Y can be “basic properties of images” and “basic properties of patterns”; D can be “sharing properties” and “keeping the complexity the same”. In simpler words:
On the 3rd level you learn patterns which have strong connections to other patterns and basic properties of images. You could say such patterns are created/prevented by “global” causal processes. For example: “I’m unlikely to see a place fully filled with dogs. dogs are not people or birds or insects, they don’t create such crowds or hordes”. This is very abstract, connects to other patterns and basic properties of images.
Causality: implications for Machine Learning
I think...
It’s likely that Machine Learning models don’t learn 3rd level patterns as well as they could, as sharply as they could.
Machine Learning models should be 100% able to learn 3rd level patterns. It shouldn’t require any specific data.
Learning/comparing level 3 patterns is interesting enough on its own. It could be its own area of research. But we don’t apply statistics/Machine Learning to try to mine those patterns. This may be a missed opportunity for humans.
3. Cognitive processes
Suppose you want to study different cognitive processes, skills, types of knowledge. There are three levels:
You study particular cognitive processes.
You study types (qualities) of cognitive processes. And types of types (classifications).
You study types of changes (D): how cognitive processes change/get changed by some other thing Y. D and Y need to be important even without the context of cognitive processes.
For example, Y can be “fundamental configurations / fundamental objects” and D can be “finding a fundamental configuration/object in a given domain”. You can “reduce” different cognitive processes to those Y and D: (names of the processes below shouldn’t be taken 100% literally)
^(1 “fundamental” means “VERY widespread in a certain domain”)
Causal reasoning learns fundamental configurations of fundamental objects in the real world. So you can learn stuff like “this abstract rule applies to most objects in the world”.
Symbolic reasoning learns fundamental configurations of fundamental objects in your “concept space”. So you can learn stuff like “‘concept A containing concept B’ is an important pattern” (see set relations).
Correlational reasoning learns specific configurations of specific objects.
Mathematical reasoning learns specific configurations of fundamental objects. So you can build arbitrary structures with abstract building blocks.
Self-aware reasoning can transform fundamental objects into specific objects. So you can think thoughts like, for example, “maybe I’m just a random person with random opinions” (you consider your perspective as non-fundamental) or “maybe the reality is not what it seems”.
I know, this looks “funny”, but I think all of this could be formalized easily enough. Isn’t that a natural way to study types of reasoning? Just ask what knowledge a certain type of reasoning learns!
Exploring theories
1. Science
I think there are three ways of doing science:
You predict a specific phenomenon.
You study types of phenomena. (qualities of phenomena)
You study types of changes (D): how the phenomenon changes/gets changed by some other thing Y. D and Y need to be important even outside of this phenomenon.
Imagine you want to explain combustion (why/how things burn):
You try to predict combustion. This doesn’t work, because you already know “everything” about burning and there are many possible theories. You end up making things up because there’s not enough new data.
You try to compare combustion to other phenomena. You end up fantasizing about imaginary qualities of the phenomenon. At this level you get something like theories of “classical elements” (fantasies about superficial similarities).
You find or postulate a new thing (Y) which affects/gets affected (D) by combustion. Y and D need to be important in many other phenomena. If Y is “types of matter” and D is “releasing / absorbing”, this gives you Phlogiston theory. If Y is “any matter” and D is “conservation of mass” and “any transformations of matter”, you get Lavoisier’s theory. If Y is “small pieces of matter (atoms)” and D is “atoms hitting each other”, you get Kinetic theory of gases.
So, I think phlogiston theory was a step in the right direction, but it failed because the choice of Y and D wasn’t abstract enough.
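The kinetic-theory pair can even be compressed into one standard formula: “small pieces of matter” (Y) “hitting each other” (D), and hitting the walls, yields macroscopic pressure.

```latex
% Pressure of an ideal gas from the kinetic theory: N molecules of mass m with
% mean squared speed <v^2> in volume V.
P V = \tfrac{1}{3} N m \langle v^2 \rangle
```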
I think most significant scientific breakthroughs require level 3 ideas. Partially “by definition”: if a breakthrough is not “level 3”, then it means it’s contained in a (very) specific part of reality.
2. Math
I think there are three ways of doing math:
You explore specific mathematical structures.
You explore types of mathematical structures. And types of types. And typologies. At this level you may get something like Category theory.
You study types of changes (D): how equations change/get changed by some other thing Y. D and Y need to be important even outside of (explicit) math.
Mathematico-philosophical insights
Let’s look at math through the lens of the 3rd level:
Let Y be “infinitely small building blocks” and “infinitely diminishing building blocks”; let D be “becoming infinitely small” and “reaching the limit”. Those Y and D matter even outside of math. We got Calculus.
Let Y be “quasi-physical materials” and D be “stretching, bending etc.”. Those Y and D matter even outside of math. We got Topology.
Let Y be “probability”. That was a completely new concept in all domains of knowledge. We got Probability theory.
Let Y be “different scales” and “different parts”; let D be “(not) repeating”. We got Fractals and Recursion.
Let Y be “directed things” and D be “compositions of movements”. We got Vectors.
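To make the first pair concrete: “infinitely diminishing building blocks” (Y) “reaching the limit” (D) is literally the definition of the derivative.

```latex
% The derivative: a diminishing building block h, taken to its limit.
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
```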
All concepts above are “3rd level”. But we can classify them, creating three new levels of exploration (yes, this is recursion!). Let’s do this. I think there are three levels of mathematico-philosophical concepts:
Concepts that change the properties of things we count. (e.g. topology, fractals, graph theory)
To define “meta ideas”, the most abstract of such concepts, we need to think about many pairs of “Y, D” simultaneously. This is the most speculative part of the post. Remember, you can treat those speculations simply as sci-fi ideas.
Each pair of abstract concepts (Y, D) defines a “language” for describing reality. And there’s a meta-language which connects all those languages. Or rather there’s many meta-languages. Each meta-language can be described by a pair of abstract concepts too (Y, D).
^(Instead of “languages” I could use the word “models”. But I wanted to highlight that those “models” don’t have to be formal in any way.)
I think the idea of “meta-languages” can be used to analyze:
Consciousness. You can say that consciousness is “made of” multiple abstract interacting languages. On one hand it’s just a trivial description of consciousness, on another hand it might have deeper implications.
Qualia. You can say that qualia are “made of” multiple abstract interacting languages. On one hand this is a trivial idea (“qualia are the sum of your associations”), on another hand this formulation adds important specific details.
The ontology of reality. You can argue that our ways to describe reality (“physical things” vs. purely mathematical concepts, subjective experience vs. physical world, high-level patterns vs. complete reductionism, physical theory vs. philosophical ontology) all conflict with each other and lead to paradoxes when taken to the extreme, but can’t exist without each other. Maybe they are all intertwined?
Meta-ethics. You can argue that concepts like “goodness” and “justice” can’t be reduced to any single type of definition. So, you can try to reduce them to a synthesis of many abstract languages. See G. E. Moore’s ideas about indefinability: the naturalistic fallacy, the open-question argument.
According to the framework, ideas about “meta-languages” define the limit of conceivable ideas.
If you think about it, it’s actually a quite trivial statement: “meta-models” (consisting of many normal models) are the limit of conceivable models. Your entire conscious mind is such a “meta-model”. If no model works for describing something, then a “meta-model” is your last resort. On one hand, “meta-models” are a very trivial idea^(1); on the other hand, nobody ever cared to explore the full potential of the idea.
^(1 for example, we have a “meta-model” of physics: a combination of two wrong theories, General Relativity and Quantum Mechanics.)
Nature of percepts
I talked about qualia in general. Now I just want to throw out my idea about the nature of particular percepts.
There are theories and concepts which link percepts to “possible actions” and “intentions”: see Affordance. I like such ideas, because I like to think about types of actions.
So I have a variation of this idea: I think that any percept gets created by an abstract dynamic (Y, D) or many abstract dynamics. Any (important) percept corresponds to a unique dynamic. I think abstract dynamics bind concepts.
^(But I have only started to think about this. I share it anyway because I think it follows from all the other ideas.)
P.S.
Thank you for reading this.
If you want to discuss the idea, please focus on the idea itself and its particular applications. Or on exploring particular topics!
(draft of a future post)
In terms of exploring argumentation and reasoning, our culture is at level 1. If the thought “stupid arguments don’t exist” has never crossed your mind, then you are at level 1: you haven’t explored arguments and reasoning for the sake of it; you immediately jumped to assuming “The Only True Way To Reason” (be it your intuition, the scientific method, a particular ideology or Bayesian epistemology). You haven’t stepped outside of your perspective a single time. Almost everyone is at level 1. Eliezer Yudkowsky is at level 3, but in a much narrower field: Yudkowsky explored rationality with the specific goal/bias of AI safety. However, overall Eliezer is at level 1 too: he never studied human reasoning outside of what he thinks is “correct”.
I think this is kind of bad. We are at level 1 in the main departments of human intelligence and human culture. Two levels below our true potential.
2. Commitment to goals
I think there are three levels of commitment to goals.
Level 1. You have a specific selfish goal.
“I want to get a lot of money” or “I want to save my friends” or “I want to make a ton of paperclips”, for example.
Level 2. You have an abstract goal. But this goal doesn’t imply much interaction with the real world.
“I want to maximize everyone’s happiness” or “I want to prevent (X) disaster”, for example. This is a broad goal, but it doesn’t imply actually learning and caring about anyone’s desires (until the very end). Rationalists are at this level of commitment.
Level 3. You use particular goals to guide your abstract goals.
Some political activists are at this level of commitment. (But please, don’t bring CW topics here!)
3. Commitment to updating
“Commitment to updating” is the ability to re-start your exploration from square one. I think there are three levels to it.
Level 1. No updating. You never change ideas.
You just keep piling up your ideas into a single paradigm your entire life. You turn beautiful ideas into ugly ones so they fit with all your previous ideas.
Level 2. Updating. You change ideas.
When you encounter a new beautiful idea, you are ready to reformulate your previous knowledge around the new idea.
Level 3. Updating with “check points”. You change ideas, but you use old ideas to prime new ones.
When you explore an idea, you mark some “check points” which you reached with that idea. When you ditch the idea for a new one, you still keep in mind the check points you marked. And use them to explore the new idea faster.
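(A toy sketch of the difference between level 2 and level 3 updating. I’m modeling “exploring an idea” as a search that yields insights; the names `explore` and `checkpoints` are my own illustration, not an established algorithm.)

```python
# Toy model of "updating with check points": when we ditch an idea,
# we keep the insights it reached and use them to prime the next idea.

def explore(idea, seeds):
    # Pretend-exploration: an idea converts seed insights into new ones.
    # Here an "insight" is just a string tagged with the idea that found it.
    return {f"{idea}:{seed}" for seed in seeds}

checkpoints = {"origin"}  # insights that survived all previous ideas
for idea in ["idea_A", "idea_B", "idea_C"]:
    # Level 2 updating would reset here: checkpoints = {"origin"}
    # Level 3 updating keeps the old check points as seeds:
    checkpoints |= explore(idea, checkpoints)
    print(idea, "->", len(checkpoints), "accumulated check points")
```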
Science
4.1 Commitment and theory-building
I think there are three levels of commitment in theory-building.
Level 1.
You build your theory using only “almost facts”. I.e. you come up with “trivial” theories which are almost indistinguishable from the things we already know.
Level 2.
You build your theory on speculations. You “fantasize”, inventing properties of your idea which are important only to you or your field.
Level 3.
You build your theory on speculations. But those speculations are important even outside of your field.
I think Eliezer Yudkowsky and LW did theory-building of the 3rd level. A bunch of LW ideas are philosophically important even if you disagree with Bayesian epistemology (Eliezer’s view on ethics and math, logical decision theories and some Alignment concepts).
4.2 Commitment to explaining a phenomenon
I think there are three types of commitment in explaining a phenomenon.
Level 1.
You just want to predict the phenomenon. But many, many possible theories can predict the phenomenon, so you need something more.
Level 2.
You compare the phenomenon to other phenomena and focus on its qualities.
That’s where most theories go wrong: people become obsessed with their own fantasies about the qualities of a phenomenon.
Level 3.
You focus on dynamics which connect this phenomenon to other phenomena. You focus on overlapping implications of different phenomena. 3rd level is needed for any important scientific breakthrough. For example:
Imagine you want to explain combustion (why/how things burn). On one hand you already “know everything” about the phenomenon, so what do you even do? Level 1 doesn’t work. So, you try to think about qualities of burning, types of transformations, types of movement… but that won’t take you anywhere. Level 2 doesn’t work either. The right answer: you need to think not about qualities of transformations and movements, but about dynamics (conservation of mass, kinetic theory of gases) which connect different types of transformations and movements. Level 3 works.
Epistemology pt. 1
5. Commitment and epistemology
I think there are three levels of commitment in epistemology.
Level 1. You assume the primary reality of the physical world. (Physicism)
Take statements “2 + 2 = 4” and “God exists”. To judge those statements, a physicist is going to ask “Do those statements describe reality in a literal way? If yes, they are true.”
Level 2. You assume the primary reality of statements of some fundamental language. (Descriptivism)
To judge statements, a descriptivist is going to ask “Can those statements be expressed in the fundamental language? If yes, they are true.”
Level 3. You assume the primary reality of semantic connections between statements of languages. And the primary reality of some black boxes which create those connections. (Connectivism) You assume that something physical shapes the “language reality”.
To judge statements, a connectivist is going to ask “Do those statements describe an important semantic connection? If yes, they are true.”
...
Recap. Physicist: everything “physical” exists. Descriptivist: everything describable exists. Connectivist: everything important exists. The physicist can be too specific and the descriptivist too generous. (This pattern of being “too specific” or “too generous” repeats for all commitment types.)
Thinking at the level of semantic connections should be natural to people (because they use natural language and… neural nets in their brains!). And yet this idea is extremely alien to people epistemology-wise.
Implications for rationality
In general, rationalists are “confused” between level 1 and level 2. I.e. they often treat level 2 very seriously, but aren’t fully committed to it.
Eliezer Yudkowsky is “confused” between level 1 and level 3. I.e. Eliezer has a lot of “level 3 ideas”, but doesn’t apply level 3 thinking to epistemology in general.
On one hand, Eliezer believes that “map is not the territory”. (level 1 idea)
On another hand, Eliezer believes that math is an “objective” language shaped by the physical reality. (level 3 idea)
Similarly, Eliezer believes that human ethics are defined by some important “objective” semantic connections (which can evolve, but only to a degree). (level 3)
“Logical decision theories” treat logic as something created by connections between black boxes. (level 3)
When you do Security Mindset, you should make not only “correct”, but beautiful maps. Societal properties of your map matter more than your opinions. (level 3)
So, Eliezer has a bunch of ideas which can be interpreted as “some maps ARE the territory”.
6. Commitment and uncertainty
I think there are three levels of commitment in doubting one’s own reasoning.
Level 1.
You’re uncertain about superficial “correctness” of your reasoning. You worry if you missed a particular counter argument. Example: “I think humans are dumb. But maybe I missed a smart human or applied a wrong test?”
Level 2.
You un-systematically doubt your assumptions and definitions. Maybe even your inference rules a little bit (see “inference objection”). Example: “I think humans are dumb. But what is a ‘human’? What is ‘dumb’? What is ‘is’? And how can I be sure of anything at all?”
Level 3.
You doubt the semantic connections (e.g. inference rules) in your reasoning. You consider particular dynamics created by your definitions and assumptions. “My definitions and assumptions create this dynamic (not presented in all people). Can this dynamic exploit me?”
Example: “I think humans are dumb. But can my definition of ‘intelligence’ exploit me? Can my pessimism exploit me? Can this be an inconvenient way to think about the world? Can my opinion turn me into a fool even if I’m de facto correct?”
...
Level 3 is like “security mindset” applied to your own reasoning. LW rationality mostly teaches against it, suggesting that you always take your smallest opinions at face value as “the truest thing you know”. With some exceptions, such as “ethical injunctions”, “radical honesty”, “black swan bets” and “security mindset”.
Epistemology pt. 2
7. Commitment to understanding/empathy
I think there are three levels of commitment in understanding your opponent.
Level 1.
You can pass the Ideological Turing Test in a superficial way (you understand the structure of the opponent’s opinion).
Level 2. “Telepathy”.
You can “inhabit” the emotions/mindset of your opponent.
Level 3.
You can describe the opponent’s position as a weaker version/copy of your own position. And additionally you can clearly imagine how your position could turn out to be “the weaker version/copy” of the opponent’s position. You find a balance between telepathy and “my opinion is the only one which makes sense!”
8. Commitment to “resolving” problems
I think there are three levels of commitment in “resolving” problems.
Level 1.
You treat a problem as a puzzle to be solved by Your Favorite True Epistemology.
Level 2.
You treat a problem as a multi-layered puzzle which should be solved on different levels.
Level 3.
You don’t treat a problem as a self-contained puzzle. You treat it as a “symbol” in the multitude of important languages. You can solve it by changing its meaning (by changing/exploring the languages).
You can apply this type of thinking to the Unexpected hanging paradox.
Alignment pt. 1
9.1 Commitment to morality
I think there are three levels of commitment in morality.
Level 1. Norms, desires.
You analyze norms of specific communities and desires of specific people. That’s quite easy: you are just learning facts.
Level 2. Ethics and meta-ethics.
You analyze similarities between different norms and desires. You get to pretty abstract and complicated values such as “having agency, autonomy, freedom; having an interesting life; having an ability to form connections with other people”. You are lost in contradictory implications, interpretations and generalizations of those values. You have a (meta-)ethical paralysis.
Level 3. “Abstract norms”.
You analyze similarities between implications of different norms and desires. You analyze dynamics created by specific norms. You realize that the most complicated values are easily derivable from the implications of the simplest norms. (Not without some bias, of course, but still.)
I think moral philosophers and Alignment researchers are seriously dropping the ball by ignoring the 3rd level. Acknowledging the 3rd level doesn’t immediately solve Alignment, but it can pretty much “solve” ethics (with a bit of effort).
9.2 Commitment to values
I think there are three levels of values.
Level 1. Inside values (“feeling good”).
You care only about things inside of your mind. For example, do you feel good or not?
Level 2. Real values.
You care about things in the real world. Even though you can’t care about them directly. But you make decisions to not delude yourself and not “simulate” your values.
Level 3. Semantic values.
You care about elements of some real system. And you care about proper dynamics of this system. For example, you care about things your friend cares about. But it’s also important to you that your friend is not brainwashed, not controlled by you. And you are ready that one day your friend may stop caring about anything. (Your value may “die” a natural death.)
3rd level is the level of “semantic values”. They are not “terminal values” in the usual sense. They can be temporal and history-dependent.
9.3 Commitment and research interest
So, you’re interested in ways in which an AI can go wrong. What specifically can you be interested in? I think there are three levels to it.
Level 1. In what ways some AI actions are bad?
You classify AI bugs into types. For example, you find “reward hacking” type of bugs.
Level 2. What qualities of AIs are good/bad?
You classify types of bugs into “qualities”. You find such potentially bad qualities as “AI doesn’t care about the real world” and “AI doesn’t allow us to fix it (corrigibility)”.
Level 3. What bad dynamics are created by bad actions of AI? What good dynamics are destroyed?
Assume AI turned humanity into paperclips. What’s actually bad about that, beyond the very first obvious answer? What good dynamics did this action destroy? (Some answers: it destroyed the feedback loop, the connection between the task and its causal origin (humanity), the value of paperclips relative to other values, the “economic” value of paperclips, the ability of paperclips to change their value.)
On the 3rd level you classify different dynamics. I think people completely ignore the 3rd level. In both Alignment and moral philosophy. 3rd level is the level of “semantic values”.
Alignment pt. 2
10. Commitment to Security Mindset
I think Security Mindset has three levels of commitment.
Level 1. Ordinary paranoia.
You have great imagination, you can imagine very creative attacks on your system. You patch those angles of attack.
Level 2. Security Mindset.
You study your own reasoning about the safety of the system. You check if your assumptions are right or wrong. Then, you try to delete as many assumptions as you can. Even if they seem correct to you! You also delete anomalies of the system even if they seem harmless. You try to simplify your reasoning about the system seemingly “for the sake of it”.
Level 3.
You design a system which would be safe even in a world with changing laws of physics and mathematics. Using some bias, of course (otherwise it’s impossible).
Humans, or at least idealized humans, are “level 3 safe”. All (or almost all) current approaches to Alignment don’t give you a “level 3 safe” AI.
11. Commitment to Alignment
I think there are three levels of commitment a (mis)aligned AI can have. Alternatively, those are the three (or two) levels at which you can try to solve the Alignment problem.
Level 1.
AI has a fixed goal or a fixed method of finding a goal (which likely can’t be Aligned with humanity). It respects only its own agency. So, ultimately it does everything it wants.
Level 2.
AI knows that different ethics are possible and is completely uncertain about ethics. AI respects only other people’s agency. So, it doesn’t do anything at all (except preventing, a bit lazily, 100% certain destruction and oppression). Or it requires an infinite chain of permissions:
Am I allowed to calculate “2 + 2”?
Am I allowed to calculate “2 + 2” even if it leads to a slight change of the world?
Am I allowed to calculate “2 + 2” even if it leads to a slight change of the world which you can’t fully comprehend even if I explain it to you?
...
Wait, am I allowed to ask those questions? I’m already manipulating you by boring you to death. I can’t even say anything.
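(A minimal sketch of why the level 2 AI freezes, assuming every action, including asking, itself needs a permission. The function and the depth cutoff are purely illustrative.)

```python
# Level 2 AI: every action needs permission, but asking for permission
# is itself an action that slightly changes the world... and so on.

def try_to_act(action, depth=0, max_depth=4):
    if depth == max_depth:
        print("...the regress never bottoms out, so the AI does nothing.")
        return
    print("  " * depth + f"to do [{action}], first ask: may I do [{action}]?")
    try_to_act(f"ask about: {action}", depth + 1, max_depth)

try_to_act('calculate "2 + 2"')
```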
Level 3.
AI can respect both its own agency and the agency of humanity. AI finds a way to treat its agency as the continuation of the agency of people. AI makes sure it doesn’t create any dynamic which couldn’t be reversed by people (unless there’s nothing else to do). So, AI can both act and be sensitive to people.
Implications for Alignment research
I think a fully safe system exists only on level 3. The safest system is the one which understands what “exploitation” means, so it never willingly exploits its rewards in any way. Humans are an example of such a system.
I think alignment researchers are “confused” between level 1 and level 3. They try to fix different “exploitation methods” (ways AI could exploit its rewards) instead of making the AI understand what “exploitation” means.
I also think this is the reason why alignment researchers don’t cooperate much, pushing in different directions.
Perception
12. Commitment to properties
Commitments exist even on the level of perception. There are three levels of properties to which your perception can react.
Level 1. Inherent properties.
You treat objects as having more or less inherent properties. “This person is inherently smart.”
Level 2. Meta-properties.
You treat any property as universal. “Anyone is smart under some definition of smartness.”
Level 3. Semantic properties.
You treat properties only as relatively attached to objects: different objects form a system (a “language”) where properties get distributed between them and differentiated. “Everyone is smart, but in a unique way. And those unique ways are important in the system.”
13.1 Commitment to experiences and knowledge
I think there are three levels of commitment to experiences.
Level 1.
You’re interested in particular experiences.
Level 2.
You want to explore all possible experiences.
Level 3.
You’re interested in real objects which produce your experiences (e.g. your friends): you’re interested in what knowledge “all possible experiences” could reveal about them. You want to know where physical/mathematical facts and experiences overlap.
13.2 Commitment to experience and morality
I think there are three levels of investigating the connection between experience and morality.
Level 1.
You study how experience causes us to do good or bad things.
Level 2.
You study all the different experiences “goodness” and “badness” causes in us.
Level 3.
You study dynamics created by experiences, which are related to morality. You study implications of experiences. For example: “loving a sentient being feels fundamentally different from eating a sandwich. food taste is something short and intense, but love can be eternal and calm. this difference helps to not treat other sentient beings as something disposable”
I think the existence of the 3rd level isn’t acknowledged much. And yet it could be important for alignment. Most versions of moral sentimentalism are 2nd level at best. Epistemic Sentimentalism can be 3rd level.
Final part
Specific commitments
You can ponder your commitment to specific things.
Are you committed to information?
Imagine you could learn anything (and forget it if you want). Would you be interested in learning different stuff more or less equally? You could learn something important (e.g. the most useful or the most abstract math), but you also could learn something completely useless—such as the life story of every ant who ever lived.
I know, this question is hard to make sense of: of course, anyone would like to learn everything/almost everything if there was no downside to it. But if you have a positive/negative commitment about the topic, then my question should make some sense anyway.
Are you committed to people?
Imagine you got extra two years to just talk to people. To usual people on the street or usual people on the Internet.
Would you be bored hanging out with them?
My answers: >!Maybe I was committed to information in general as a kid. Then I became committed to information related to people, produced by people, known by people.!<
My inspiration for writing this post
I encountered a bunch of people who are more committed to exploring ideas (and taking ideas seriously) than usual. More committed than most rationalists, for example.
But I felt those people lack something:
They are able to explore ideas, but don’t care about that anymore. They care only about their own clusters of idiosyncratic ideas.
They have very vague goals which are compatible with any specific actions.
They don’t care if their ideas could even in principle matter to people. They have “disconnected” from other people, from other people’s context (through some level of elitism).
When they acknowledge you as “one of them”, they don’t try to learn your ideas or share their ideas or argue with you or ask your help for solving a problem.
So, their commitment remains very low. And they are not “committed” to talking.
Conclusion
If you have a high level of commitment (3rd level) at least to something, then we should find a common language. You may even be like a sibling to me.
Thank you for reading this post. 🗿
Cognition
14.1 Studying patterns
I think there are three levels of commitment to patterns.
You study particular patterns.
You study all possible patterns: you study qualities of patterns.
You study implications of patterns. You study dynamics of patterns: how patterns get updated or destroyed when you learn new information.
14.2 Patterns and causality
I think there are three levels in the relationship between patterns and causality. I’m going to give examples about visual patterns.
Level 1.
You learn which patterns are impossible due to local causal processes.
For example: “I’m unlikely to see a big tower made of eggs standing on top of each other”. It’s just not a stable situation due to very familiar laws of physics.
Level 2.
You learn statistical patterns (correlations) which can have almost nothing to do with causality.
For example: “people like to wear grey shirts”.
Level 3.
You learn patterns which have a strong connection to other patterns and basic properties of images. You could say such patterns are created/prevented by “global” causal processes.
For example: “I’m unlikely to see a place fully filled with dogs. dogs are not people or birds or insects, they don’t create such crowds”. This is very abstract, connects to other patterns and basic properties of images.
Implications for Machine Learning
I think...
It’s likely that Machine Learning models don’t learn level 3 patterns as well as they could, as sharply as they could.
Machine Learning models should be 100% able to learn level 3 patterns. It shouldn’t require any specific data.
Learning/comparing level 3 patterns is interesting enough on its own. It could be its own area of research. But we don’t apply statistics/Machine Learning to try to mine those patterns. This may be a missed opportunity for humans.
I think researchers are making a blunder by not asking “what kinds of patterns exist? what patterns can be learned in principle?” (I’m not talking about the universal approximation theorem.)
15. Cognitive processes
Suppose you want to study different cognitive processes, skills, types of knowledge. There are three levels:
You study particular cognitive processes.
You study qualities of cognitive processes.
You study dynamics created by cognitive processes. How “actions” of different cognitive processes overlap.
I think you can describe different cognitive processes in terms of patterns they learn. For example:
Causal reasoning learns abstract configurations of abstract objects in the real world. So you can learn stuff like “this abstract rule applies to most objects in the world”.
Symbolic reasoning learns abstract configurations of abstract objects in your “concept space”. So you can learn stuff like “‘concept A contains concept B’ is an important pattern”.
Correlational reasoning learns specific configurations of specific objects.
Mathematical reasoning learns specific configurations of abstract objects. So you can build arbitrary structures with abstract building blocks.
Self-aware reasoning can transform abstract objects into specific objects. So you can think thoughts like, for example, “maybe I’m just a random person with random opinions”.
I think all of this could be formalized easily enough.
Meta-level
Can you be committed to exploring commitment?
I think yes.
One thing you can do is to split topics into sub-topics and raise your commitment in every particular sub-topic. Vaguely similar to gradient descent. That’s what I’ve been doing in this post so far.
Another thing you can do is to apply recursion. You can split any topic into 3 levels of commitment. But then you can split the third level into 3 levels too. So, there’s potentially an infinity of levels of commitment. And there can be many particular techniques for exploiting this fact.
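(A small toy sketch of the recursive numbering this implies: every level-3 node can be split again, so full “commitment addresses” look like 3.3.1, 3.3.2 and so on. The representation is mine, just for illustration.)

```python
# Recursive commitment levels: the 3rd level of any topic can itself
# be split into three levels, giving addresses like 3.3.1.

def split(address):
    # Split the level-3 node at this address into three sub-levels.
    return [address + (i,) for i in (1, 2, 3)]

levels = [(1,), (2,), (3,)]
for _ in range(2):            # refine the deepest level-3 node twice
    levels.extend(split(levels.pop()))

print([".".join(map(str, a)) for a in levels])
# ['1', '2', '3.1', '3.2', '3.3.1', '3.3.2', '3.3.3']
```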
But the main thing is the three levels of “exploring ways to explore commitment”:
You study particular ways to raise commitment.
You study all possible ways to raise commitment.
You study all possible ways through a particular way. You study dynamics and implications which the ways create.
I don’t have enough information or experience for the 3rd level right now.
*A more “formal” version of the draft (it’s a work in progress):*
There are two interpretations of this post, weak and strong.
Weak interpretation:
I describe a framework about “three levels of exploration”. I use the framework to introduce some of my ideas. I hope that the framework will give more context to my ideas, making them more understandable. I simply want to find people who are interested in exploring ideas. Exploring just for the sake of exploring or for a specific goal.
Strong interpretation:
I use the framework as a model of intelligence. I claim that any property of intelligence boils down to the “three levels of exploration”. Any talent, any skill. The model is supposed to be “self-evident” because of its simplicity, it’s not based on direct analysis of famous smart people.
Take the strong interpretation with a lot of grains of salt, of course, because I’m not an established thinker and I haven’t achieved anything intellectual. I just thought “hey, this is a funny little simple idea, what if all intelligence works like this?”, that’s all.
That said, I’ll need to make a couple of extraordinary claims “from inside the framework” (i.e. assuming it’s 100% correct and 100% useful). Just because that’s in the spirit of the idea. Just because it allows me to explore the idea to its logical conclusion. Definitely not because I’m a crazy man. You can treat the most outlandish claims as sci-fi ideas.
A formula of thinking?
Can you “reduce” thinking to a single formula? (Sounds like cringe and crackpottery!)
Can you show a single path of the best and fastest thinking?
Well, there’s an entire class of ideas which attempt to do this in different fields (the first one below is the closest example):
Bayesian epistemology: “epistemology in a single rule” (the rule of updating beliefs)
Utilitarianism, preference utilitarianism: “(meta-)ethics in a single rule”
Baconian method, the prototype of the scientific method: “science in a single rule”
Hegelian dialectic: “philosophy in a single process”
Marxist dialectic: “history in a single process”
My idea is just another attempt at reduction. You don’t have to treat such attempts 100% seriously in order to find value in them. You don’t have to agree with them.
Three levels of exploration
Let’s introduce my framework.
In any topic, there are three levels of exploration:
You study a single X.
You study types of different X. Often I call those types “qualities” of X.
You study types of changes (D): in what ways different X change/get changed by a new thing Y. Y and D need to be important even outside of the (main) context of X.
The point is that at the 2nd level you study similarities between different X directly, but at the 3rd level you study similarities indirectly through new concepts Y and D. The letter “D” means “dynamics”.
I claim that any property of intelligence can be boiled down to your “exploration level”. Any talent, any skill and even more vague things such as “level of intentionality”. I claim that the best and most likely ideas come from the 3rd level. That 3rd level defines the absolute limit of currently conceivable ideas. So, it also indirectly defines the limit of possible/conceivable properties of reality.
You don’t need to trust those extraordinary claims. If the 3rd level simply sounds interesting enough to you and you’re ready to explore it, that’s good enough.
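(The framework is informal, but its moving parts can be written down. A minimal data sketch; the class name and fields are my own labels, and the two example rows simply restate the Calculus and Topology examples from the math section below.)

```python
from dataclasses import dataclass

@dataclass
class Level3Concept:
    x: str       # what is being studied
    y: str       # the new thing, important even outside the context of X
    d: str       # the type of change ("dynamics") linking X and Y
    result: str  # the concept you get

examples = [
    Level3Concept("equations", "infinitely diminishing building blocks",
                  "reaching the limit", "Calculus"),
    Level3Concept("shapes", "quasi-physical materials",
                  "stretching, bending", "Topology"),
]
for c in examples:
    print(f"{c.result}: how {c.x} change/get changed by {c.y} via {c.d}")
```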
Three levels simplified
A vague description of the three levels:
You study objects.
You study qualities of objects.
You study changes of objects.
Or:
You study a particular thing.
You study everything.
You study abstract ways (D) in which the thing is changed by “everything”.
Or:
You study a particular thing.
You study everything.
You study everything through a particular thing.
So yeah, it’s a Hegelian dialectic rip-off. Down below are examples of applying my framework to different topics. You don’t need to read them all, of course.
Exploring debates
1. Argumentation
I think there are three levels of exploring arguments:
You judge arguments as right or wrong. Smart or stupid.
You study types of arguments. Without judgement.
You study types of changes (D): how arguments change/get changed by some new thing Y. (“dynamics” of arguments)
If you want to get a real insight about argumentation, you need to study how (D) arguments change/get changed by some new thing Y. D and Y need to be important even outside of the context of explicit argumentation.
For example, Y can be “concepts”. And D can be “connecting/separating” (a fundamental process which is important in a ton of contexts). You can study in what ways arguments connect and separate concepts.
A simplified political example: a capitalist can tend to separate concepts (“bad things are caused by mistakes and bad actors”), while a socialist can tend to connect concepts (“bad things are caused by systemic problems”). Conflict Vs. Mistake^(1) is just a very particular version of this dynamic. Different manipulations with concepts create different arguments and different points of view. You can study all such dynamics. You can trace arguments back to fundamental concept manipulations. It’s such a basic idea, and yet nobody has done it (Aristotle did something similar 2400 years ago, but only for formal logic).
^(1. I don’t agree with Scott Alexander, by the way.)
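(A toy sketch of what “tracing arguments back to concept manipulations” could look like. The graph representation is my own illustration, not a worked-out method: an argument either connects two concepts or keeps them separate.)

```python
# Toy model: an argument is a manipulation of a small concept graph.
# "Connecting" ties two concepts together; "separating" keeps them apart.

views = {
    "socialist":  {("bad outcomes", "the system")},           # connect
    "capitalist": {("bad outcomes", "individual mistakes")},  # localize
}

edge = ("bad outcomes", "the system")
for name, edges in views.items():
    verb = "connects" if edge in edges else "separates"
    print(f"the {name} argument {verb} '{edge[0]}' and '{edge[1]}'")
```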
Arguments: conclusion
I think most of us are at level 1 in argumentation: we throw arguments at each other like angry cavemen, without studying what an “argument” is and/or what dynamics it creates. If you completely unironically think that “stupid arguments” exist, then you’re probably on the 1st level. Professional philosophers are at level 2 at best, but usually lower (they are surprisingly judgemental). At least they are somewhat forced to be tolerant of the most diverse types of arguments due to their profession.
On what level are you? Have you studied arguments without judgement?
2. Understanding/empathy
I think there are three levels in understanding your opponent:
You study a specific description (X) of your opponent’s opinion. You can pass the Ideological Turing Test in a superficial way. Like a parrot.
You study types of descriptions of your opponent’s opinion. (“Qualities” of your opponent’s opinion.) You can “inhabit” the emotions/mindset of your opponent.
You study types of changes (D): how the description of your opponent’s opinion changes/get changed by some new thing Y. D and Y need to be important even outside of debates.
For example, Y can be “copies of the same thing” and D can be “transformations of copies into each other”. Such Y and D are important even outside of debates.
So, on the 3rd level you may be able to describe the opponent’s position as a weaker version/copy of your own position (Y), and clearly imagine how your position could turn out to be “the weaker version/copy” of the opponent’s position. You can imagine how the opponent’s opinion transforms into truth and your opinion transforms into a falsehood (D).
Other interesting choices of Y and D are possible. For example, Y can be “complexity of the opinion [in a given context]”; D can be “choice of the context” and “increasing/decreasing of complexity”. You can run the opinion of your opponent through different contexts and see how much it reacts to/accommodates the complexity of the world.
Empathy: conclusion
I think people very rarely do the 3rd level of empathy.
Doing it systematically would lead to a new political/epistemological paradigm.
Exploring philosophy
1. Beliefs and ontology
I think there are three levels of studying the connection between beliefs and ontology:
You think you can see the truth of a belief directly. For example, you can say “all beliefs which describe reality in a literal way are true”. You get stuff like Naïve Realism. “Reality is real.”
You study types of beliefs. You can say that all beliefs of a certain type are true. For example, “all mathematical beliefs are true”. You get stuff like Mathematical Universe Hypothesis, Platonism, Ontic Structural Realism… “Some description of reality is real.”
You study types of changes (D): how beliefs change/get changed by some new thing Y. You get stuff like Berkeley’s subjective idealism and radical probabilism and Bayesian epistemology: the world of changing ideas. “Some changing description of reality is real.”
What can D and Y be? Both things need to be important even outside of the context of explicit beliefs. A couple of versions:
Y can be “semantic connections”. D can be “connecting/separating [semantic connections]”. Both things are generally important, for example in linguistics, in studying semantic change. We get Berkeley’s idealism.
Y can be “probability mass” or some abstract “weight”. D can be “distribution of the mass/weight”. We get probabilism/Bayesianism.
Thinking at the level of semantic connections should be natural to people, because they use natural language and… neural nets in their brains! (Berkeley makes a similar argument: “hey, folks, this is just common sense!”) And yet this idea is extremely alien to people epistemology-wise and ontology-wise. I think the true potential of the 3rd level remains unexplored.
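(The second pair above, Y = “probability mass” and D = “redistribution of the mass”, is just Bayes’ rule. A minimal worked example, with made-up numbers:)

```python
# Y = probability mass, D = redistribution: evidence moves the mass
# between hypotheses without creating or destroying any of it.

priors     = {"h1": 0.5, "h2": 0.3, "h3": 0.2}
likelihood = {"h1": 0.9, "h2": 0.1, "h3": 0.5}  # P(evidence | h)

unnormalized = {h: priors[h] * likelihood[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: round(p / total, 3) for h, p in unnormalized.items()}
print(posterior)  # {'h1': 0.776, 'h2': 0.052, 'h3': 0.172}
```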
Beliefs: conclusion
I think most rationalists (Bayesians, LessWrong people) are “confused” between the 2nd level and the 1st level, even though they have some 3rd level tools.
Eliezer Yudkowsky is “confused” between the 1st level and the 3rd level: he likes level 1 ideas (e.g. “map is not the territory”), but has a bunch of level 3 ideas (“some maps are the territory”) about math, probability, ethics, decision theory, Security Mindset...
2. Ontology and reality
I think there are three levels of exploring the relationship between ontologies and reality:
You think that an ontology describes the essence of reality.
You study how different ontologies describe different aspects of reality.
You study types of changes (D): how ontologies change/get changed by some other concept Y. D and Y need to be important even outside of the topic of (pure) ontology.
Y can be “human minds” or simply “objects”. D can be “matching/not matching” or “creating a structure” (two very basic, but generally important processes). You get Kant’s “Copernican revolution” (reality needs to match your basic ontology, otherwise information won’t reach your mind: but there are different types of “matching” and transcendental idealism defines one of the most complicated ones) and Ontic Structural Realism (ontology is not about things, it’s about structures created by things) respectively.
On what level are you? Have you studied ontologies/epistemologies without judgement? What are the most interesting ontologies/epistemologies you can think of?
3. Philosophy overall
I think there are three levels of doing philosophy in general:
You try to directly prove an idea in philosophy using specific philosophical tools.
You study types of philosophical ideas.
You study types of changes (D): how philosophical ideas change/get changed by some other thing Y. D and Y need to be important even outside of (pure) philosophy.
To give a bunch of examples, Y can be:
Society and ethical implications. See Social Ontology, Social Epistemology
The full potential of human imagination—and the reality’s weirdness. See Immanuel Kant
Forming and resolving conflicts and contradictions. See Hegelian dialectic
Evolving contexts. See Postmodernism
Language and games. See Ludwig Wittgenstein
The best predictions about low-level stuff. See Bayesian epistemology
Semantic connections. (my weak philosophical attempts are here!)
Subjective experience (qualia).
I think people did a lot of 3rd level philosophy, but we haven’t fully committed to the 3rd level yet. We are used to treating philosophy as a closed system, even when we make significant steps outside of that paradigm.
Exploring ethics
1. Commitment to values
I think there are three levels of values:
Subjective values. You care only about things inside of your mind. For example, do you feel good or not?
Real values. You treat your values as particular objects in reality.
Semantic values. You care about types of changes (D): how your values change/get changed by reality (Y). Your value can be expressed as a combination of the three components: “a real thing + its meaning + changes”.
Example of a semantic value: you care about your friendship with someone. You will try to preserve the friendship, but in a limited way: you accept that one day the relationship may end naturally (your value may “die” a natural death). Semantic values are temporal and path-dependent. Semantic values are like games embedded in reality: you want to win the game without breaking the rules.
2. Ethics
I think there are three levels of analyzing ethics:
You analyze norms of specific communities and desires of specific people. That’s quite easy: you are just learning facts.
You analyze types of norms and desires. You are lost in contradictory implications, interpretations and generalizations of people’s values. You have a meta-ethical paralysis.
You study types of changes (D): how norms and desires change/get changed by some other thing Y. D and Y need to be important even outside of (purely) ethical context.
Ethics: tasks and games
For example, Y can be “tasks, games, activities” and D can be “breaking/creating symmetries”. You can study how norms and desires affect properties of particular activities.
Let’s imagine an Artificial Intelligence or a genie who fulfills our requests (it’s a “game” between us). We can analyze how bad actions of the genie can break important symmetries of the game. Let’s say we asked it to make us a cup of coffee:
If it killed us after making the coffee, we can’t continue the game. And we ended up with less than we had before. And we wouldn’t have made the request if we had known that was going to happen. And the game can’t be “reversed”: the players are dead.
If it took us under mind control, we can’t affect the game anymore (and it gained 100% control over the game). If it placed us into a delusion, then the state of the game can be arbitrarily affected (by dissolving the illusion). And the state of the game depends on perspective.
If it made us addicted to coffee, we can’t stop or change the game anymore. And the AI/genie drastically changed the nature of the game without our consent. It changed how the “coffee game” relates to all other games, skewed the “hierarchy of games”.
Those are all “symmetry breaks”. And such symmetry breaks are bad in most of the tasks.
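(These checks are mechanical enough to write down. A toy sketch; the three flags mirror the three bullet points above, and none of this is a real alignment proposal:)

```python
from dataclasses import dataclass

@dataclass
class GameState:
    players_alive: bool       # can the game continue at all?
    players_in_control: bool  # can the players still affect/stop the game?
    reversible: bool          # can the move be undone?

def symmetry_breaks(state: GameState) -> list:
    breaks = []
    if not state.players_alive:
        breaks.append("the game cannot continue")
    if not state.players_in_control:
        breaks.append("the players lost control over the game")
    if not state.reversible:
        breaks.append("the move cannot be reversed")
    return breaks

print(symmetry_breaks(GameState(True, True, True)))    # plain coffee: []
print(symmetry_breaks(GameState(False, False, False))) # it killed us
print(symmetry_breaks(GameState(True, False, False)))  # mind control, addiction
```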
Ethics: Categorical Imperative
With Categorical Imperative, Kant explored a different choice of Y and D. Now Y is “roles of people”, “society” and “concepts”; D is “universalization” and “becoming incoherent/coherent” and other things.
Ethics: Preferences
If Y is “preferences” and D is “averaging”, we get Preference utilitarianism. (Preferences are important even outside of ethics and “averaging” is important everywhere.) But this idea is too “low-level” to use in analysis of ethics.
However, if Y is “versions of an abstract preference” and D is “splitting a preference into versions” and “averaging”, then we get a high-level analog of preference utilitarianism. For example, you can take an abstract value such as Bodily autonomy and try to analyze the entirety of human ethics as an average of versions (specifications) of this abstract value.
Preference utilitarianism reduces ethics to an average of micro-values, the idea above reduces ethics to an average of a macro-value.
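(A toy numeric sketch of this “macro-value” reduction. Every number is made up: each row is one specification, i.e. version, of “bodily autonomy”, and the verdict on an action is the average over the versions.)

```python
# High-level analog of preference utilitarianism: average the versions
# (specifications) of one abstract macro-value. All numbers are made up.
# 1.0 = the specification is fully respected, 0.0 = maximally violated.

action = "medical procedure without consent"
bodily_autonomy_versions = {
    "freedom from physical interference": 0.1,
    "control over one's own body":        0.0,
    "right to refuse treatment":          0.0,
    "freedom of movement":                0.6,
}
verdict = sum(bodily_autonomy_versions.values()) / len(bodily_autonomy_versions)
print(f"{action}: {verdict:.2f}")  # 0.17 -- strongly disfavored
```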
Ethics: conclusion
So, what’s the point of the 3rd level of analyzing ethics? The point is to find objective sub-structures in ethics where you can apply deduction to exclude the most “obviously awful” and “maximally controversial and irreversible” actions. The point is to “derive” ethics from much more broad topics, such as “meaningful games” and “meaningful tasks” and “coherence of concepts”.
I think:
Moral philosophers and Alignment researchers are ignoring the 3rd level. People are severely underestimating how much they know about ethics.
Acknowledging the 3rd level doesn’t immediately solve Alignment, but it can “solve” ethics or the discourse around ethics. Empirically: just study properties of tasks and games and concepts!
Eliezer Yudkowsky has a limited 3rd level understanding of meta-ethics (“Abstracted Idealized Dynamics”, “Morality as Fixed Computation”, “The Bedrock of Fairness”), but misses that he could make his idea broader.
Particularism (in ethics and reasoning in general) could lead to the 3rd level understanding of ethics.
Exploring perception
1. Properties
There are three levels of looking at properties of objects:
Inherent properties. You treat objects as having more or less inherent properties. E.g. “this person is inherently smart”
Meta-properties. You treat any property as universal. E.g. “anyone is smart under some definition of smartness”
Semantic properties. You treat properties only as relatively attached to objects. You focus on types of changes (D): how properties and their interpretations change/get changed by some other thing Y. You “reduce” properties to D and Y. E.g. “anyone can be a genius or a fool under certain important conditions” or “everyone is smart, but in a unique and important way”
2. Commitment to experiences and knowledge
I think there are three levels of commitment to experiences:
You’re interested in particular experiences.
You want to explore all possible experiences.
You’re interested in types of changes (D): how your experience changes/get changed by some other thing Y. D and Y need to be important even outside of experience.
So, on the 3rd level you care about interesting ways (D) in which experiences correspond to reality (Y).
3. Experience and morality
I think there are three levels of investigating the connection between experience and morality:
You study how experience causes us to do good or bad things.
You study all the different experiences “goodness” and “badness” causes in us.
You study types of changes (D): how your experience changes/get changed by some other thing Y. D and Y need to be important even outside of experience. But related to morality anyway.
For example, Y can be “[basic] properties of concepts” and D can be “matches / mismatches [between concepts and actions towards them]”. You can study how experience affects properties of concepts which in turn bias actions. An example of such analysis: “loving a sentient being feels fundamentally different from eating a sandwich. food taste is something short and intense, but love can be eternal and calm. this difference helps to not treat other sentient beings as something disposable”
I think the existence of the 3rd level isn’t acknowledged much. Most versions of moral sentimentalism are 2nd level at best. Epistemic Sentimentalism can be 3rd level in the best case.
Exploring cognition
1. Patterns
I think there are three levels of [studying] patterns:
You study particular patterns (X). You treat patterns as objective configurations in reality.
You study all possible patterns. You treat patterns as subjective qualities of information, because most patterns are fake.
You study types of changes (D): how patterns change/get changed by some other thing Y. D and Y need to be important even outside of (explicit) pattern analysis. You treat a pattern as a combination of the three components: “X + Y + D”.
For example, Y can be “pieces of information” or “contexts”: you can study how patterns get discarded or redefined (D) when new information gets revealed/new contexts get considered.
You can study patterns which are “objective”, but exist only in a limited context. For example, think about your friend’s bright personality (personality = a pattern). It’s an “objective” pattern, and yet it exists only in a limited context: the pattern would dissolve if you compared your friend to all possible people. Or if you saw your friend in all possible situations they could end up in. Your friend’s personality has some basis in reality (X), has a limited domain of existence (Y) and the potential for change (D).
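(A toy sketch of the friend example, assuming “curiosity” can be scored on a 0-1 scale. The numbers and the stand-out test are my own illustration: the pattern “my friend is the most curious person around” holds in a small context and dissolves in a large one.)

```python
import random
random.seed(0)  # deterministic toy data

friend_curiosity = 0.9

def pattern_holds(context):
    # "My friend is the most curious person around."
    return all(friend_curiosity > person for person in context)

for context_size in (5, 100, 100_000):
    context = [random.random() for _ in range(context_size)]
    status = "holds" if pattern_holds(context) else "dissolves"
    print(f"context of {context_size} people: the pattern {status}")
```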
2. Patterns and causality
I think there are three levels in the relationship between patterns and causality. I’m going to give examples about visual patterns:
You learn which patterns are impossible due to local causal processes. For example: “I’m unlikely to see a big tower made of eggs standing on top of each other”. It’s just not a stable situation due to very familiar laws of physics.
You learn statistical patterns (correlations) which can have almost nothing to do with causality. For example: “people like to wear grey shirts”.
You learn types of changes (D): how patterns change/get changed by some other thing Y. D and Y need to be important even outside of (explicit) pattern analysis. And related to causality.
Y can be “basic properties of images” and “basic properties of patterns”; D can be “sharing properties” and “keeping the complexity the same”. In simpler words:
On the 3rd level you learn patterns which have strong connections to other patterns and basic properties of images. You could say such patterns are created/prevented by “global” causal processes. For example: “I’m unlikely to see a place fully filled with dogs. dogs are not people or birds or insects, they don’t create such crowds or hordes”. This is very abstract, connects to other patterns and basic properties of images.
Causality: implications for Machine Learning
I think...
It’s likely that Machine Learning models don’t learn 3rd level patterns as well as they could, as sharply as they could.
Machine Learning models should be 100% able to learn 3rd level patterns. It shouldn’t require any specific data.
Learning/comparing level 3 patterns is interesting enough on its own. It could be its own area of research. But we don’t apply statistics/Machine Learning to try to mine those patterns. This may be a missed opportunity for humans.
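(One way to read “mining those patterns”, sketched on synthetic data; the scene generator and its numbers are invented for illustration. The “dogs don’t form crowds” pattern from above is a constraint on counts per scene, i.e. a statistic linking a category to a basic property of images.)

```python
import random
random.seed(1)

# "Dogs don't form crowds" as a count statistic: how many instances of
# a category co-occur in one scene? The scenes below are synthetic.

def synthetic_scene():
    return {"person": random.choice([0, 1, 2, 5, 40]),  # crowds happen
            "dog":    random.choice([0, 0, 1, 1, 2])}   # crowds don't

scenes = [synthetic_scene() for _ in range(10_000)]
for category in ("person", "dog"):
    max_count = max(scene[category] for scene in scenes)
    print(f"max {category}s in one scene: {max_count}")
# A model that captures this statistic has learned something "global"
# about the category, beyond local shapes and textures.
```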
3. Cognitive processes
Suppose you want to study different cognitive processes, skills, types of knowledge. There are three levels:
You study particular cognitive processes.
You study types (qualities) of cognitive processes. And types of types (classifications).
You study types of changes (D): how cognitive processes change/get changed by some other thing Y. D and Y need to be important even without the context of cognitive processes.
For example, Y can be “fundamental configurations / fundamental objects”^(1) and D can be “finding a fundamental configuration/object in a given domain”. You can “reduce” different cognitive processes to those Y and D (the names of the processes below shouldn’t be taken 100% literally):
^(1 “fundamental” means “VERY widespread in a certain domain”)
Causal reasoning learns fundamental configurations of fundamental objects in the real world. So you can learn stuff like “this abstract rule applies to most objects in the world”.
Symbolic reasoning learns fundamental configurations of fundamental objects in your “concept space”. So you can learn stuff like “‘concept A containing concept B’ is an important pattern” (see set relations).
Correlational reasoning learns specific configurations of specific objects.
Mathematical reasoning learns specific configurations of fundamental objects. So you can build arbitrary structures with abstract building blocks.
Self-aware reasoning can transform fundamental objects into specific objects. So you can think thoughts like, for example, “maybe I’m just a random person with random opinions” (you consider your perspective as non-fundamental) or “maybe the reality is not what it seems”.
I know, this looks “funny”, but I think all of this could be formalized easily enough. Isn’t that a natural way to study types of reasoning? Just ask what knowledge a certain type of reasoning learns!
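(A minimal sketch of that formalization. The 2×2 encoding and the names are mine: each type of reasoning picks whether its configurations and its objects are “fundamental” or “specific”, and self-aware reasoning demotes fundamental objects to specific ones.)

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reasoning:
    configurations: str  # "fundamental" or "specific"
    objects: str         # "fundamental" or "specific"
    domain: str

TYPES = {
    "causal":        Reasoning("fundamental", "fundamental", "the real world"),
    "symbolic":      Reasoning("fundamental", "fundamental", "concept space"),
    "correlational": Reasoning("specific",    "specific",    "the real world"),
    "mathematical":  Reasoning("specific",    "fundamental", "abstract structures"),
}

def self_aware(r: Reasoning) -> Reasoning:
    # "Maybe I'm just a random person": a fundamental object (my own
    # perspective) gets re-viewed as a specific, non-fundamental one.
    return Reasoning(r.configurations, "specific", r.domain)

print(self_aware(TYPES["causal"]))
```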
Exploring theories
1. Science
I think there are three ways of doing science:
You predict a specific phenomenon.
You study types of phenomena. (qualities of phenomena)
You study types of changes (D): how the phenomenon changes/get changed by some other thing Y. D and Y need to be important even outside of this phenomenon.
Imagine you want to explain combustion (why/how things burn):
You try to predict combustion. This doesn’t work, because you already know “everything” about burning and there are many possible theories. You end up making things up because there’s not enough new data.
You try to compare combustion to other phenomena. You end up fantasizing about imaginary qualities of the phenomenon. At this level you get something like theories of “classical elements” (fantasies about superficial similarities).
You find or postulate a new thing (Y) which affects/gets affected (D) by combustion. Y and D need to be important in many other phenomena. If Y is “types of matter” and D is “releasing / absorbing”, this gives you Phlogiston theory. If Y is “any matter” and D is “conservation of mass” and “any transformations of matter”, you get Lavoisier’s theory. If Y is “small pieces of matter (atoms)” and D is “atoms hitting each other”, you get Kinetic theory of gases.
So, I think phlogiston theory was a step in the right direction, but it failed because the choice of Y and D wasn’t abstract enough.
I think most significant scientific breakthroughs require level 3 ideas. Partially “by definition”: if a breakthrough is not “level 3”, then it means it’s contained in a (very) specific part of reality.
2. Math
I think there are three ways of doing math:
You explore specific mathematical structures.
You explore types of mathematical structures. And types of types. And typologies. At this level you may get something like Category theory.
You study types of changes (D): how equations change/get changed by some other thing Y. D and Y need to be important even outside of (explicit) math.
Mathematico-philosophical insights
Let’s look at math through the lens of the 3rd level:
Let Y be “infinitely small building blocks” and “infinitely diminishing building blocks”; let D be “becoming infinitely small” and “reaching the limit”. Those Y and D matter even outside of math. We got Calculus.
Let Y be “quasi-physical materials” and D be “stretching, bending etc.”. Those Y and D matter even outside of math. We got Topology.
Let Y be “probability”. That was a completely new concept in all domains of knowledge. We got Probability theory.
Let Y be “different scales” and “different parts”; let D be “(not) repeating”. We got Fractals and Recursion.
Let Y be “directed things” and D be “compositions of movements”. We got Vectors.
Let Y be “things that do basic stuff” and D be “doing sequences of basic stuff”. We got Theory of computation and Computational complexity theory.
Let Y be “games” and “utilities”. We got Utility theory and St. Petersburg paradox, Game theory… and even a new number system (“surreal numbers”).
Let Y be “sets” and D be “basic set relationships”. Those ideas are important in all areas of knowledge. We got Set theory.
Let Y be “infinity” and D be “counting” and “making sets”. Those are philosophically important things. We got actual infinity, Hilbert’s Hotel, Cantor’s diagonal argument, Absolute Infinite and The Burali-Forti paradox...
All concepts above are “3rd level”. But we can classify them, creating new three levels of exploration (yes, this is recursion!). Let’s do this. I think there are three levels of mathematico-philosophical concepts:
Concepts that change the properties of things we count. (e.g. topology, fractals, graph theory)
Concepts that change the meaning of counting. (e.g. probability, computation, utility, sets, group theory, Gödel’s incompleteness theorems and Tarski’s undefinability theorem)
Concepts that change the essence of counting. (e.g. Calculus, vectors, probability, actual infinity, fractal dimensions)
So, Calculus is really “the king of kings” and “the insight of insights”. 3rd level of the 3rd level.
3. Physico-philosophical insights
I would classify physico-philosophical concepts as follows:
Concepts that change the way movement affects itself. E.g. Net force, Wave mechanics, Huygens–Fresnel principle
Concepts that change the “meaning” of movement. E.g. the idea of reference frames (principles of relativity), curved spacetime (General Relativity), the idea of “physical fields” (classical electromagnetism), conservation laws and symmetries, predictability of physical systems.
Concepts that change the “essence” of movement, the way movement relates to basic logical categories. E.g. properties of physical laws and theories (Complementarity; AdS/CFT correspondence), the beginning/existence of movement (cosmogony, “why is there something rather than nothing?”, Mathematical universe hypothesis), the relationship between movement and infinity (Supertasks) and computation/complexity, the way “possibility” spreads/gets created (Quantum mechanics, Anthropic principle), the way “relativity” gets created (Mach’s principle), the absolute mismatch between perception and the true nature of reality (General Relativity, Quantum Mechanics), the nature of qualia and consciousness (Hard problem of consciousness), the possibility of Theory of everything and the question “how far can you take [ontological] reductionism?”, the nature of causality and determinism, the existence of space and time and matter and their most basic properties, interpretation of physical theories (interpretations of quantum mechanics).
Exploring meta ideas
To define “meta ideas” we need to think about many pairs of “Y, D” simultaneously. This is the most speculative part of the post. Remember, you can treat those speculations simply as sci-fi ideas.
Each pair of abstract concepts (Y, D) defines a “language” for describing reality. And there’s a meta-language which connects all those languages. Or rather, there are many meta-languages. Each meta-language can be described by a pair of abstract concepts too (Y, D).
^(Instead of “languages” I could use the word “models”. But I wanted to highlight that those “models” don’t have to be formal in any way.)
I think the idea of “meta-languages” can be used to analyze:
Consciousness. You can say that consciousness is “made of” multiple abstract interacting languages. On one hand it’s just a trivial description of consciousness, on another hand it might have deeper implications.
Qualia. You can say that qualia is “made of” multiple abstract interacting languages. On one hand this is a trivial idea (“qualia is the sum of your associations”), on another hand this formulation adds important specific details.
The ontology of reality. You can argue that our ways to describe reality (“physical things” vs. purely mathematical concepts, subjective experience vs. physical world, high-level patterns vs. complete reductionism, physical theory vs. philosophical ontology) all conflict with each other and lead to paradoxes when taken to the extreme, but can’t exist without each other. Maybe they are all intertwined?
Meta-ethics. You can argue that concepts like “goodness” and “justice” can’t be reduced to any single type of definition. So, you can try to reduce them to a synthesis of many abstract languages. See G. E. Moore’s ideas about indefinability: the naturalistic fallacy, the open-question argument.
According to the framework, ideas about “meta-languages” define the limit of conceivable ideas.
If you think about it, it’s actually a quite trivial statement: “meta-models” (consisting of many normal models) are the limit of conceivable models. Your entire conscious mind is such a “meta-model”. If no model works for describing something, then a “meta-model” is your last resort. On one hand “meta-models” is a very trivial idea^(1), on another hand nobody ever cared to explore the full potential of the idea.
^(1 for example, we have a “meta-model” of physics: a combination of two wrong theories, General Relativity and Quantum Mechanics.)
Nature of percepts
I talked about qualia in general. Now I just want to throw out my idea about the nature of particular percepts.
There are theories and concepts which link percepts to “possible actions” and “intentions”: see Affordance. I like such ideas, because I like to think about types of actions.
So I have a variation of this idea: I think that any percept gets created by an abstract dynamic (Y, D) or many abstract dynamics. Any (important) percept corresponds to a unique dynamic. I think abstract dynamics bind concepts.
^(But I have only started to think about this. I share it anyway because I think it follows from all the other ideas.)
P.S.
Thank you for reading this.
If you want to discuss the idea, please focus on the idea itself and its particular applications. Or on exploring particular topics!