Unfortunately I accidentally looked at some of the other answers first, but I think my answer would have been this anyway:
The fact “2+2=4” is a fact about the products of certain systems of thought and computation. It is also an abstract description of a certain property of objects in reality (namely, that taking two sets of two objects and combining the sets results in a set of four objects), at least at the classical level.
There seem to be all sorts of reasons that our distant ancestors’ development of a number sense was useful enough to be evolutionarily favored. Do we still have everyone in the tribe? Do we outnumber the enemies? How many predators were chasing me? Are these all of my children? A number sense that told us “2+2=3” could be quite maladaptive.
I see it as being in the causal graph—not as a node, but as an arrow (actually, a whole class of arrows). If I have two stones, and I put two more stones with them, then this will cause me to have four stones. Note that this doesn’t apply in all cases—if I have two piles of sand and I put them together with two more piles of sand, the result is one really big pile of sand and not four piles—but it applies in enough cases that the cases in which it does not apply can be considered exceptions for various reasons.
The problem with this is that something can both be caused and appear in a lot of causal interactions. For example, if I launch a giant mirror into space to block out the sun, all of the arrows from the sun to brightness everywhere have to be changed. In AI, this is often represented using plate notation, where a rectangle (the plate) is drawn around a group of variables that repeat and an arrow from outside the plate affects every instance of the variables in the plate.
‘2+2=4’ can be causally linked to reality. If you take 2 objects, and add 2 others, you’ve got 4, and this can be mapped back to the concept of ‘2+2=4’. Computers, and your brain, do it all the time.
This argument falls when we start talking about things which don’t seem to actually exist, like fractions when talking about indivisible particles. But numbers can be mapped to many things (that’s what abstracting things tends to do), so even though fractions don’t exist in that particular case, they do when talking about pies, so fractions can be mapped back to reality.
But this second argument seems to fall when talking about things like infinities, which can’t be mapped back to reality, as far as I know (maybe when talking about the number of points in a distance?). But in that case, we are just extrapolating rules which we have already mapped from the universe into our models. We know how the laws of physics work, so when we see the spaceship going off into the distance, where we’ll never be able to interact with it again, we know it’s still there, because we are extrapolating the laws of physics to outside the observable universe. Likewise, when confronted with infinity, mathematicians extrapolated certain known rules, and from that inferred properties about infinities, and because their rules were correct, whenever computations involving infinities were resolved to more manageable numbers, they were consistent with everything else.
So our representations of numbers are a map of the territory (actually, many territories, because numbers are abstract).
Actually, if you think of it as affecting us, but not being affected by us, then, in EY’s words, mathematics is higher. We would be “shadows” influenced by the higher tier, but unable to affect it.
But I don’t really think this line of reasoning leads anywhere.
Meditation:
If we can only meaningfully talk about parts of the universe that can be pinned down inside the causal graph, where do we find the fact that 2 + 2 = 4? Or did I just make a meaningless noise, there? Or if you claim that “2 + 2 = 4” isn’t meaningful or true, then what alternate property does the sentence “2 + 2 = 4” have which makes it so much more useful than the sentence “2 + 2 = 3”?
PA proves “2 + 2 = 4” using the associative property. PA does not prove “2 + 2 = 3”. “2 + 2 = 4” is actually shorthand for “((1+1) + (1+1)) = (((1+1)+1)+1)”. Moving stuff next to other stuff in our universe happens to follow the associative property; this is why the belief is useful.
I have myself usually seen Peano arithmetic described with 0 and the successor operation (such as in the context of actually implementing it in a computer). In this case, “2 + 2 = 4” is S(S(0)) + S(S(0)) = S(S(S(0))) + S(0) = S(S(S(S(0)))) + 0 = S(S(S(S(0)))),
where the two theorems needed are that x + S(y) = S(x) + y and that x + 0 = x. I find this to have less incidental complexity (given that we are interested in working up from axioms, not down from conventional arithmetic) perhaps because the tree of the final expression has no branches. The first theorem can be looked at as expressing that “moving stuff results in the same stuff”, i.e. a conservation law; note that the expression has precisely the same number of nodes.
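A minimal Python sketch of the successor-style addition described above. The names Zero, S, add and to_int are illustrative assumptions, not taken from any library. The add function applies x + S(y) = S(x) + y until the second argument is 0, at which point x + 0 = x gives the answer, and it records each intermediate pair so the "conservation" of successor nodes can be inspected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    """The numeral 0."""


@dataclass(frozen=True)
class S:
    """The successor of another numeral."""
    pred: object


def add(x, y):
    """Add two numerals, returning the sum and the list of rewrite steps."""
    steps = [(x, y)]
    # x + S(y) = S(x) + y: move one successor from the right to the left.
    while isinstance(y, S):
        x, y = S(x), y.pred
        steps.append((x, y))
    # x + 0 = x: the right argument is now Zero, so x is the sum.
    return x, steps


def to_int(n):
    """Count the successor nodes in a numeral (for display only)."""
    count = 0
    while isinstance(n, S):
        count += 1
        n = n.pred
    return count


two = S(S(Zero()))
four, steps = add(two, two)
node_counts = [to_int(a) + to_int(b) for a, b in steps]
```

Here steps holds the pairs (2, 2), (3, 1), (4, 0), and node_counts shows that the total number of successor nodes stays the same at every step.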
(I like that! The idea that it follows just from the associative property and no other features of PA is quite elegant.)
This post feels surprisingly insight-bringing to me because the “associative property” can in a sense be considered the “insignificance of parentheses”… and hence both the insignificance of groupings, and the lack of the need to define a starting point of calculation… and in turn these concepts feel connected to the concepts of reductionism and relativity...
I would go further and say that without the associative property the concept of numbers, math, and “2 + 2 = 4” does not make sense.
Octonion multiplication is not associative. Exponentiation isn’t either; (2^2)^3 = 4^3 = 64, 2^(2^3) = 2^8 = 256. There’s likely some kind of useful math with numberlike objects where no interesting operation is associative.
How can this math refer to the real world without the associative property? If you can’t count the initial “2” then you can’t multiply it or anything else, right? Plus, how could we arrive at a concept of exponentiation that didn’t entail the concept of the associative property?
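The exponentiation counterexample above can be checked mechanically. This is only a brute-force sketch over a small sample of integers; the helper name is_associative_on is made up for illustration, and passing the check on a sample does not prove associativity in general.

```python
from itertools import product
import operator


def is_associative_on(op, values):
    """Return True if (a op b) op c == a op (b op c) for every triple in values."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(values, repeat=3)
    )


sample = range(1, 5)

addition_ok = is_associative_on(operator.add, sample)  # no counterexample in the sample
power_ok = is_associative_on(operator.pow, sample)     # fails, e.g. on (2, 2, 3)

left = (2 ** 2) ** 3   # 4 ** 3
right = 2 ** (2 ** 3)  # 2 ** 8
```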
The pure math might just be too much of an inferential leap for me. I need to see how the math would be created from observations in the real world before I can really understand what you are saying.
Well, addition of positive integers is associative and has an obvious real-world analogue, so the associophobic math isn’t a good choice for describing reality. But if you lived in Bejeweled, addition wouldn’t make much sense as a concept—sometimes pushing a thing close to another thing yields two things, sometimes zero. The most fundamental operation would be “flip a pair of adjacent things”, which is not associative. (It’s sort of a transposition, which would give you group theory, which is full of associative operations, but I don’t think you can factor in disappearing rows while preserving associativity—it destroys the bijectivity.)
Cool, great example.
2+2=4 isn’t a cause. It’s a tautological description. Describing things is useful, though.
I agree...PA was invented based on our observations; our observations aren’t just magically predicted by some arbitrary set of rules. PA has only existed since 1889; reality had existed long before that.
I think this example brings out how Pearlian causality differs from other causal theories. For instance, in a counterfactual theory of causation, since the negation of a mathematical truth is impossible, we can’t meaningfully think of mathematical truths as causes.
But in Pearlian causality it seems that mathematical statements can have causal relations, since we can factor our uncertainty about them, just as we can other statements. I think endoself’s comment argues this well. I would add that this is a good example of how causation can be subjective. Before 1984, the Taniyama-Shimura-Weil conjecture and Fermat’s last theorem existed as conjectures, and some mathematicians presumably knew about both, but as far as I know they had no clue that they were related. Then Frey conjectured and Ribet proved that the TSW conjecture implies FLT. Then mathematicians’ uncertainty was such that they would have causal graphs with TSW causing FLT. Now we have a proof of TSW (mostly by Wiles) but any residual uncertainty is still correlated. In the future, maybe there will be many independent proofs of each, and whatever uncertainty is left about them will be (nearly) uncorrelated.
I also think there can be causal relations between mathematical statements and statements about the world. For instance, maybe there is some conjecture of fluid dynamics, which if true would cause us to believe a certain type of wave can occur in certain circumstances. We can make inferences both ways, for instance, if we observe the wave we might increase our credence in the conjecture, and if we prove the conjecture, we might believe the wave can be observed somehow. But it seems that the causal graph would have the conjecture causing the wave. Part of the graph would be:
[Proof of conjecture -> conjecture -> wave <- (fluid dynamics applies to water) ]
[Proof of conjecture ← conjecture → wave ← (fluid dynamics applies to water) ]
Well the direction of the arrow would be unspecified. After all, not FLT implies not TSW is equivalent to TSW implies FLT, so there’s a symmetry here. This often happens in causal modelling; many causal discovery algorithms can output that they know an arrow exists, but they are unable to determine its direction.
Also, conjectures are the causes of their proofs rather than vice versa. You can see this as your degrees of belief in the correctness of purported proofs are independent given that the conjecture is true (or false), but dependent when the truth-value of the conjecture is unknown.
Apart from this detail, I agree with your comment and I find it to be similar to the way I think about the causal structure of math.
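The common-cause claim above (purported proofs independent given the conjecture's truth value, dependent when it is unknown) can be illustrated with a toy joint distribution. Everything numeric here is a hypothetical assumption chosen for illustration: a prior of 1/2 on the conjecture C, and each purported proof being correct with probability 3/5 when C is true and 0 when C is false, with no other links between the proofs.

```python
from fractions import Fraction
from itertools import product

# Hypothetical numbers, for illustration only.
P_C = Fraction(1, 2)                                   # prior that conjecture C is true
P_VALID = {True: Fraction(3, 5), False: Fraction(0)}   # P(proof correct | C)


def joint(c, s1, s2):
    """P(C=c, S1=s1, S2=s2) for the graph S1 <- C -> S2."""
    p = P_C if c else 1 - P_C
    for s in (s1, s2):
        q = P_VALID[c]
        p *= q if s else 1 - q
    return p


def prob(event):
    """Total probability of the worlds (c, s1, s2) where event holds."""
    worlds = product([True, False], repeat=3)
    return sum(joint(*w) for w in worlds if event(*w))


# With C unknown, learning that S2 is correct raises belief in S1.
p_s1 = prob(lambda c, s1, s2: s1)
p_s1_given_s2 = prob(lambda c, s1, s2: s1 and s2) / prob(lambda c, s1, s2: s2)

# With C known true, S2 carries no further information about S1.
p_s1_given_c = prob(lambda c, s1, s2: c and s1) / prob(lambda c, s1, s2: c)
p_s1_given_c_s2 = (prob(lambda c, s1, s2: c and s1 and s2)
                   / prob(lambda c, s1, s2: c and s2))
```

Under these assumed numbers, p_s1 is 3/10 while p_s1_given_s2 is 3/5, and the two conditional values given C coincide.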
This is very different from how I think about it. Could you expand a little? What do you mean by “when the truth-value of the conjecture is unknown”? That neither C nor ¬C is in your bounded agent’s store of known theorems?
Let S1, S2 be purported single-conclusion proofs of a statement C.
If I know C is false, the purported proofs are trivially independent because they’re fully determined incorrect?
Why is S1 independent of S2 given C is true? Are you saying that learning S2⊢C puts C in our theorem bank, and knowing C is true can change our estimation that S1⊢C, but proofs aren’t otherwise mutually informative? If so, what is the effect of learning ⊨C on P(S1⊢C)? And why don’t you consider proofs which, say, only differ after the first n steps to be dependent, even given the truth of their shared conclusion?
I meant that the agent is in some state of uncertainty. I’m trying to contrast the case where we are more certain of either C or ¬C with that where we have a significant degree of uncertainty.
Yeah, this is just the trivial case.
I was talking about the simple case where there are no other causal links between the two proofs, like common lemmas or empirical observations. Those do change the causal structure by adding extra nodes and arrows, but I was making the simplifying assumption that we don’t have those things.
Hmm, you are right. Thanks for the correction!
There may be uncertainty about causal relations and about mathematical statements, but that does not mean mathematics is causal.
The transition probabilities on a causal diagram may be less than 1, but that represents levels of subjective confidence (epistemology), not causality per se. You can’t prove that the universe is indeterministic by writing out a diagram.
Yes, you can write out a diagram with transitions indicating logical relationships and probabilities representing subjective confidence. But the nodes aren’t spatio-temporal events, so it isn’t a causal diagram. It is another kind of diagram which happens to have the same structure.
Causal relations hold between events, not statements.
What causes us to believe is evidence, not abstract truth.
The production of a proof, which is a spatio-temporal event, can cause a change in belief-state, which is a spatio-temporal event, which causes changes in behaviour... mathematical truth is not involved. Truth without proof causes nothing. If we don’t have reason to believe in a conjecture, we don’t act on it, even if it is true.
Taboo spatio-temporal. Why is it a good idea to give one category of statements the special name ‘events’ and to reason about them differently than you would reason about other events?
Also, what about Newcomb’s problem?
“Events” aren’t a kind of statement. However a subset of statements is about events. The point of separating them out is that this discussion is about causality, and, uncontentiously, causality links events. If something is a non-event (or a statement is not about an event), that is a good argument for not granting it (or what it is about) causal powers.
What is an event? What properties do events have that statements do not?
Are you a native English speaker?
Yes, I’m trying to get you to reduce the concept rather than take it as primitive. I know what an event is, but I think that the distinction between events and statements is fuzzy, and I think that events are best understood as a subcategory of statements.
Reduce it to what? You’ve already “tabooed” spatio-temporal. I can’t communicate anything to you without some set of common meanings. It’s a cheap trick to complain that someone can’t define something from a basis of shared-nothing, since no one can do that with any term.
The difference is screamingly obvious. Statements are verbal communications. If one asteroid crashes into another, that is an event but not a statement. Statements are events, because they happen at particular places and times, but most events are not statements. You’ve got it the wrong way round.
I meant ‘statement’ in the abstract sense of what is stated rather than things like when it is stated and who it is stated by. ‘Proposition’ has the meanings that I intend without any others, so it would better convey my meaning here.
The point of rationalist taboo is to eliminate all the different phrasings we can use to mention a concept without really understanding it and force us to consider how the phenomenon being discussed actually works. Your wording presumes certain intuitions about what the physical world is and how it should work by virtue of being “physical”, intuitions that are not usually argued for or even noticed. When you say you can’t explain what an “event” or something “spatio-temporal” is without reference to words that really just restate the concept, that is giving a mysterious answer. Things work a certain way, and we can determine how.
I have no idea what “work” means, please explain...
If you are a native English speaker, you will have enough of an understanding of “event” to appreciate my point. You expect me to understand terms like “work” without your going through the process of giving a semantic bedrock definition, beyond the common one.
Newcomb’s problem is an irrelevant-to-everything Waste Of Money Brains And Time, AFAIAC.
It is meaningful to talk about mathematical facts causing other mathematical facts. For example, if I knew the complete laws of physics but did not have enough computing power to determine all their consequences (which would be impossible anyways, as I’m living inside of them), my uncertainty about what is going to happen in the universe would be described by the exact same probability distribution as my uncertainty about the mathematical consequences of the laws of physics, and so both distributions would satisfy the causal Markov condition for the same causal graph* (modulo any uncertainty about whether the laws that I believe to be correct actually do describe the universe).
This works the same way with any other set of mathematical facts. I believe that if the abc conjecture is true, then Szpiro’s conjecture is also true and I believe that if the abc conjecture is false, then Shinichi Mochizuki’s proof of it is flawed. All of these facts can be put into one probability distribution which can then be factored over a Bayesian network. There is no need to separate the mathematical from the nonmathematical.
* Depending on how exactly you phrase the question, I would even say that these distributions are describing my uncertainty about the same thing, but that isn’t necessary here.
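To make the factoring concrete, here is a minimal sketch of the abc example as a three-node Bayesian network. The numbers are made up for illustration, and I am assuming (this is not argued above) that Szpiro’s conjecture and the soundness of the proof are conditionally independent given the truth of abc. The only parts taken from the comment are the two implications: abc true forces Szpiro true, and abc false forces the proof to be flawed.

```python
from itertools import product

# Illustrative credences only; none of these numbers come from the discussion.
P_ABC = 0.9                                    # prior that the abc conjecture is true
P_SZPIRO_GIVEN_ABC = {True: 1.0, False: 0.5}   # abc true implies Szpiro true
P_FLAWED_GIVEN_ABC = {True: 0.3, False: 1.0}   # abc false implies the proof is flawed


def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true


def joint(abc, szpiro, flawed):
    """Joint probability, factored as P(abc) * P(szpiro | abc) * P(flawed | abc)."""
    return (bernoulli(P_ABC, abc)
            * bernoulli(P_SZPIRO_GIVEN_ABC[abc], szpiro)
            * bernoulli(P_FLAWED_GIVEN_ABC[abc], flawed))


def probability(event):
    """Sum the joint over every assignment where `event` holds."""
    return sum(joint(a, s, f)
               for a, s, f in product([True, False], repeat=3)
               if event(a, s, f))


def conditional(event, given):
    """P(event | given), computed by enumeration."""
    both = probability(lambda a, s, f: event(a, s, f) and given(a, s, f))
    return both / probability(given)
```

Under these assumptions, learning that the proof is sound pushes the credence in abc to certainty, which is the same kind of update you would do on any physical observation.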
The thing is that mathematics seems to have an additional causal structure that seems to be (at least partially) independent from the proof structure.
I agree with this. I didn’t mean to give the impression that the causal structure is the same as the proof structure.
While it seems viable to say that mathematical truths exert causal (albeit time-invariant) influence on the universe, you might think it prima facie unlikely that we could causally influence them. However, perhaps we can causally affect the outputs of decision algorithms.
On its own, mathematics is just the study of various systems of rules for symbol manipulation. The “unreasonable effectiveness of mathematics” in more practical areas like physics allows us to conclude something about the structure of physical law. Whatever the true laws of physics are, we conclude, they seem to be well described by certain systems of rules for symbol manipulation.
In our causal models, this is an arrow between “laws of physics” and “observations about the behavior of collections of objects” [subject to Eliezer’s caveat about a “laws of physics” node]. Once we include this arrow, we can do neat tricks like counting the numbers of rocks in piles A and B, encoding those numbers in symbols, manipulating those symbols, and finally using the result to correctly predict how many rocks we will have once we combine the two piles.
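The rock-pile trick can be sketched in a few lines. This is only a toy, and the tally-mark encoding and the function names are my own choices for illustration: the “symbols” are strings of tally marks, the “manipulation” is string concatenation, and the prediction is checked against what happens when the piles are actually combined.

```python
def encode(pile):
    """Count a pile by writing one tally mark per rock."""
    return "".join("|" for _ in pile)


def add_symbols(tally_a, tally_b):
    """Pure symbol manipulation: write one tally after the other."""
    return tally_a + tally_b


def decode(tally):
    """Read a tally back as a number of rocks."""
    return len(tally)


pile_a = ["rock", "rock"]
pile_b = ["rock", "rock"]

# Prediction made entirely on the symbol side.
predicted = decode(add_symbols(encode(pile_a), encode(pile_b)))

# The "physical" act of pushing the two piles together.
combined_pile = pile_a + pile_b
observed = len(combined_pile)
```

The interesting fact is not that `predicted` comes out as 4, but that it agrees with `observed` for stones, and would not for piles of sand.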
I think you’ve taken EY’s question too literally. The real question is about the status of statements and facts of formal systems (“systems of rules for symbol manipulation”) in general, not arithmetic, specifically. If you define “mathematics” to include all formal systems, then you can say EY’s meditation is about mathematics.
Unfortunately I accidentally looked at some of the other answers first, but I think my answer would have been this anyway:
The fact “2+2=4” is a fact about the products of certain systems of thought and computation. It is also an abstract description of a certain property of objects in reality (namely, that taking two sets of two objects and combining the sets results in a set of four objects), at least at the classical level.
There seem to be all sorts of reasons that our distant ancestors’ development of a number sense was useful enough to be evolutionarily favored. Do we still have everyone in the tribe? Do we outnumber the enemies? How many predators were chasing me? Are these all of my children? A number sense that told us “2+2=3” could be quite maladaptive.
I see it as being in the causal graph—not as a node, but as an arrow (actually, a whole class of arrows). If I have two stones, and I put two more stones with them, then this will cause me to have four stones. Note that this doesn’t apply in all cases—if I have two piles of sand and I put them together with two more piles of sand, the result is one really big pile of sand and not four piles—but it applies in enough cases that the cases in which it does not apply can be considered exceptions for various reasons.
The problem with this is that something can both be caused and appear in a lot of causal interactions. For example, if I launch a giant mirror into space to block out the sun, all of the arrows from the sun to brightness everywhere have to be changed. In AI, this is often represented using plate notation, where a rectangle (the plate) is drawn around a group of variables that repeat, and an arrow from outside the plate affects every instance of the variables in the plate.
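A rough way to picture the plate in code: the repeated variable becomes a loop over locations, and anything outside the plate is passed into every iteration. The names and the brightness formula here are hypothetical, chosen only to mirror the mirror-in-space example.

```python
def brightness(sun_output, mirror_deployed, local_exposure):
    """One instance of the repeated variable inside the plate."""
    incoming = 0.0 if mirror_deployed else sun_output
    return incoming * local_exposure


def evaluate_plate(sun_output, mirror_deployed, exposures):
    """The plate: the same mechanism is applied to every location.

    `sun_output` and `mirror_deployed` sit outside the plate, so a single
    change to either one affects every brightness value at once.
    """
    return [brightness(sun_output, mirror_deployed, e) for e in exposures]


exposures = [0.2, 1.0, 0.5]   # hypothetical per-location exposure factors
before = evaluate_plate(1.0, False, exposures)
after = evaluate_plate(1.0, True, exposures)
```

Deploying the mirror is one intervention on one outside variable, yet every entry of `after` changes, and the mirror itself can still have its own causes upstream.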
‘2+2=4’ can be causally linked to reality. If you take 2 objects, and add 2 others, you’ve got 4, and this can be mapped back to the concept of ‘2+2=4’. Computers, and your brain, do it all the time.
This argument falls when we start talking about things which don’t seem to actually exist, like fractions when talking about indivisible particles. But numbers can be mapped to many things (that’s what abstracting things tends to do), so even though fractions don’t exist in that particular case, they do when talking about pies, so fractions can be mapped back to reality.
But this second argument seems to fall when talking about things like infinities, which can’t be mapped back to reality, as far as I know (maybe when talking about the number of points in a distance?). But in that case, we are just extrapolating rules which we have already mapped from the universe into our models. We know how the laws of physics work, so when we see the spaceship going off into the distance, where we’ll never be able to interact with it again, we know it’s still there, because we are extrapolating the laws of physics to outside the observable universe. Likewise, when confronted with infinity, mathematicians extrapolated certain known rules, and from that inferred properties about infinities, and because their rules were correct, whenever computations involving infinities were resolved to more manageable numbers, they were consistent with everything else.
So our representations of numbers are a map of the territory (actually, many territories, because numbers are abstract).
Mathematics is a lower, actually the lowest, tier.
Actually, if you think of it as affecting us but not being affected by us, then mathematics is, in EY’s words, higher. We would be “shadows” influenced by the higher tier, but unable to affect it.
But I don’t really think this line of reasoning leads anywhere.