I instinctively want to have random variables denoting any coinflip when presented with a list of coinflips, yet I can’t have that if the set of events is not the powerset, because in that case those wouldn’t be measurable functions.
No problem, you just explicitly use two different mathematical models at the same time, modelling different aspects of your problem: one for the whole series of coin tosses and the other for the i-th coin toss.

Ω = {HH, TT, HT, TH}, F = {∅, {HT,TH}, {HH,TT}, {HH,TT,HT,TH}}

Ωi = {H, T}, Fi = {∅, {H}, {T}, {H,T}}

Notice that using a powerset

F = {∅, {HT}, {TH}, {HT,TH}, {HH,TT}, {HH,TT,HT,TH}}

doesn’t allow you to express individual coin tosses anyway; you need a different sample space for it.
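To make the two models concrete, here is a minimal sketch (Python and all the names in it are my own choice, nothing from the thread): it writes out both spaces and checks that “the first toss is Heads”, i.e. the set {HH, HT}, is an event in the single-toss model but not in the series model with the F above.

```python
# Sketch of the two models above; names are illustrative.
# Model 1: the whole series of two coin tosses.
omega = {"HH", "TT", "HT", "TH"}
F = [set(), {"HT", "TH"}, {"HH", "TT"}, {"HH", "TT", "HT", "TH"}]

# Model 2: a single (i-th) coin toss.
omega_i = {"H", "T"}
F_i = [set(), {"H"}, {"T"}, {"H", "T"}]

# "The first toss is Heads" corresponds to the subset {HH, HT} of omega,
# but that set is not a member of F, so Model 1 cannot express it:
print({"HH", "HT"} in F)  # False
print({"H"} in F_i)       # True: Model 2 expresses "this toss is Heads"
```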
Consider the situation where I’m flipping the coin and I keep getting heads; I imagine I get more and more surprised as I keep flipping.
Likewise, consider situations where:
You’ve written a specific non-trivial long combination of Heads and Tails and then, as you flip a coin, this particular combination is being produced, with the same logic applying to the n-th flip: you are not much surprised to see each individual coin toss outcome, but you are more and more surprised that they add up to the specific sequence you’ve written down beforehand.
Same as 1., but the coin produces a combination different from the one you’ve written, and thus you are surprised neither by each individual outcome nor by the total result.
Same as 1., but you haven’t written any combination in advance. Once again, you are not surprised.
In 1. you’ve observed a rare event and are surprised because of it. In 2. and 3. you didn’t, and thus you are not, even though you’ve observed the same outcome, the same sequence of Heads and Tails, in all of 1., 2. and 3. The events that you’ve observed are quite different. And if you do a simple sanity check, you will notice that, indeed, it’s very easy to replicate situations 2. and 3. but very hard to replicate 1. (see the simulation sketch below).
Situation 1. is similar to observing many Heads in a row. The difference is that your brain is wired to track many Heads/many Tails by default, while tracking a specific non-trivial combination requires an active precommitment.
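For the sanity check mentioned above, a rough simulation (entirely my own illustration) shows the asymmetry directly: a precommitted 10-flip sequence comes true in roughly 1 of 2^10 = 1024 runs, while “some sequence or other comes up” happens in every run.

```python
import random

n, attempts = 10, 100_000
committed = [random.choice("HT") for _ in range(n)]  # written down in advance

# Situation 1: count runs where the flips reproduce the precommitted sequence.
hits = sum(
    [random.choice("HT") for _ in range(n)] == committed
    for _ in range(attempts)
)
print(f"situation 1 replicated in {hits} of {attempts} runs "
      f"(expected about {attempts / 2**n:.0f})")
# Situations 2 and 3 are replicated by every single run: some length-10
# sequence always appears, it just is not the one you committed to.
```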
rather the key is that I feel surprise when one of my assumptions about the world has become too improbable
This is not an alternative explanation; this is restating the same fact in different terms. If your assumption about the world has become too improbable, it means that you’ve accumulated enough evidence against this assumption. The strength of the evidence against an assumption is literally how improbable the observed event is according to that assumption. It’s not one way or the other; it’s always both.
As I understand it (but correct me if I am wrong), your claim is that we don’t feel surprise when observing what is commonly thought of as a rare event, because we don’t actually observe a rare event: owing to a quirk of our human psychology, we implicitly use a non-maximal event space. But you now seem to allow for another probability space, which, if true, seems to me a somewhat inelegant part of the theory. Do you claim that our subconscious tracks events in multiple ways simultaneously, or am I misunderstanding you?
Relatedly, the power set does allow me to express individual coin tosses. Let X1 be the following function on Ω:
X1(ω) = 1 if ω ∈ {HH,HT}, and X1(ω) = 0 otherwise
In this case X1 is measurable, because X1⁻¹[{1}] = {HH,HT} ∈ P(Ω) (minor point: Your F is not the powerset of Ω), and the same holds for X1⁻¹[{0}]. Therefore X1 is actually a random variable modeling that the first throw is heads.
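For what it’s worth, the check is easy to do mechanically; here is a small sketch (the powerset helper and all names are mine) that enumerates P(Ω) and confirms both preimages belong to it:

```python
from itertools import chain, combinations

omega = ["HH", "HT", "TH", "TT"]

def powerset(xs):
    """All subsets of xs, as a list of sets."""
    return [set(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

P_omega = powerset(omega)

def X1(w):
    """Indicator of 'the first throw is heads'."""
    return 1 if w in {"HH", "HT"} else 0

preimage_1 = {w for w in omega if X1(w) == 1}  # = {HH, HT}
preimage_0 = {w for w in omega if X1(w) == 0}  # = {TH, TT}
print(preimage_1 in P_omega)  # True
print(preimage_0 in P_omega)  # True, so X1 is measurable w.r.t. P(Omega)
```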
Regarding your examples, I’m not sure I’m understanding you: is your claim that the event space is different in the three cases, leading to different probabilities for the events observed? I thought your theory said that our human psychology works with non-maximal event spaces, but it seems it also works with different event spaces in different situations? (EDIT: Rereading the post, it seems you’ve addressed this part: if I understand correctly, one can influence their event space by way of focusing on specific outcomes?)
Wouldn’t it be much simpler to say that in 1, your previous assumption that the coinflips are independent of what you write on a paper became too improbable after observing the coinflips, and that this caused the feeling of surprise?
I’m afraid I don’t understand your last paragraph; to me it clearly seems an alternative explanation. Please elaborate. It’s not true that any time I observe a low-probability event, one of my assumptions becomes low-probability. For example, if I observe HHTHTTHHTTHT, no assumption of mine does, because I didn’t have a previous assumption that I would get coinflips different from HHTHTTHHTTHT. An assumption is not just any statement/proposition/event; it’s a belief about the world which is actually assumed beforehand.
To me your explanation leaves some things unexplained. For example: in what situations will our human psychology use which non-maximal event spaces? What is the evolutionary reason for this quirk? Isn’t being surprised in the all-heads case rational in an objective sense? Should we expect an alien species to be or not be surprised?
For my proposed explanation these are easy questions to answer. We are not surprised because of the non-maximal event spaces; rather, we are surprised if one of our assumptions loses a lot of probability. The evolutionary reason is that the feeling of surprise causes us to investigate, and in cases when one of our assumptions has become too improbable, we should actually investigate the alternatives. Yes, being surprised in these cases is objectively rational, and we should expect an alien species to do the same on an all-heads throw and not do the same on some random string of H/T.
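As a toy version of this rule (the prior and the rival “always-Heads” hypothesis are my own illustrative choices, nothing from the discussion), here is how the assumption “the coin is fair” loses probability on an all-Heads string but not on a typical one:

```python
def posterior_fair(sequence, prior_fair=0.99):
    """Posterior of 'the coin is fair' against an illustrative rival
    hypothesis: a trick coin that always lands Heads."""
    p_fair = prior_fair * 0.5 ** len(sequence)
    p_trick = (1 - prior_fair) * (1.0 if set(sequence) <= {"H"} else 0.0)
    return p_fair / (p_fair + p_trick)

print(posterior_fair("HHHHHHHHHH"))  # ~0.09: 'fair' lost most probability -> surprise
print(posterior_fair("HHTHTTHHTT"))  # 1.0: no assumption lost probability -> no surprise
```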
As I understand it (but correct me if I am wrong), your claim is that we don’t feel surprise when observing what is commonly thought of as a rare event, because we don’t actually observe a rare event: owing to a quirk of our human psychology, we implicitly use a non-maximal event space.
Yes, this is correct. The general principle is “You can observe only what you are paying attention to”, and the human quirk is paying attention, by default, to many Heads/Tails in a row.
But you now seem to allow for another probability space, which, if true, seems to me a somewhat inelegant part of the theory. Do you claim that our subconscious tracks events in multiple ways simultaneously, or am I misunderstanding you?
It’s not I who allows stuff. The point is that there is nothing in probability theory that forbids us from doing it. It’s not some new radical idea, either: a Solomonoff inductor, for instance, is supposed to track all the models at the same time.
Another point is that, in fact, our minds (not necessarily subconsciously) are indeed able to use multiple mathematical models at the same time, as long as the sample spaces are different. This is an empirical claim, which you may check yourself.
The question of elegance is less important to me; it’s a matter of taste, essentially. Personally, I think using two different models for their specific tasks and nothing else is a more elegant design than trying to stick all the required functionality, and then some, into one bigger model. Actually, I still don’t think that just one model would be enough for what you want it to do, anyway.
minor point: Your F is not the powerset of Ω
Yes, you are completely correct, I forgot to add triplets and a couple of pairs. Anyway, let’s explore this kind of modeling:
Therefore X1 is actually a random variable modeling that the first throw is heads.
So suppose you make a series of n coin tosses: your sample space is all possible combinations of Heads and Tails with length n, and your event space is its power set. Let’s define the event Hi as the set of all possible combinations of Heads and Tails with length n where Heads is in the i-th place. You toss a coin the first time and get Heads. Has the event H1 just happened?
No, because H1 is realized only when one of its outcomes is realized, and its outcomes are series of coin tosses with length n. So you can only say that H1 happened after all the coin tossing is done.
So if you want to update in the process, you need a model for the i-th coin toss, whose sample space is all possible combinations of Heads and Tails with length i and whose event space is its power set. And then with every coin toss this model changes. So in the end you will have n different models.
Also, I think you will have to use a model for the current coin toss result anyway, so that the switch from i to i+1 can properly be implemented. Maybe there is some clever way around this problem. In any case, human minds seem to work the obvious way: notice that the outcome of the current coin toss is Heads/Tails, add it to the list of all the previous coin tosses of length i, and thus be able to say which outcome in the (i+1)-th model has been realized.
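A sketch of that obvious way, under my own guess that the realized history is simply kept as a growing string, could look like this:

```python
from itertools import product
import random

history = ""                                   # realized outcome so far
for i in range(1, 4):
    toss = random.choice("HT")                 # model for the current toss: {H, T}
    history += toss                            # realized outcome of the length-i model
    sample_space = ["".join(p) for p in product("HT", repeat=i)]
    print(f"model {i}: {len(sample_space)} outcomes, realized: {history}")
```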
And, of course, if you want to compare different assumptions about a coin you will have to track even more models in your mind.
(EDIT: Rereading the post, it seems you’ve addressed this part: if I understand correctly, one can influence their event space by way of focusing on specific outcomes?)
Yes, your edit is correct. We can change what we are paying attention to and thus observe different events, which mathematically can be described as having different event spaces. There are some potential issues here, like whether you really made yourself pay attention only to the specific combination you’ve selected, and thus are not surprised at all by ten Heads in a row, or whether you have just added a new combination to the list of special combinations, which already includes all Heads and all Tails, thus becoming only about 50% less surprised when observing all Heads.
But this doesn’t matter much in the realm of decision making. If you want to do some action with only 1/2^n probability, you can commit to a specific outcome with length n, toss a coin n times, and do the action only if this particular outcome is realized.
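Spelled out as a procedure (the function name is my own), the commitment trick looks like this:

```python
import random

def act_with_probability_half_to_n(action, n):
    committed = [random.choice("HT") for _ in range(n)]  # pick the outcome first
    tosses = [random.choice("HT") for _ in range(n)]     # then toss n times
    if tosses == committed:                              # matches with probability 1/2^n
        action()

act_with_probability_half_to_n(lambda: print("acted!"), n=5)  # fires ~1 time in 32
```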
Wouldn’t it be much simpler to say that in 1, your previous assumption that the coinflips are independent of what you write on a paper became too improbable after observing the coinflips, and that this caused the feeling of surprise?
Strictly speaking, no, because now you have to add a whole new level of multiple alternative hypotheses with their own probability spaces, which you are also tracking in your mind and prioritizing between.
I have a simple rule, “surprise is proportional to the improbability of the event observed”, and then I use the already existing difference between events and outcomes to explain why observing every outcome of a random number generator is not surprising.
You add an extra distinction between “observed events” and “assumption-invalidating observed events”, and I don’t see what it brings to the table. It seems a clear case of an extra entity. You can just reduce the three-entity model (assumption-invalidating events, events, outcomes) to a two-entity model (events, outcomes) without losing anything.
It’s not true that any time I observe a low-probability event, one of my assumptions becomes low-probability. For example, if I observe HHTHTTHHTTHT, no assumption of mine does, because I didn’t have a previous assumption that I would get coinflips different from HHTHTTHHTTHT.
If you didn’t have an assumption that observing HHTHTTHHTTHT is improbable, then in what sense did you observe an improbable event when you saw the outcome HHTHTTHHTTHT?
Your assumptions can be described as a probability space with a less rich sigma-algebra, in which the outcome HHTHTTHHTTHT isn’t an event in itself. Let’s call it model A. Observing an improbable event in model A equals your assumption becoming improbable, and vice versa.
On the other hand, you are also trying to keep in your mind a probability space with the power set as well. And there {HHTHTTHHTTHT} is an event with low probability. This is model B.
What you are saying is that if you observed an outcome that corresponds to a low-probability event in model B, it doesn’t mean that you’ve observed a low-probability event in model A. And I completely agree. What I’m saying is that you do not need to talk about model B in the first place, as it doesn’t actually correspond to what you are able to observe and just adds extra confusion.
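To make the contrast concrete, here is a toy version of the two models (the particular coarse partition standing in for “your assumptions” in model A is my illustrative choice):

```python
from itertools import product

n = 12
outcomes = ["".join(p) for p in product("HT", repeat=n)]
observed = "HHTHTTHHTTHT"

# Model B: the power set, where every singleton {outcome} is an event.
p_B = 1 / 2 ** n  # P({HHTHTTHHTTHT}) = 1/4096: a "rare event"

# Model A: a coarse sigma-algebra generated by the partition
# {all Heads} | {all Tails} | {everything else}.
partition = [{"H" * n}, {"T" * n},
             {o for o in outcomes if o not in ("H" * n, "T" * n)}]
event_A = next(cell for cell in partition if observed in cell)
p_A = len(event_A) / 2 ** n  # P(everything else) ~ 0.9995: not rare at all

print(p_B, p_A)
```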
To me your explanation leaves some things unexplained. For example: in what situations will our human psychology use which non-maximal event spaces? What is the evolutionary reason for this quirk? Isn’t being surprised in the all-heads case rational in an objective sense? Should we expect an alien species to be or not be surprised?
Naturally, it depends on our assumptions, on what we are paying attention to. A person who is tracking a specific outcome and sees it being realized observes a much less probable event than a person who is tracking a dozen different outcomes, this one included.
There are some built-in intuitions about what feels more or less random, and it’s possible to speculate about the evolutionary reasons for them and for our ability to modify what we are paying attention to. There are, indeed, more things to be said on these topics. But they are beside the point of what I wanted to communicate in this post: probability theory and one of its apparent paradoxes, which is quite relevant to the anthropic reasoning I’m trying to solve. The idea that our brain is a pattern-seeking machine is already quite popular, and I doubt that I have much new to add here.