You seem to be generating more numbers without thinking about what they mean for the original probability. “There are lots of numbers” is not a counterargument. What you might be looking towards is the idea that there is no stable probability, that as you consider more things it blows up, or dies, or oscillates. But because alternating decreasing series converge, the probability is in fact stable. So what do all these other numbers mean? They’re numbers related to how precise (not how accurate) your estimate of the probability is. They don’t change your estimate, but they change the precision of your estimate. If someone rolls a loaded die in front of me, I estimate a chance of 1⁄6 that he gets a 4, because I don’t know how the die is loaded. But the precision of my probability estimate is lower than it would be for an unloaded die.
The doubts of doubts are in fact smaller than the doubts, for the simple reason that all probabilities are less than one. A doubt’s effect is on the order of its probability, while the effect of a doubt of a doubt is the probability of that second-order doubt times the negative of the original doubt’s effect.
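To make the convergence claim above concrete, here is a minimal sketch in Python (the starting estimate, the size of the first doubt, and the 0.9 shrink factor per level are all made-up numbers, not anything from this exchange): each level of doubt contributes a correction that is a probability below one times the previous correction, so the corrections alternate in sign and shrink, and the running estimate settles.

```python
# Minimal sketch: an alternating series of shrinking corrections converges.
# All numbers are invented for illustration.
p0 = 0.001                 # initial probability estimate (e.g. "temporary insanity")
correction = 0.5 * p0      # effect of the first doubt (hypothetical size)
estimate = p0
sign = -1.0
for level in range(1, 21):
    estimate += sign * correction
    correction *= 0.9      # each meta-doubt multiplies in another probability < 1
    sign = -sign           # successive corrections push in opposite directions
    print(level, estimate)
# The printed estimates oscillate by less and less and settle on a stable value.
```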
And again, “there are lots of numbers” is not necessarily a bad thing. The world has lots of numbers. A bad thing would be something like a contradiction, or an indeterminate point where reality should be, or an infinity where a finite number should be.
Duhem was an old French guy who was around before Popper. Quine was a more recent American guy who was around after Popper.
The regress doesn’t offer any probability at all, because it never ends: you cannot analyze the whole thing, which would require infinitely many steps. You imply it has a simple pattern, but I don’t think it does, as my example showed (where I tried to estimate probabilities successively), which you did not reply to.
If you could analyze the whole infinite regress, the probability it would offer that your first probability estimate was correct is infinitesimal, because multiplying together infinitely many probabilities, all below 1, gives an infinitesimal result. (If you have 99%, and then 99% of that, and so on, it keeps going down forever.)
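A minimal sketch of that arithmetic, using the 99% figure from the sentence above (the printed cutoff points are arbitrary):

```python
# Sketch: if every level of the regress grants only 99% confidence in the level
# below it, the confidence that the very first estimate is correct is the product
# of all those 99% factors, which heads toward zero as the levels pile up.
confidence = 1.0
for level in range(1, 1001):
    confidence *= 0.99
    if level in (10, 100, 500, 1000):
        print(level, round(confidence, 6))
# 10 -> ~0.904, 100 -> ~0.366, 500 -> ~0.0066, 1000 -> ~0.000043
```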
As I commented on, it does not alternate. Let me try again:
Theory T1 is the temporary insanity theory. You assign it 1%.
Theory T2 is the theory that your first probability assignment was correct. You assign that 90%.
Now, suppose we find out T2 is false. What happens? Does the probability of T1 go up or down?
We don’t know (given only the statements made so far; if you introduce new statements you could come up with an answer but they would themselves be subject to further questioning). So it isn’t true that the signs keep alternating and balance out. The answer is unknown, not stable. They don’t have signs at all.
The issue has nothing to do with “a lot of numbers”. Infinities are different than lots of numbers. They have special properties.
The probability estimate does not go up or down just because it’s imprecise. What gets plugged into the familiar formulas as a probability can be calculated as the expected probability over a distribution. So your T2 is in fact always false: in a continuous probability distribution there’s no single “right answer”; instead there’s an average answer. If we go with your Ts a bit more literally, the new probability is correctly given by Bayes’ rule, which is known and stable.
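One standard way to read “the expected probability over a distribution”, sketched in Python with an invented prior and invented data (none of this is from the exchange): keep a distribution over the unknown probability, plug its expectation into ordinary formulas, and update the distribution by Bayes’ rule.

```python
# Sketch: a Beta distribution over an unknown chance, updated by Bayes' rule.
# The prior Beta(1, 1) and the 7 heads / 3 tails data are made up for illustration.
alpha, beta = 1.0, 1.0                      # uniform prior over the unknown probability
print("prior estimate:", alpha / (alpha + beta))        # 0.5, the number you'd plug in

heads, tails = 7, 3                         # hypothetical observations
alpha, beta = alpha + heads, beta + tails   # conjugate Bayes update for Bernoulli data
print("posterior estimate:", alpha / (alpha + beta))    # 8/12 = 0.666...
```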
You seem to think the lack of a right answer is a problem, that if the odds that your first answer is correct are infinitesimal, that’s bad for this method of finding out knowledge (though I still feel like this is an odd thing to focus on, since we’re stuck with reality and can’t choose to pop over to another reality just because it has nice properties). But the trick is that although our estimates are wrong, they’re wrong in an unknown direction—they’re wrong as a consequence of having incomplete information, but they are the best fit to the information that we do have.
The point is that you can’t and don’t know the probability of anything.
Whenever you make a guess at a probability (e.g. 0.1% chance of temporary insanity yesterday), you have to wonder (in your epistemology): what is the probability that this guess (this probability estimate) is correct?
And whatever you guess about that, you have to wonder it again. And then again.
None of this has to do with the imprecision of knowledge. How do you know you are in the right ballpark? How do you know that you are within plus or minus 50%? You do not know. You can say you are, and you can give that a probability. But that itself could be false, its probability could be questioned.
So there is a regress.
At any step in the regress, if you are mistaken, the whole thing crumbles, all the way back to the very first probability you assigned. The estimates don’t crumble a little under the addition of minor error; they could simply be anything whatsoever, and you don’t know.
Why? Well, suppose you think the probability you were correct about T4 is 99%. And let’s even suppose that you’re right about that. It doesn’t matter. But this time it’s the 1% case. It could be T40 just as well, or T4000, and the probability could be 99.999%. It doesn’t matter.
So, T4 says that the probability you are correct about T3’s probability estimate is 99%. Which is true, but you’re unlucky this time. T4 is false.
Where does that leave T3? It leaves it unknown. Your probability estimate for it is not changed but gone entirely. You haven’t got one and you don’t know what it is. And with no status for T3, no reliability, no justification, no nothing, then T2 is gone too. And so goes T1. And T0. The end.
T3 said you’re 99% confident that T2’s probability estimate is correct. T2 said you’re 99% confident that T1’s probability estimate is correct. You see how they all fall apart? Each one depends on the next.
There are various ways out. You can make a probability estimate, and when asked the probability you are right about that, answer 100% or refuse to answer. You can suggest we stop asking. You can accept some things which haven’t got a probability (but this contradicts your general method). Whatever. But there is no rational solution which makes everything work. I think you may have misunderstood the regress as having something to do with the probability of the initial theory at each step (so that it goes up and down, in smaller and smaller amounts, and with opposite signs at each step). But that’s not it. It’s more meta than that. Every step asks for the probability of the previous probability estimate (not of a theory directly about the real world).
You say we’re stuck with reality. This is true. But it does not mean that your picture of reality is correct. Popper has an epistemology which is not broken, which has no regress. This one is broken. There’s no need to despair, only to change your mind. To let your theories die in your place, as Popper put it.
You’re repeating the same things again. Which means you probably didn’t understand what I said about probability estimates always being wrong. Meanwhile probability distributions can be exactly right, in the sense that they perfectly fit your current knowledge. You should go read a few books or take a class on probability. As a second book I would recommend E.T. Jaynes’ Probability Theory.
As for probability being a part of reality, remember what I said about uncertainty being probabilistic if you use these axioms: something cannot be more and less likely than something else at the same time; estimates of likeliness should not make large jumps on infinitesimal evidence; and estimates of likeliness should not ignore information or make it up (okay, fine, I just copied and pasted that from before)? Which of those axioms does Karl Popper reject?
You didn’t understand my point or address it. At all. You just gave up and stopped trying to engage with me. I was still trying. Communication isn’t trivially easy.
Your list of axioms doesn’t have anything to do with the regress argument I’ve been making, and it isn’t even close to sufficient to support your worldview (the axioms don’t even say that we ever can or should make a probability estimate about anything).
Because your point is in terms of the truth of specific probabilities, which are already always wrong, your point is ill-formed. T1=0, T2=1, the end. To do better you need to understand probability distributions.
If your first probability estimate is wrong, without any error bar—but simply wrong in an unknown way—then you’re screwed, right?
Edit: And what are you talking about with T2=1? It does not have a probability of 1. That sounds like your “signs flip” thing which I addressed already. I still think you are imagining a different regress than the one I was talking about.
Think of it this way—if it’s wrong in an utterly unknown way, then the wrongness has perfect symmetry; there’s nothing to distinguish being wrong one way from being wrong in another. By the axiom that you shouldn’t make up information, when the information is symmetric, that part of the distribution (“part” as in you convolve the different parts together to get the total distribution) should be symmetric too. And since the final probability estimate is just the average over your distribution, the symmetry makes the problem easy—or if the problem is poorly defined or poorly understood, it at the very least gives you error bars—it makes the answer somewhere between your current estimate and the maximum entropy estimate.
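A minimal sketch of the “somewhere between your current estimate and the maximum entropy estimate” claim, in Python, under an assumption added purely for illustration: with probability q your estimate p is right, and otherwise you know nothing about a binary question, which the maximum-entropy rule scores as 0.5.

```python
# Sketch: blending a point estimate with the maximum-entropy value 0.5.
# p, q, and the binary framing are illustrative assumptions, not from the exchange.
def blended_estimate(p, q, max_entropy=0.5):
    # expectation over two cases: "estimate is right" vs "wrong in a fully symmetric way"
    return q * p + (1 - q) * max_entropy

for q in (1.0, 0.9, 0.5, 0.0):
    print(q, blended_estimate(0.01, q))
# As confidence q falls, the answer slides from 0.01 toward 0.5, never outside that range.
```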
If you’re wrong in an unknown way, then it could just as well be 1% or 99%.
You might try to claim this averages to 50%. But theories don’t have uniform probability. There are more possible mistakes than truths. Almost all theories are mistaken. So when the probability is unknown, we have every reason to think it’s a mistake (if we’re just going to guess; we could of course use Popper’s epistemology instead which handles all this stuff), and there’s no justification for the theory. Right?
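For what it’s worth, the disagreement in the last two paragraphs comes down to one line of arithmetic. A tiny Python sketch with an invented distribution (the values and weights are made up): “unknown” only averages to 50% if every value is weighted equally; pile the weight on “mistaken” and the average drops far below 0.5.

```python
# Sketch: the average of an "unknown" probability depends entirely on the weights.
# The three values and their weights below are invented for illustration.
mostly_mistaken = {0.01: 0.95, 0.50: 0.04, 0.99: 0.01}
print(sum(p * w for p, w in mostly_mistaken.items()))   # 0.0394, nowhere near 0.5
```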
Your comments about error bars are subject to regresses (what is the probability you are right about that method? about the maximum entropy estimate? etc.).
You don’t seem to be thinking with the concept of a probability distribution, or an average of one. You say “If you’re wrong in an unknown way, then it could just as well be 1% or 99%” as if it spells doom for any attempt to quantify probabilities. When really all it is is a symmetry property of a probability distribution.
I guess I shouldn’t be expected to give you a class in probability over the internet when you are already convinced it’s all wrong. But again, I think you should read a textbook on this stuff, or take a class.
Are you aware that Yudkowsky doesn’t dispute the regress? He has an article on it.
http://lesswrong.com/lw/s0/where_recursive_justification_hits_bottom/
If that’s what you’re using “the regress” to mean, sure, sign me up. But this has even less bearing than usual on whether uncertainty can be represented by probability, unless you are making the (unlikely and terrible) argument that nothing can be represented by anything.
You don’t get an infinite regress if you use a universal prior.
The universal prior probability of any prefix p of a computable sequence x is the sum of the probabilities of all programs (for a universal computer) that compute something starting with p.
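One standard way to write that definition down (a sketch; it assumes a prefix-free universal machine U and weights each program q by its length |q|, which is the usual convention rather than anything stated above):

```latex
% Sketch of the universal prior of a prefix p, under the stated assumptions:
% sum over all programs q whose output on machine U starts with p.
M(p) \;=\; \sum_{q \,:\, U(q)\ \text{starts with}\ p} 2^{-|q|}
```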
Actually you still do… You simply have to ask: what is the probability that the universal prior idea is correct? And whatever you say, ask the probability that is correct. And so on.
The regress works no matter what you say, even if you say something about universal priors.
“Correct” in what sense? In the actual agent using it, it’s a probability distribution over statements, not a statement itself! Do you mean “What is the probability that the universal prior has [certain property that we consider a reason to use it]?”
A probability distribution over statements is itself a statement (one states that the probability distribution is X). Maybe you use the word “statement” in a fancy way but I include anything.
And the property I was talking about is truth. But another one could be used.
No, because the universe has a state/law, not a probability distribution over states. A theory/universe/statement is either true or false; a probability distribution over theories is not, though it can be scored for accuracy in various ways. A probability distribution over theories is not a statement about the actual state of the universe.
Similarly, the universal prior is in no way “true”; it’s a distribution, not a statement at all. You shouldn’t even expect it to be “true” since it’s meant to be updated. What is important about it is that it has various nice properties such as eventually learning any computable distribution.
The first prior is where the regress bottoms out. Bayesian reasoning has to stop somewhere—and it stops at the first prior.
This area is known as “the problem of the priors”. For most agents it is no big deal: they are rapidly swamped by evidence that overwhelms their priors, so there is little sensitivity to the priors’ exact values.
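A minimal sketch of that “swamped by evidence” point in Python, with invented priors and simulated coin flips (nothing here is from the thread): two agents who start far apart end up with nearly the same posterior estimate after seeing the same data.

```python
# Sketch: two very different Beta priors over a coin's chance of heads are
# overwhelmed by 1000 shared observations. All numbers are invented.
import random
random.seed(0)

true_p = 0.7
flips = [1 if random.random() < true_p else 0 for _ in range(1000)]
heads = sum(flips)
tails = len(flips) - heads

for name, (a, b) in {"optimist": (20.0, 1.0), "pessimist": (1.0, 20.0)}.items():
    posterior_mean = (a + heads) / (a + b + heads + tails)
    print(name, round(posterior_mean, 3))
# Both land near 0.7, despite prior means of roughly 0.95 and 0.05.
```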
Eh, okay, we can leave aside semantics.
My Bayesian reasoning finishes with a posterior. It starts at the first prior. I’m backwards like that.
So, you simply refuse to question the prior. Is this a matter of faith, or what? Why stop there?
More often a matter of birth. Agents usually start somewhere.
A few details about the process.