A quick observation: a perfect Bayesian mind is impossible to actually build; that much we all know, and nobody cares.
But it’s a lot worse: it is impossible even mathematically. Even if we expected as little from it as consistently following the rule that P(b|a)=P(c|b)=100% implies P(c|a)=100% (without getting into choice of prior, infinite precision, transfinite induction, uncountable domains, etc.; merely the barest minimum still recognizable as Bayesian inference), applied over unbounded but finite chains of inference on a countable set of statements, it could trivially solve the halting problem.
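For readers who want that chaining rule spelled out, here is a minimal derivation of it from the ordinary probability axioms (assuming the conditioning events have nonzero probability, so the conditionals are defined):

```latex
% From P(c|b) = 1 and P(b|a) = 1, derive P(c|a) = 1.
\begin{align*}
P(c \mid b) = 1 \;&\Rightarrow\; P(\lnot c \wedge b) = 0, \\
P(\lnot c \mid a) &= P(\lnot c \wedge b \mid a) + P(\lnot c \wedge \lnot b \mid a)
   \;\le\; \frac{P(\lnot c \wedge b)}{P(a)} + \bigl(1 - P(b \mid a)\bigr) \;=\; 0, \\
&\Rightarrow\; P(c \mid a) = 1.
\end{align*}
```

The author’s claim is that requiring this over arbitrarily long finite chains is already too much to ask of a single consistent assignment.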
Yes, it would always tell you which theorems are true and which are false, Gödel’s theorem be damned. It cannot say anything like P(Riemann hypothesis|basic math axioms)=50%, as this automatically implies a violation of Bayes’ rule somewhere in the network (and there are no compartments to limit the damage once that happens; the whole network becomes invalid).
The perfect Bayesian minds that people here so willingly accepted as the gold standard of rationality are mathematically impossible; there’s no workaround, and no approximation that is of much use.
Ironically, perfect Bayesian inference systems work really well inside finite or highly regular compartments, with something else limiting their interaction with the rest of the universe.
If you want an outside view argument that this is a serious problem: if Bayesian minds were so awesome, how is it that even in the very limited world of machine learning, Bayesian-inspired systems are only one of many competing paradigms, well suited to some compartments and not working well in others?
I realize that I have just explicitly rejected one of the most basic premises accepted by pretty much everyone here, including me until recently. It surprised me that we all fell for something so obvious in retrospect.
Robin Hanson’s post on contrarians being wrong most of the time was amazingly accurate once again. I’m still not sure which of the ideas I came to believe relied on perfect Bayesian minds being the gold standard of rationality and will need to be reevaluated, but it doesn’t bother me as much now that I’ve fully accepted that compartmentalization is unavoidable, and a pretty good thing in practice.
I think there’s a nice correspondence between the outside view with its set of preferred reference classes and Bayesian inference with its set of preferred priors. Except that the outside view can very easily be extended to say “I don’t know”, to estimate its own accuracy as applied to different compartments, to give more complex answers, to evolve in time as reference classes formerly too small to be of any use accumulate enough data to return useful answers, and so on.
For very simple systems, these two should correspond to each other in a straightforward way. For complex systems, we have a choice between sometimes answering “I don’t know” and being inconsistent.
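As a toy sketch of the “allowed to say I don’t know” option (the function name and the sample-size threshold below are my own invention, not anything from the comment):

```python
from typing import Optional

def outside_view_estimate(past_outcomes: list[bool], min_samples: int = 10) -> Optional[float]:
    """Reference-class forecast that is permitted to abstain.

    Returns the base rate of the reference class, or None ("I don't know")
    when the class is still too small to be of any use.
    """
    if len(past_outcomes) < min_samples:
        return None
    return sum(past_outcomes) / len(past_outcomes)

# As the reference class grows over time, the same question stops returning
# None and starts returning a useful base rate.
print(outside_view_estimate([True, False, True]))   # None
print(outside_view_estimate([True, False] * 10))    # 0.5
```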
I wanted to write this as a top-level post, but a “one of your most cherished beliefs is totally wrong, here’s a sketch of a mathematical proof” post would take a lot more effort to write well.
I tried a few extensions of Bayesian inference that I hoped would be able to deal with it, but this is really fundamental.
You can still use a subjective Bayesian worldview: that P(Riemann hypothesis|basic math axioms)=50% is just your intuition. But you must accept that your probabilities can change with no new data, just by more thinking. This sort of Bayesian inference is just another tool of limited use, with biases, inconsistencies, and compartments protecting it from the rest of the universe.
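Here is a concrete toy example of probabilities moving under pure computation, with no new observations; the number and the crude prior are my own choices, not anything from the comment:

```python
import math

N = 1_000_003  # an arbitrary number picked for illustration

# "Before thinking harder": a crude prior for "N is prime" from the prime
# number theorem, using nothing about N except its size.
prior = 1 / math.log(N)

# "After thinking harder": trial division settles the question outright.
# No new data about the world arrived; only more computation was done.
is_prime = all(N % d for d in range(2, math.isqrt(N) + 1))
posterior = 1.0 if is_prime else 0.0

print(f"prior ~ {prior:.3f}, after more thinking: {posterior}")
```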
There is no gold standard of rationality. There simply isn’t. I have the outside view as a fallback position; otherwise it would be about as difficult to accept this as it is for a Christian finally figuring out there is no God, but still wanting to keep the good parts of his or her faith.
Would anyone be willing to write a top-level post out of my comment? You’ll either be richly rewarded with a lot of karma, or we’ll both be banned.
The perfect Bayesian minds that people here so willingly accepted as the gold standard of rationality are mathematically impossible; there’s no workaround, and no approximation that is of much use.
A perfect Bayesian is logically omniscient (and logically omniscient agents are perfect Bayesians) and comes with the same problem (of being impossible). I don’t see why this fact should be particularly troubling.
If you want an outside view argument that this is a serious problem: if Bayesian minds were so awesome, how is it that even in the very limited world of machine learning, Bayesian-inspired systems are only one of many competing paradigms, well suited to some compartments and not working well in others?
An outside view is only as good as the reference class you use. Your reference class does not appear to have many infinitely long levers, infinitely fast processors, or a Maxwell’s Demon. I don’t have any reason to expect your hunch to be accurate.
“Outside View” doesn’t mean going with your gut instinct and picking a few superficial similarities.
I have the outside view as a fallback position; otherwise it would be about as difficult to accept this as it is for a Christian finally figuring out there is no God, but still wanting to keep the good parts of his or her faith.
There is more to that analogy than you’d like to admit.
I’m quite troubled by this downvote.
A perfect Bayesian is logically omniscient (and logically omniscient agents are perfect Bayesians) and comes with the same problem (of being impossible). I don’t see why this fact should be particularly troubling.
The only way to be “omniscient” over even a very simple countable universe is to be inconsistent. There is no way to assign probabilities to every node that obeys Bayes’ theorem. It’s a lot like Kolmogorov complexity: both can be useful philosophical tools, but neither is really part of mathematics; they’re just logically impossible.
Finite perfect Bayesian systems are complete and consistent. We’re so used to every example of a Bayesian system ever used being finite that we totally forgot they cannot logically be extended to even the simplest countable systems. We just accepted hand-waving that carries results about finite systems over into countable domains.
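To make the finite case concrete, here is a tiny sketch (the uniform joint and the variable names are arbitrary choices of mine): with a full joint distribution over finitely many propositions, every conditional query has an answer, and the answers are automatically consistent because they all come from the same joint.

```python
from itertools import product

VARS = ("a", "b", "c")
joint = {bits: 1 / 8 for bits in product([False, True], repeat=3)}  # uniform joint

def p(event, given=lambda w: True):
    """P(event | given), with both arguments given as predicates over worlds."""
    worlds = [dict(zip(VARS, bits)) for bits in joint]
    den = sum(joint[tuple(w[v] for v in VARS)] for w in worlds if given(w))
    num = sum(joint[tuple(w[v] for v in VARS)] for w in worlds if given(w) and event(w))
    return num / den  # defined whenever the conditioning event has positive probability

# Complete over this finite domain: any query about a, b, c gets an answer.
print(p(lambda w: w["c"], given=lambda w: w["a"] and w["b"]))  # 0.5
```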
An outside view is only as good as the reference class you use.
This is a feature, not a bug.
No outside view system you can build will be omniscient. But this is precisely what lets them be consistent.
Different outside view systems will give you different results. That’s not so different from Bayesian priors, except that you can have outside view systems for countable domains, and there are no Bayesian priors like that at all.
You can easily have a nested outside view system judging other outside view systems on which ones work and which don’t, or some other interesting kind of nesting, or use different reference classes for different compartments (a toy sketch of this kind of nesting follows below).
Or you could use something else. In a way, these are all just computer programs anyway; representing them as outside view systems is just a human convenience.
But every single description of reality must either be allowed to say “I don’t know” or blatantly violate the rules of logic. Either way, you will need some kind of compartmentalization to describe reality.
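As promised above, a toy sketch of the nesting idea; the class name, the error metric, and the routing rule are all my own illustration, not anything prescribed in the thread:

```python
from collections import defaultdict
from statistics import mean

class MetaOutsideView:
    """Routes each compartment's questions to whichever base estimator has the
    best track record in that compartment, and abstains when there is none."""

    def __init__(self, estimators):
        # estimators: name -> callable(question) -> probability estimate
        self.estimators = estimators
        # compartment -> estimator name -> list of absolute errors observed so far
        self.errors = defaultdict(lambda: defaultdict(list))

    def record(self, compartment, name, prediction, outcome):
        self.errors[compartment][name].append(abs(prediction - outcome))

    def estimate(self, compartment, question):
        scored = [(mean(errs), name)
                  for name, errs in self.errors[compartment].items() if errs]
        if not scored:
            return None  # "I don't know": no track record in this compartment yet
        _, best_name = min(scored)
        return self.estimators[best_name](question)
```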
Just to check, is this an expansion of “Nature never tells you how many slots there are on the roulette wheel”?
I thought I’d gotten the idea about Nature and roulette wheels from Taleb, but a fast googling doesn’t confirm that.
It’s not in any way related. Taleb’s point is purely practical: we rely on very simple models that work reasonably well most of the time, but the very rare cases where they fail often also have a huge impact. You wouldn’t have guessed that life or human-level intelligence might happen by looking at the universe up until that point. Their reference class was empty. And then they happened just once and had massive impact.
Taleb would be more convincing if he didn’t act as if nobody even knew about power laws. Everything he writes is about how actual humans currently model things, and that can easily be improved (well, there are some people who don’t even know about power laws…), for instance with prediction markets to overcome pundit groupthink.
You could easily imagine that while humans really suck at this, and there’s only so much improvement we can make, perhaps there’s a certain gold standard of rationality: something telling us how to do it right at least in theory, even if we can never actually implement it due to the physical constraints of the universe. Like perfect Bayesians.
My point is that perfect Bayesians can only deal with finite domains. As for a gold standard of rationality (basically, something that would assign probabilities to every outcome within some fairly regular countable domain, where the probabilities merely need to be self-consistent and follow the basic rules of probability), it turns out that even the simplest such assignment of probabilities is not possible, even in theory.
You can be self-consistent by sacrificing completeness (for some questions you’d answer “no idea”), or you can be complete by sacrificing self-consistency (subjective Bayesianism is exactly like that: your probabilities will change if you just think more about something, even without observing any new data).
And it’s not only perfect Bayesianism; nothing else can work the way people wish either. Without some gold standard of rationality, without some one true way of describing reality, a lot of other common beliefs just fail.
Compartmentalization, biases, heuristics, and so on are not possible to avoid even in theory; in fact, they’re necessary in nearly any useful model of reasoning. Extreme reductionism is out, emergence comes back as an important concept, and it would be a very different Less Wrong.
More down-to-earth subjects like akrasia, common human biases, prediction markets, religion, evolutionary psychology, cryonics, luminosity, winning, science, scepticism, techniques, self-deception, overconfidence, signaling, etc. would be mostly unaffected.
On the other hand, so much of the theoretical side of Less Wrong is based on the flawed assumption that perfect Bayesians are at least theoretically possible on infinite domains, so that a true answer always exists even if we don’t know it; that part would need something between a very serious update and simply being thrown away.
Some parts of the theory don’t rely on this at all, like the outside view. But these are not terribly popular here.
I don’t think you’d see even much of the Sequences surviving without a major update.
My point is that perfect Bayesians can only deal with finite domains. As for a gold standard of rationality (basically, something that would assign probabilities to every outcome within some fairly regular countable domain, where the probabilities merely need to be self-consistent and follow the basic rules of probability), it turns out that even the simplest such assignment of probabilities is not possible, even in theory.
What are the smallest and/or simplest domains which aren’t amenable to Bayesian analysis?
I’m not sure you’re doing either me or Taleb justice (though he may well be having too much fun going on about how much smarter he is than just about everyone else). I don’t think he’s just talking about completely unknown unknowns, or implying that people could get things completely right, just that people could do a great deal better than they generally do.
For example, Taleb talks about a casino which had the probability and gaming part of its business completely nailed down. The biggest threats to the casino turned out to be a strike, embezzlement (I think), and one of its performers being mauled by his tiger. None of these are singularity-level game changers.
In any case, I would be quite interested in more about the limits of Bayesian analysis and how that affects the more theoretical side of LW, and I doubt you’d be downvoted into oblivion for posting about it.
What are the smallest and/or simplest domains which aren’t amenable to Bayesian analysis?
Notice that you’re already talking about domains; you’ve accepted it, more or less.
I’d like to ask the opposite question: are there any non-finite domains where perfect Bayesian analysis makes sense?
On any domain with even extremely limited local rules that you can specify as conditions, and an unbounded world size, you could use perfect Bayesian analysis to say whether any Turing machine stops, or to prove any statement of natural-number arithmetic.
The only difficulty is bridging the language of Bayesian analysis and the language of computational incompleteness. Because nobody seems to be really using Bayes like that, I cannot even give a convincing example of how it fails. Nobody has tried, other than in hand-waves.
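To make the claimed bridge a bit more tangible, a purely hypothetical sketch: `credence` stands for the assumed perfect-Bayesian oracle over arithmetic statements, which, per the argument above, cannot actually exist; every name here is my own illustration rather than a real API.

```python
def halts(machine_description: str, credence) -> bool:
    """If a complete, consistent credence(statement, axioms) oracle existed,
    halting would become decidable: each computation step i gives a statement
    s_i with P(s_{i+1} | s_i) = 1, so chaining over the (unbounded but finite)
    run would force P("the machine halts" | axioms) to be exactly 0 or 1."""
    statement = f"Turing machine {machine_description} halts on empty input."
    return credence(statement, "Peano arithmetic") == 1.0
```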
Check things from the Gödel incompleteness theorem and Turing completeness lists.
It seems that mainstream philosophy figured this out a long time ago. Contrarians turn out to be wrong once again. It’s not new stuff; we just never bothered checking.