I guess it works out if, given the existence of an optimizer, any number of bits of optimization being exerted is as probable as any other number; but if that is the prior we’re starting from, then this seems worth stating (unless it follows from the rest in a way that I’m overlooking).
steven
The quantity we’re measuring tells us how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities. To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around. This plugs directly into Bayesian updating
This seems to me to suggest the same fallacy as the one behind p-values… I don’t want to know the tail area; I want to know the probability of the event that actually happened (and only that event) under the hypothesis of no optimization, divided by the same probability under the hypothesis of optimization. Example of how they can differ: if we know in advance that any optimizer would optimize at least 100 bits, then a 10-bit-optimized outcome is evidence against optimization, even though the probability, given no optimization, of an event at least as preferred as the one that happened is only 1/1024.
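To make the 100-bit example concrete, here’s a minimal sketch of the two quantities (the toy model and the specific distributions are my own illustrative assumptions, not anything from the post):

```python
from fractions import Fraction

# Toy model: "k bits of optimization" means the outcome landed in the top 2^-k
# fraction of the outcome space under the unoptimized measure.

def tail_area_no_opt(bits):
    # P(outcome at least this preferred | no optimization): the p-value-like quantity.
    return Fraction(1, 2**bits)

def likelihood_no_opt(bits):
    # P(outcome shows between bits and bits+1 bits of apparent optimization | no optimization).
    return Fraction(1, 2**bits) - Fraction(1, 2**(bits + 1))

def likelihood_opt(bits, min_bits=100):
    # Hypothetical optimizer known to exert at least 100 bits: it essentially
    # never produces a merely 10-bit-optimized outcome.
    return Fraction(0) if bits < min_bits else Fraction(1, 2**(bits - min_bits + 1))

bits_observed = 10
print("tail area given no optimization :", tail_area_no_opt(bits_observed))  # 1/1024
print("likelihood ratio, opt vs no opt :",
      likelihood_opt(bits_observed) / likelihood_no_opt(bits_observed))      # 0
```

Despite the 1/1024 tail area, the likelihood ratio comes out as evidence against optimization, here conclusively so.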
Re: calibration, it seems like what we want to do is ask ourselves what happens if an agent is asked lots of different probability questions: consider the true probability as a function of the probability stated by the agent, put some prior distribution (describing our uncertainty) on all such functions the agent could have, update this prior using a finite set of answers we have seen the agent give and their correctness, and end up with a posterior distribution on functions (agent’s probability → true probability), from which we can get estimates of how over/underconfident the agent is at each probability level, and use those to determine what the agent “really means” when it says 90%. If the agent is overconfident at all probabilities then it’s “overconfident” period; if it’s underconfident at all probabilities then it’s “underconfident” period; if it’s over at some and under at others then I guess it’s just “misconfident”? (An agent could be usually overconfident in an environment that usually asked it difficult questions and usually underconfident in an environment that usually asked it easy questions, or vice versa.) If we keep asking an agent that doesn’t learn the same question, as in anon’s comment, then that seems like a degenerate case. On first think it doesn’t seem like an agent’s calibration function necessarily depends on what questions you ask it; Solomonoff induction is well-calibrated in the long run in all environments (right?), and you could imagine an agent that was like SI but with all probability outputs twice as close to 1, say. Hope this makes any sense.
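One crude way to implement the calibration idea, sketched below under assumptions of my own (independent questions, and a separate Beta prior on the true frequency at each stated-probability level rather than a full prior over calibration functions):

```python
from collections import defaultdict

# For each probability level the agent states, keep Beta(a, b) pseudo-counts of
# how often statements at that level turn out true; a binned stand-in for a full
# prior over calibration functions (stated probability -> true probability).
counts = defaultdict(lambda: [1.0, 1.0])  # stated_p -> [a, b], Beta(1, 1) prior

def observe(stated_p, was_true):
    a, b = counts[stated_p]
    counts[stated_p] = [a + was_true, b + (1 - was_true)]

def estimated_true_probability(stated_p):
    # Posterior mean: what "stated_p" from this agent really means.
    a, b = counts[stated_p]
    return a / (a + b)

# Example: the agent says "90%" ten times and is right six times.
for outcome in [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]:
    observe(0.9, outcome)

print(estimated_true_probability(0.9))  # ~0.58, i.e. overconfident at the 90% level
```

Whether the agent is “overconfident” period then comes down to whether this sort of estimate falls below the stated probability at every level.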
AFAIK the 9/11 people didn’t believe they would die in any real sense.
you will realize that Osama bin Laden would be far more likely to say, “I hate pornography” than “I hate freedom”
There’s a difference between hating freedom and saying you hate freedom. There’s also a difference between hating freedom and hating our freedom; the latter phrasing rules out Bin Laden misredefining the word to suit his own purposes. And thirdly, it’s possible to hate freedom while hating pornography even more than freedom.
Eliezer, that’s a John McCarthy quote.
Isn’t the problem often not that people betray their ideals, but that their ideals were harmful to begin with? Do we know that not-yet-powerful Stalin would have disagreed (internally) with a statement like “preserving Communism is worth the sacrifice of sending a lot of political opponents to gulags”? If not then maybe to that extent everyone is corrupt and it’s just the powerful that get to act on it. Maybe it’s also the case that the powerful are less idealistic and more selfish, but then there are two different “power corrupts” effects at play.
It’s important in these crisis things to remind yourself that 1) P does not imply “there are no important generally unappreciated arguments for not-P”, and 2) P does not imply “the proponents of P are not all idiots, dishonest, and/or users of bad arguments”. You can switch sides without deserting your favorite soldiers. IMO.
One more argument against deceiving epistemic peers when it seems to be in their interest is that if you are known to have the disposition to do so, this will cause others to trust your non-deceptive statements less; and here you could recommend that they shouldn’t trust you less, but then we’re back into doublethink territory.
In a PD, everyone having accurate information about the payoff matrix leads to a worse outcome for everyone than some false payoff matrices you could misinform them with. That is the point.
Do you agree that in a PD, it is not the case for any individual that that individual is harmed by that individual’s knowledge? Your point goes through if we somehow think of the collective as a single “you” with beliefs and preferences, but that raises all sorts of issues and anyway isn’t what Eliezer was talking about.
If an entire community can be persuaded to adopt a false belief, it may enable them to overcome a tragedy-of-the-commons or prisoners’-dilemma situation.
In a PD, agents hurt each other, not themselves. Obviously false beliefs in my enemy can help me.
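To keep the payoff-matrix claim concrete, a minimal sketch (the numbers are the usual textbook PD values, chosen by me for illustration):

```python
# Row player's payoffs; by symmetry the column player's are mirrored.
# Actions: C = cooperate, D = defect.
true_payoff = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

# A false matrix you might spread, under which cooperation looks dominant.
false_payoff = {('C', 'C'): 3, ('C', 'D'): 2, ('D', 'C'): 1, ('D', 'D'): 0}

def dominant_action(payoff):
    # Return the action that is a best response to everything the other might do.
    best = {max(['C', 'D'], key=lambda mine: payoff[(mine, theirs)])
            for theirs in ['C', 'D']}
    return best.pop() if len(best) == 1 else None

for beliefs, label in [(true_payoff, "accurate beliefs"), (false_payoff, "false beliefs")]:
    action = dominant_action(beliefs)
    print(label, "-> both play", action, "-> actual payoff", true_payoff[(action, action)])
```

With accurate beliefs both players defect and get the worse outcome; with the false matrix both cooperate and actually do better. But note that, holding the other player’s action fixed, a player’s own accurate knowledge never lowers that player’s payoff, which is the distinction being drawn above.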
In any comparison of all possible bit/axiom strings up to some equal finite (long) length, many strings represent not only a world but also, using the ‘spare’ string segments inside the total length, extraneous features such as other worlds, nothing in particular, or perhaps ‘invisible’ intra-world entities. It is reasonable to suppose that the simplest worlds (i.e. those with the shortest representing string segments) will occur most often across all such strings, since they leave more ‘spare’ irrelevant bit/axiom combinations within the comparison length than more complex worlds do (and similarly for every long finite comparison length).
Thus out of all worlds inhabitable by SAS’s, we are most likely to be in one of the simplest (other things being equal) - any physics-violating events like flying rabbits or dragons would require more bits/axioms to (minimally) specify their worlds, and so we should not expect to find ourselves in such a world, at any time in its history.
(SAS means self-aware substructure)
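A toy version of the counting argument (the lengths below are arbitrary and mine): among all bit strings of a fixed length L, those that begin with a given k-bit world description number 2^(L-k), so a world specifiable in fewer bits is represented exponentially more often.

```python
from itertools import product

L = 12  # total comparison length, arbitrary

def count_strings_with_prefix(prefix, total_length=L):
    # Brute-force count of length-L bit strings whose leading bits spell out the
    # given world description; the remaining bits are the "spare" segments.
    return sum(1 for bits in product('01', repeat=total_length)
               if ''.join(bits).startswith(prefix))

simple_world = '0110'        # a 4-bit world description
complex_world = '01101001'   # an 8-bit world description
print(count_strings_with_prefix(simple_world))   # 2**(12-4) = 256
print(count_strings_with_prefix(complex_world))  # 2**(12-8) = 16
```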
Eliezer, doesn’t “math mysteriously exists and we live in it” have one less mystery than “math mysteriously exists and the universe mysteriously exists and we live in it”? (If you don’t think math exists it seems like you run into indispensability arguments.)
IIRC the argument for a low-entropy universe is anthropic, something like “most non-simple universes with observers in them look like undetectably different variants of a simple universe rather than universes with dragons in them”.
To show that hellish scenarios are worth ignoring, you have to show not only that they’re improbable, but also that they’re improbable enough to overcome the factor (utility of oblivionish scenario minus utility of hellish scenario) / (utility of heavenish scenario minus utility of oblivionish scenario), which as far as I can tell could be anywhere between tiny and huge.
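Spelling that factor out (this is just my rearrangement of the standard expected-utility comparison; the p’s and U’s are the obvious probabilities and utilities, nothing from the original comment):

```latex
% Ignoring the hellish branch is justified only if its expected disutility is
% small next to the expected gain from the heavenish branch:
%   p_hell (U_oblivion - U_hell)  <<  p_heaven (U_heaven - U_oblivion)
% equivalently
\[
  \frac{p_{\mathrm{hell}}}{p_{\mathrm{heaven}}}
  \;\ll\;
  \frac{U_{\mathrm{heaven}} - U_{\mathrm{oblivion}}}{U_{\mathrm{oblivion}} - U_{\mathrm{hell}}} ,
\]
% so the improbability of the hellish scenario has to beat exactly the factor
% named above, and nothing obvious pins that factor down as small or large.
```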
As for global totalitarian dictatorships, I doubt they’d last for more than millions of years without something happening to them.
“relative to the early theory that was put forward” should have read “relative to a random early theory given that it was consistent with the evidence”.
I wrote:
To the extent that predictions 11-20 and 21 are generated by different independent “parts” of the theory, the quality of the former part is evidence about the quality of the latter part via the theorist’s competence.
...however, this is much less true of cases like Newton or GR where you can’t change a small part of the theory without changing all the predictions, than it is of cases like “evolution theory is true and by the way general relativity is also true”, which is really two theories, or cases like “Newton is true on weekdays and GR on weekends”, which is a bad theory.
So I think that to first order, Peter’s answer is still right; and moreover, I think it can be restated from Occam to Bayes as follows:
Experiments 11-20 have given the late theorizer more information on what false theories are consistent with the evidence, but they have not given the early theorizer any usable information on what false theories are consistent with the evidence. Experiments 11-20 have also given the late theorizer more information on what theories are consistent with the evidence, but this does not help the late theorizer relative to the early theorizer, whose theory after all turned out to be consistent with the evidence. So experiments 11-20 made it more likely for a random false late theory to be consistent with the evidence, relative to a random false early theory; but they did not make it more likely for a random late theory to be consistent with the evidence, relative to the early theory that was put forward. Therefore, according to some Bayes math that I’m too lazy to do, it must be the case that there are more false theories among late theories consistent with the evidence, than among early theories consistent with the evidence.
Does this make sense? I think I will let it stand as my final answer, with the caveat about theories with independent parts predicting different experiments, in which case our new information about the theorists matters.
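A toy simulation of the “Bayes math that I’m too lazy to do” (the prior on a theory being true and the chance that a false theory fits any single experiment are made-up modeling assumptions):

```python
import random

# Toy model, all numbers made up: a theorist proposes a theory that fits all the
# experiments they have seen; it is the true theory with probability P_TRUE, and
# a false theory fits any further experiment with probability P_PASS_FALSE.
P_TRUE = 0.1
P_PASS_FALSE = 0.5

def fraction_true(out_of_sample, trials=100_000):
    # Among theories still consistent with experiments 1-20, the fraction that are
    # true, given how many of those experiments were out of sample for the theorist.
    hits = total = 0
    for _ in range(trials):
        is_true = random.random() < P_TRUE
        survives = is_true or all(random.random() < P_PASS_FALSE
                                  for _ in range(out_of_sample))
        if survives:
            total += 1
            hits += is_true
    return hits / total

# Early theorist: proposed after experiments 1-10, so 11-20 were out of sample.
# Late theorist: proposed after experiments 1-20, so nothing was out of sample.
print("early:", fraction_true(10))  # close to 1: few false theories survive 10 tests
print("late: ", fraction_true(0))   # about P_TRUE: no filtering of false theories
```

The early number comes out near 1 and the late number near the prior, matching the verbal argument that false theories are more common among late theories consistent with the evidence.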
(“21-30” should read “21”, “then the” should read “the”)
Actually I’d like to take back my last comment. To the extent that predictions 11-20 and 21-30 are generated by different independent “parts” of the theory, then the quality of the former part is evidence about the quality of the latter part via the theorist’s competence.
Peter de Blanc got it right, IMHO. I don’t agree with any of the answers that involve inference about the theorists themselves; they each did only one thing, so it is not the case that you can take one thing they did as evidence for the nature of some other thing they did.
Awesome post, but somebody should do the pessimist version, rewriting various normal facets of the human condition as horrifying angsty undead curses.