Do you assign literally zero probability to the simulation hypothesis? Because in-universe irreducible things are possible, conditional on it being true.
Assigning a slightly-too-high prior is a recoverable error: evidence will push you towards a nearly-correct posterior. For an AI with enough info-gathering capabilities, evidence will push it there so fast that you could assign a prior of .99 to “the sky is orange” and it would still figure out the truth in an instant. Assigning a literally zero prior is a fatal flaw that can’t be recovered from by gathering evidence.
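To make the asymmetry concrete, here is a minimal sketch of repeated Bayesian updating on a binary hypothesis. The function names and the likelihood values (0.01 and 0.99) are illustrative assumptions, not anything measured; the only point is that a prior of .99 collapses after a handful of observations, while a prior of exactly 0 never moves.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """One application of Bayes' rule for a binary hypothesis H."""
    numerator = prior * likelihood_if_true
    evidence = numerator + (1.0 - prior) * likelihood_if_false
    return numerator / evidence


def posterior_after(prior, n_observations, likelihood_if_true, likelihood_if_false):
    """Apply the same observation n times, feeding each posterior back in as the prior."""
    p = prior
    for _ in range(n_observations):
        p = update(p, likelihood_if_true, likelihood_if_false)
    return p


# H = "the sky is orange", badly overweighted at 0.99.
# Assumed likelihoods: a blue-looking sky is seen with probability 0.01 if H, 0.99 if not-H.
overconfident = posterior_after(0.99, 5, 0.01, 0.99)

# H' = "the sky is blue", given a prior of exactly 0.
# Even 1000 observations that favor H' 99:1 leave the posterior at 0.
dogmatic = posterior_after(0.0, 1000, 0.99, 0.01)

print(overconfident, dogmatic)
```

The zero case fails for a purely arithmetic reason: the prior multiplies the likelihood in the numerator, so no finite amount of evidence can lift it.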
It’s very possible that what’s possible for AIs should be a proper subset of what’s possible for humans. Or, to put it less counter-intuitively: The AI’s hypothesis space might need to be more restrictive than our own. (Plausibly, it will be more restrictive in some ways, less in others; e.g., it can entertain more complicated propositions than we can.)
On my view, the reason for that isn’t ‘humans think silly things, haha look how dumb they are, we’ll make our AI smarter than them by ruling out the dumbest ideas a priori’. If we give the AI silly-looking hypotheses with reasonable priors and reasonable bridge rules, then presumably it will just update to demote the silly ideas and do fine; so a priori ruling out the ideas we don’t like isn’t an independently useful goal. For superficially bizarre ideas that are actually at least somewhat plausible, like ‘there are Turing-uncomputable processes’ or ‘there are uncountably many universes’, this is just extra true. See my response to koko.
Instead, the reason AIs may need restrictive hypothesis spaces is that building a self-correcting epistemology is harder than living inside of one. We need to design a prior that’s simple enough for a human being (or somewhat enhanced human, or very weak AI) to evaluate its domain-general usefulness. That’s tough, especially if ‘domain-general usefulness’ requires something like an infinite-in-theory hypothesis space. We need a way to define a prior that’s simple and uniform enough for something at approximately human-level intelligence to assess and debug before we deploy it. But that’s likely to become increasingly difficult the more bizarre we allow the AI’s ruminations to become.
‘What are the properties of square circles? Could the atoms composing brains be made of tiny partless mental states? Could the atoms composing wombats be made of tiny partless wombats? Is it possible that colorless green ideas really do sleep furiously?’
All of these feel to me, a human (of an unusually philosophical and not-especially-positivistic bent), like they have a lot more cognitive content than ‘Is it possible that flibbleclabble?’. I could see philosophers productively debating ‘does the nothing noth?’, and vaguely touching on some genuinely substantive issues. But to the extent those issues are substantive, they could probably be better addressed with a formalization that’s a lot less colorful and strange, and disposes of most of the vaguenesses and ambiguities of human language and thought.
An example of why we might need to simplify and precisify an AI’s hypotheses is Kolmogorov complexity. K-complexity provides a very simple and uniform method for assigning a measure to hypotheses, out of which we might be able to construct a sensible, converges-in-bounded-time-upon-reasonable-answers prior that can be vetted in advance by non-superintelligent programmers.
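As a rough illustration of what ‘a simple and uniform method for assigning a measure’ can look like, the toy sketch below weights each hypothesis by 2^-(description length) and normalizes. This is only a stand-in: real K-complexity is defined relative to a universal machine and is not computable, and the hypothesis names and bit-string encodings here are hand-picked assumptions.

```python
from fractions import Fraction


def length_prior(descriptions):
    """Weight each hypothesis by 2^-len(description), then normalize to sum to 1."""
    raw = {name: Fraction(1, 2 ** len(bits)) for name, bits in descriptions.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


# Hypothetical encodings: shorter bit strings stand for simpler hypotheses.
toy_hypotheses = {
    "constant": "01",
    "alternating": "0011",
    "lookup_table": "0011010111",
}

prior = length_prior(toy_hypotheses)
for name, p in prior.items():
    print(name, p)
```

The appeal for vetting is that the whole rule fits in one line: a reviewer can check the weighting scheme without having to inspect each hypothesis individually.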
But K-complexity only works for computable hypotheses. So it suddenly becomes very urgent that we figure out how likely we think it is that the AI will run into uncomputable scenarios, figure out how well/poorly an AI without any way of representing uncomputable hypotheses would do in various uncomputable worlds, and figure out whether there are alternatives to K-complexity that generalize in reasonable, simple-enough-to-vet ways to wider classes of hypothesis.
This is not a trivial mathematical task, and it seems very likely that we’ll only have the time and intellectual resources to safely generalize AI hypothesis spaces in some ways before the UFAI clock strikes 0. We can’t generalize the hypothesis space in every programmable-in-principle way, so we should prioritize the generalizations that seem likely to actually make a difference in the AI’s decision-making, and that can’t be delegated to the seed AI in safe and reliable ways.
How would you tell if the simulation hypothesis is a good model? How would you change your behavior if it were? If the answers are “there is no way” or “do nothing differently”, then it is as good as assigning zero probability to it.
If it’s a perfect simulation with no deliberate irregularities, and no dev-tools, and no pattern-matching functions that look for certain things and exert influences in response, or anything else of that ilk, you wouldn’t expect to see any supernatural phenomena, of course.
If you observe magic or something else that’s sufficiently improbable given known physical laws, you’d of course update in favor of someone trying to trick you, or of you misunderstanding something, but you’d also update at least slightly in favor of hypotheses in which magic can exist, such as simulation, aliens, or a huge conspiracy. If you assigned zero prior probability to such a hypothesis, you couldn’t update in that direction at all.
As for what would raise the simulation hypothesis relative to non-simulation hypotheses that explain supernatural things, I don’t know. Look at the precise conditions under which supernatural phenomena occur, see if they fit a pattern you’d expect an intelligence to devise? See if they can modify universal constants?
As for what you could do if you discovered a non-reductionist effect: if it seems sufficiently safe, take advantage of it; if it’s dangerous, ignore it or try to keep other people from discovering it; if you’re an AI, try to break out of the universe-box (or do whatever), I guess. Try to use the information to increase your utility.