And since we happen to live in a world made of physics, the kind of monist we want in practice is a reductive physicalist AI. We want a ‘physicalist’ as opposed to a reductive monist that thinks everything is made of monads, or abstract objects, or morality fluid, or what-have-you.
This may be nitpicky, but I’d like our AI to leave open the possibility of a non-physical ontology. We don’t yet know that our world is made of physics. Even though it seems like it is. An analogy: It would be bad to hard-code our AI to have an ontology of wave-particles, since things might turn out to be made of strings/branes. So we shouldn’t rule out other possibilities either.
I’m not sure what you have in mind when you say ‘non-physical ontology’. Physics at this point is pretty well empirically confirmed, so it doesn’t seem likely we’ll discover it’s All A Lie tomorrow. On the other hand, you might have in mind a worry like:
How much detail of our contemporary scientific world-view is it safe to presuppose in building the AI, without our needing to seriously worry that tomorrow we’ll have a revolution in physics that’s outside of our AI’s hypothesis space?
In particular: Might we discover that physics as we know it is a high-level approximation of a mathematical structure that looks nothing like physics as we know it?
To what extent is it OK if the world turns out to be non-computable but the AI can only hypothesize computable environments?
These are all very serious, and certainly not nitpicky. My last couple of posts in this sequence will be about the open problem ‘Given that we want our AGI’s hypotheses to look like immersive worlds rather than like communicating programs, how do we formalize “world”?’ If we were building this thing at the turn of the 20th century, we might have assumed that it was safe to build ‘made of atoms’ into our conception of ‘physical’, and let the AI only think in terms of configurations of atoms. What revisable assumptions about the world might be in the background of our current thinking, that we ought to have the AI treat as revisable hypotheses and not as fixed axioms?
The worry I had in mind is pretty well captured by your three bullet points there, though I think you are phrasing it in a weaker way than it deserves. Consider the Simulation Hypothesis combined with the hypothesis that the higher-level universe running the simulation does not follow rules remotely like those of modern physics. If that is true, then an AI hard-coded to consider only “physical” theories will be seriously handicapped.
I’m not sure what you mean by (paraphrase) ‘we want our AI to be a reductive physicalist monist.’ I worried that you meant something like “We want our AI to be incapable of assigning any probability whatsoever to the existence of abstract objects, monads, or for that matter anything that doesn’t look like the stuff physicists would talk about.” It is quite possible that you meant something much less strong, in which case I was just being nitpicky about your language. If you truly meant that, though, then I think I’m raising a serious issue here.
By ‘non-physical ontology’ I meant mainly (a) an ontology that is radically different from modern physics, but also (b) in particular, an ontology that involves monads, or ideas, or abstract objects. (I exclude morality fluid because I’m pretty sure you just made that up to serve as an example of ridiculousness. The other options are not ridiculous though. Not that I know much about monads.)
I worried that you meant something like “We want our AI to be incapable of assigning any probability whatsoever to the existence of abstract objects, monads, or for that matter anything that doesn’t look like the stuff physicists would talk about.”
What I meant was a conjunctive claim: ‘We want our AI’s beliefs to rapidly approach the truth’, and ‘the truth probably looks reasonably similar to contemporary physical theory’. I think it’s an open question how strict ‘reasonably similar’ is, but the three examples I gave are very plausibly outside that category.
However, I independently suspect that an FAI won’t be able to hypothesize all three of those things. That’s not a requirement for naturalized agents; a naturalized agent should in principle be able to hypothesize anything a human or Cartesian can and do fine, by having vanishingly small priors for a lot of the weirder ideas. But I suspect that in practice it won’t be pragmatically important to make the AI’s hypothesis space that large. And I also suspect that it would be too difficult and time-consuming for us to formalize ‘monad’ and ‘morality fluid’ and assign sensible priors to those formalizations. See my response to glomerulus.
So, ‘assign 0 probability to those hypotheses’ isn’t part of what I mean by ‘physicalist’, but it’s not at all implausible that that’s the sort of thing human beings need to do in order to build a working, able-to-be-vetted superintelligent physicalist. Being unable to think about false things (or a fortiori not-even-false things) can make an agent converge upon the truth faster and with less chance of getting stuck in an epistemic dead end.
(Edit: And the agent will still be able to predict our beliefs about incoherent things; our brains are computable, even if some of the objects of our thoughts are not.)
I exclude morality fluid because I’m pretty sure you just made that up to serve as an example of ridiculousness.
? Why exactly is it sillier to think our universe is made of morality-stuff than to think our universe is made of mind-stuff? Is it because morality is more abstract than mind-stuff? But abstract objects are too, presumably… I wasn’t being entirely serious, no, but now I’m curious about your beliefs about morality.
What I meant was a conjunctive claim: ‘We want our AI’s beliefs to rapidly approach the truth’, and ‘the truth probably looks reasonably similar to contemporary physical theory’
Then I agree with you. This was all a misunderstanding. Read my original comment as a nitpick about your choice of words, then.
...
The truth does probably look reasonably similar to contemporary physical theory, but we can handle that by giving the AI the appropriate priors. We don’t need to make it actually rule stuff out entirely, even though it would probably work out OK if we did.
I don’t think it would be that difficult for us to formalize “monad.” Monads are actually pretty straightforward as I understand them. Ideas would be harder. At any rate, I don’t think we need to formalize lots of different fundamental ontologies and have it choose between them. Instead, all we need to do is formalize a general open-mindedness towards considering different ontologies. I admit this may be difficult, but it seems doable. Correct me if I’m wrong.
? Why exactly is it sillier to think our universe is made of morality-stuff than to think our universe is made of mind-stuff?
I didn’t exclude morality fluid because I thought it was sillier; I excluded it because I thought it wasn’t even a thing. You might as well have said “aslkdj theory” and then challenged me to explain why “aslkdj theory” is sillier than monads or ideas. It’s an illegitimate challenge, since you don’t mean anything by “aslkdj theory.” By contrast, there are actual bodies of literature on idealism and on monads, so it is legitimate to ask me what I think about them.
To put it another way: He who introduces a term decides what that term means. “Monads” and “Ideas,” having been introduced by very smart, thoughtful people and discussed by hundreds more, definitely are meaningful, at least meaningful enough to talk about. (Meaningfulness comes in degrees.) If we talk about morality fluid, which I suspect is something you made up, then we rely on whatever meaning you assigned to it when you made it up—but since you (I suspect) assigned no meaning to it, we can’t even talk about it.
EDIT: So, in conclusion, if you tell me what morality fluid means, then I’ll tell you what I think about it.
Ah, OK. What I mean by ‘the world is made of morality’ is that physics reduces to (is fully, accurately, parsimoniously, asymmetrically explainable in terms of) some structure isomorphic to the complex machinery we call ‘morality’. For example, it turns out that the mathematical properties of human-style Fairness are what explain the mathematical properties of dark energy or quantum gravity.
This doesn’t necessarily mean that the universe is ‘fair’ in any intuitive sense, though karmic justice might be another candidate for an unphysicalistic hypothesis. It’s more like the hypothesis that a simulation deity created our moral intuitions, then built our universe out of the patterns in that moral code. Like a somewhat less arbitrary variant on ‘I’m going to use a simple set of letter-to-note transition rules to convert the works of Shakespeare into a new musical piece’.
I think this view is fully analogous to idealism. If it makes complete sense to ask whether our world is made of mental stuff, it can’t be because our mental stuff is simultaneously a complex human brain operation and an irreducible simple; rather, it’s because the complex human brain operation could have been a key ingredient in the laws and patterns of our universe, especially if some god or simulator built our universe.
I don’t think we need to formalize lots of different fundamental ontologies and have it choose between them. Instead, all we need to do is formalize a general open-mindedness towards considering different ontologies. I admit this may be difficult, but it seems doable. Correct me if I’m wrong.
I don’t think I know enough to correct you. But I can express my doubts. I suspect ‘a general open-mindedness towards considering different ontologies’ can’t be formalized, or can’t be both formalized and humanly vetted. At a minimum, we’ll need to decide what gets to count as an ‘ontology’, which means drawing the line somewhere and declaring everything outside a certain set of boundaries nonsensical. And I’m skeptical that there’s any strongly principled way to determine that ‘colorless green ideas sleep furiously’ is contentless or nonsensical or ‘non-ontological’, while ‘the world is made of partless fundamental ideas’ is contentful and meaningful and picks out an ontology.
(Which doesn’t mean I think we should be rude or dismissive toward idealists in ordinary conversation. We should be very careful not to conflate the question ‘what questions should we treat with respect or inquire into in human social settings’ with the question ‘what questions should we program a Friendly AI to be able to natively consider’.)
Thanks for that explanation of mental stuff. My opinion? Sounds implausible, but fine, in the sense that we shouldn’t build our AI in a way that makes it incapable of considering that hypothesis. As an aside, I think it is less plausible than idealism, because it lacks the main cluster of motivations for idealism. The whole point of idealism is to be monist (and thus achieve ontological parsimony) whilst also “taking consciousness seriously.” As seriously as possible, in fact. Perhaps more seriously than is necessary, but anyhow that’s the appeal. Morality fluid takes morals seriously (maybe? Maybe not, actually, given your construction) but it doesn’t take consciousness any more seriously than physicalism, it seems. And, I think, it is more important that our theories take consciousness seriously than that they take morality seriously.
I suspect ‘a general open-mindedness towards considering different ontologies’ can’t be formalized, or can’t be both formalized and humanly vetted.
Humans do it. If intelligent humans can consider a hypothesis, an AI should be able to as well. In most cases it will quickly realize a hypothesis is silly or even self-contradictory, but at least it should be able to give such hypotheses an honest try, rather than classify them as nonsense from the beginning.
At a minimum, we’ll need to decide what gets to count as an ‘ontology’, which means drawing the line somewhere and declaring everything outside a certain set of boundaries nonsensical.
Doesn’t seem too difficult to me. It isn’t really an ontology/non-ontology distinction we are looking for, but a “hypothesis about the lowest level of description of the world / not that” distinction. Since the hypothesis itself states whether or not it is about the lowest level of description of the world, really all this comes down to is the distinction between a hypothesis and something other than a hypothesis. Right?
My general idea is, we don’t want to make our AI more limited than ourselves. In fact, we probably want our AI to reason “as we wish we ourselves would reason.” You don’t wish you were incapable of considering idealism, do you? If you do, why?
… Are you claiming not only that the world is dualistic, but also that souls belong not just to humans, but to any AI we program in enough detail that the ontology we give it matters? Or that there exist metaphysical souls that are not computable, but that you expect an AI lacking one to understand and act appropriately toward? just… wut?
I don’t think that’s what they’re saying at all. I think they mean: don’t hardcode a physics understanding into the AI the way humans have a hardcoded intuition for Newtonian physics, because our current understanding of the universe isn’t so strong that we can be confident we’re not missing something. So it should be able to figure out the mechanism by which its map is written on the territory, and update its map of its map accordingly.
E.g., in case it thinks it’s flipping qubits to store memory, and defends its databases accordingly, but actually qubits aren’t the lowest level of abstraction and it’s really wiggling a hyperdimensional membrane in a way that makes it behave like qubits under most circumstances; or in case the universe isn’t 100% reductionistic and some psychic comes along and messes with its mind using mystical woo-woo. (The latter being incredibly unlikely, but hey, might as well have an AI that can prepare itself for anything.)
Oh. OH. Yeah, that makes more sense, and is so obviously true that I didn’t even consider the hypothesis that someone’d feel the need to say it; but in hindsight I was wrong, and it’s probably a good thing someone did.
in case the universe isn’t 100% reductionistic and some psychic comes along and messes with its mind using mystical woo-woo. (The latter being incredibly unlikely, but hey, might as well have an AI that can prepare itself for anything.)
This isn’t a free lunch; letting the AI form really weird hypotheses might be a bad idea, because we might give those weird hypotheses the wrong prior. Non-reductive hypotheses, and especially non-Turing-computable non-reductive hypotheses, might not admit complexity penalties in any of the obvious or intuitive ways we assign them to absurd physical or computable hypotheses.
It could be a big mistake if we gave the AI a really weird formalism for thinking thoughts like ‘the irreducible witch down the street did it’ and assigned a slightly-too-high prior probability to at least one of those non-reductive or non-computable hypotheses.
Do you assign literally zero probability to the simulation hypothesis? Because in-universe irreducible things are possible, conditional on it being true.
Assigning a slightly-too-high prior is a recoverable error: evidence will push you towards a nearly-correct posterior. For an AI with enough info-gathering capability, the evidence will push it there fast enough that you could assign a prior of .99 to “the sky is orange” and it would still figure out the truth in an instant. Assigning a literally zero prior is a fatal flaw that can’t be recovered from by gathering evidence.
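A toy sketch of that asymmetry (my own illustration; the 90%-reliable observations and the specific hypotheses are made up, and a real AI would not do inference this crudely): a badly miscalibrated prior gets washed out after a few observations, while a prior that puts literally zero probability on the truth never moves.

```python
# Toy Bayesian update. Hypothesis: "the sky is orange"; alternative: "the sky is blue".
# Each observation reports the sky's colour correctly with 90% reliability.

def update(p_orange: float, observed_blue: bool, reliability: float = 0.9) -> float:
    """Return the posterior probability of 'orange' after one observation."""
    p_obs_given_orange = (1 - reliability) if observed_blue else reliability
    p_obs_given_blue = reliability if observed_blue else (1 - reliability)
    evidence = p_obs_given_orange * p_orange + p_obs_given_blue * (1 - p_orange)
    return p_obs_given_orange * p_orange / evidence

# A prior of .99 on the false hypothesis, vs. a prior of 1.0 on it
# (i.e. literally zero probability assigned to the true hypothesis).
for prior in (0.99, 1.0):
    p = prior
    for _ in range(10):  # ten honest observations of a blue sky
        p = update(p, observed_blue=True)
    print(f"prior P(orange) = {prior:.2f}  ->  posterior after 10 observations = {p:.8f}")
```

The first run collapses to roughly 3e-8; the second stays pinned at 1.0 no matter how much evidence comes in.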
It’s very possible that the hypotheses available to AIs should be a proper subset of those available to humans. Or, to put it less counter-intuitively: the AI’s hypothesis space might need to be more restrictive than our own. (Plausibly, it will be more restrictive in some ways and less in others; e.g., it can entertain more complicated propositions than we can.)
On my view, the reason for that isn’t ‘humans think silly things, haha look how dumb they are, we’ll make our AI smarter than them by ruling out the dumbest ideas a priori’. If we give the AI silly-looking hypotheses with reasonable priors and reasonable bridge rules, then presumably it will just update to demote the silly ideas and do fine; so a priori ruling out the ideas we don’t like isn’t an independently useful goal. For superficially bizarre ideas that are actually at least somewhat plausible, like ‘there are Turing-uncomputable processes’ or ‘there are uncountably many universes’, this is just extra true. See my response to koko.
Instead, the reason AIs may need restrictive hypothesis spaces is that building a self-correcting epistemology is harder than living inside of one. We need to design a prior that’s simple enough for a human being (or somewhat enhanced human, or very weak AI) to evaluate its domain-general usefulness. That’s tough, especially if ‘domain-general usefulness’ requires something like an infinite-in-theory hypothesis space. We need a way to define a prior that’s simple and uniform enough for something at approximately human-level intelligence to assess and debug before we deploy it. But that’s likely to become increasingly difficult the more bizarre we allow the AI’s ruminations to become.
‘What are the properties of square circles? Could the atoms composing brains be made of tiny partless mental states? Could the atoms composing wombats be made of tiny partless wombats? Is it possible that colorless green ideas really do sleep furiously?’
All of these feel to me, a human (of an unusually philosophical and not-especially-positivistic bent), like they have a lot more cognitive content than ‘Is it possible that flibbleclabble?’. I could see philosophers productively debating ‘does the nothing noth?’, and vaguely touching on some genuinely substantive issues. But to the extent those issues are substantive, they could probably be better addressed with a formalization that’s a lot less colorful and strange, and that disposes of most of the vagueness and ambiguity of human language and thought.
An example of why we might need to simplify and precisify an AI’s hypotheses is Kolmogorov complexity. K-complexity provides a very simple and uniform method for assigning a measure to hypotheses, out of which we might be able to construct a sensible, converges-in-bounded-time-upon-reasonable-answers prior that can be vetted in advance by non-superintelligent programmers.
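As a concrete (and deliberately crude) illustration of what such a measure looks like, here is a toy length-based prior in the spirit of K-complexity. Everything in it is made up for the example: true Kolmogorov complexity is uncomputable, so the length of a candidate model’s description stands in for the length of its shortest program.

```python
# Toy 'simplicity prior': weight each candidate world-model by 2^-(description length),
# then normalize. Character count is a crude stand-in for the length of the shortest
# program, which is what K-complexity actually measures.

def length_prior(hypotheses: dict) -> dict:
    """Map each named hypothesis to a prior proportional to 2 ** -len(description)."""
    weights = {name: 2.0 ** -len(code) for name, code in hypotheses.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical candidate 'world programs' (short strings standing in for computable
# models of the environment). Anything uncomputable simply can't appear in this table.
candidates = {
    "simple_physics": "x += v",
    "baroque_physics": "x += v; v = -v if day % 7 == 0 else v",
}
print(length_prior(candidates))  # the baroque model gets a far smaller prior
```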
But K-complexity only works for computable hypotheses. So it suddenly becomes very urgent that we figure out how likely we think it is that the AI will run into uncomputable scenarios, how well or poorly an AI with no way of representing uncomputable hypotheses would do in various uncomputable worlds, and whether there are alternatives to K-complexity that generalize in reasonable, simple-enough-to-vet ways to wider classes of hypotheses.
This is not a trivial mathematical task, and it seems very likely that we’ll only have the time and intellectual resources to safely generalize AI hypothesis spaces in some ways before the UFAI clock strikes 0. We can’t generalize the hypothesis space in every programmable-in-principle way, so we should prioritize the generalizations that seem likely to actually make a difference in the AI’s decision-making, and that can’t be delegated to the seed AI in safe and reliable ways.
How would you tell if the simulation hypothesis is a good model? How would you change your behavior if it were? If the answers are “there is no way” or “do nothing differently”, then it is as good as assigning zero probability to it.
If it’s a perfect simulation with no deliberate irregularities, and no dev-tools, and no pattern-matching functions that look for certain things and exert influences in response, or anything else of that ilk, you wouldn’t expect to see any supernatural phenomena, of course.
If you observe magic or something else that’s sufficiently highly improbable given known physical laws, you’d update in favor of someone trying to trick you, or you misunderstanding something, of course, but you’d also update at least slightly in favor of hypotheses in which magic can exist. Such as simulation, aliens, huge conspiracy, etc. If you assigned zero prior probability to it, you couldn’t update in that direction at all.
As for what would raise the simulation hypothesis relative to non-simulation hypotheses that explain supernatural things, I don’t know. Look at the precise conditions under which supernatural phenomena occur, see if they fit a pattern you’d expect an intelligence to devise? See if they can modify universal constants?
As for what you could do, if you discovered a non-reductionist effect? If it seems sufficiently safe take advantage of it, if it’s dangerous ignore it or try to keep other people from discovering it, if you’re an AI try to break out of the universe-box (or do whatever), I guess. Try to use the information to increase your utility.