Assigning zero probability to claims is bad because then one can’t ever update to accept the claim no matter what evidence one has. Moreover, this doesn’t seem to have much to do with “infinite claims” given that there are claims involving infinity that you would probably accept. For example, if we got what looked like a working Theory of Everything that implied that the universe is infinite, you’d probably assign a non-zero probability to the universe being infinite. You can’t assign all hypotheses involving infinity zero probability if you want to be able to update to include them.
Suppose I randomly pick a coin from all of Coinspace and flip it. What probability do you assign to the coin landing heads? Probably around 1⁄2.
Now suppose I do the same thing, but pick N coins and flip them all. The probability that they all come up heads is roughly 1/2^N.
Suppose I halt time to allow this experiment to continue as long as we want, then keep flipping coins randomly picked from Coinspace until I get a tail. What is the probability I will never get a tail? It should be the limit of 1/2^N as N goes to infinity, which is 0. Events with probability of 0 are allowed—indeed, expected—when you are dealing with infinite probability spaces such as this one.
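Spelled out, assuming each coin drawn from Coinspace is fair and the flips are independent:

$$P(\text{first } N \text{ flips all heads}) = \left(\tfrac{1}{2}\right)^N, \qquad P(\text{never a tail}) = \lim_{N\to\infty}\left(\tfrac{1}{2}\right)^N = 0.$$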
It’s also not true that we can’t ever update if our prior probability for something is 0. It is just that we need infinite evidence, which is a scary way of saying that the probability of receiving said evidence is also 0. For instance, if you flip coins infinitely many times, and I observe all but the first 10 and never see “tails” (which has a probability of 0 of happening), then my belief that all the coins landed “heads” has gone up from 0 to 1/2^10 = 1/1024.
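The update in that example, again assuming independent fair flips: seeing that flips 11, 12, 13, … all came up heads tells us nothing about the first 10 flips, so

$$P(\text{all heads} \mid \text{flips } 11, 12, \ldots \text{ all heads}) = P(\text{first 10 flips all heads}) = \left(\tfrac{1}{2}\right)^{10} = \tfrac{1}{1024}.$$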
There are only countably many hypotheses that one can consider. In the coin-flip context, as you’ve constructed the probability space, there are uncountably many possible results. If one presumes that there’s really a Turing-computable (or even just explicitly definable in some axiomatic framework like ZFC) set of possibilities for the behavior of the coin, then there are only countably many, each with a non-zero probability.
Obviously, this in some respects makes the math much ickier, so for most purposes it is more helpful to assume that the coin is really random.
Note also that your updating took an infinite amount of evidence (since you observed all but the first 10 flips). So it is at least fair to say that if one assigns probability zero to something then one can’t ever update in finite time, which is about as bad as not being able to update.
I introduced the concept of Coinspace to make it clear that all the coin flips are independent of each other: if I were actually flipping a single coin, I would assign it a nonzero (though very small) probability of never landing “tails”. Possibly I should have just stated the independence assumption directly.
And yes, I agree that if we postulate a finite-time condition, then P(X) = 0 means one can’t ever update on X. However, in the context of this post, we don’t have a finite-time condition: God-TimFreeman explicitly needs to stop time in order to flip the coin as many times as necessary. Once we have that, we need to be able to assign probabilities of 0 to events that almost never happen.
Assigning zero probability to claims is bad because then one can’t ever update to accept the claim no matter what evidence one has.
I linked to the article expressing that view. It makes a valid point.
Moreover, this doesn’t seem to have much to do with “infinite claims” given that there are claims involving infinity that you would probably accept.
I am not saying anything about all claims involving infinity. I am addressing the particular claim in the original post.
Yes, assigning GOD(infinity) a probability of zero means that no finite amount of evidence will shift that. For this particular infinite claim I don’t see a problem with that.
Thoroughgoing rejection of 0 and 1 as probabilities means that you have to assign a positive value to P(A & ~A). You also have to reject real-valued variables—the probability of a randomly thrown dart hitting a particular number on the real line is zero. Unless you can actually do these things—actually reconstruct probability theory in a way that makes P(A|B) and P(~A|B) sum to less than 1, and prohibit uncountable measure spaces—claiming that you should do them anyway is to make the real insight of Eliezer’s article into an empty slogan.
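The dart example in symbols: for a random variable $X$ with a continuous density $f$,

$$P(X = x) = \int_x^x f(t)\,dt = 0 \quad \text{for every individual point } x,$$

even though the dart certainly lands somewhere.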
Yes, assigning GOD(infinity) a probability of zero means that no finite amount of evidence will shift that. For this particular infinite claim I don’t see a problem with that.
So how do you determine which claims you are giving a prior probability of zero and which you don’t?
Thoroughgoing rejection of 0 and 1 as probabilities means that you have to assign a positive value to P(A & ~A).
This connects to a deep open problem: how do we assign probabilities to the chance that we’ve made a logical error or miscalculated? However, even if one is willing to assign zero probability to events that contain inherent logical contradictions, that’s not at all the same as assigning zero probability to a claim about the empirical world.
However, even if one is willing to assign zero probability to events that contain inherent logical contradictions, that’s not at all the same as assigning zero probability to a claim about the empirical world.
If claims about the empirical world can have arbitrarily small probability, then a suitable infinite conjunction of such claims has probability zero, just as surely as P(A&~A) does.
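Concretely: take empirical claims $A_1, A_2, \ldots$ with $P(A_n) \le 2^{-n}$. Then for every $n$,

$$P\!\left(\bigcap_k A_k\right) \le P(A_n) \le 2^{-n},$$

so the infinite conjunction has probability exactly zero.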
So how do you determine which claims you are giving a prior probability of zero and which you don’t?
For Pascal’s Mugging scenarios it just seems a reasonable thing to do. Gigantic promises undermine their own credibility, converging to zero in the limit. I don’t have a formally expressed rule, but if I were going to work on decision theory, I’d look into the possibility of codifying that intuition as an axiom.
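One way that intuition might be codified (only a sketch, not a worked-out axiom): require the credibility of a promised payoff to fall faster than the payoff grows,

$$\lim_{N\to\infty} N \cdot P(\text{a promise of utility } N \text{ is honoured}) = 0,$$

so that the mugger’s offer contributes vanishing expected utility as the promise is inflated.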
Yes, assigning GOD(infinity) a probability of zero means that no finite amount of evidence will shift that. For this particular infinite claim I don’t see a problem with that.
What if we came up with a well-evidenced theory of everything that implied GOD(infinity)?
So how do you determine which claims you are giving a prior probability of zero and which you don’t?
For Pascal’s Mugging scenarios it just seems a reasonable thing to do.
It’s not just contrived scenarios; see http://arxiv.org/abs/0712.4318. If utility is unbounded, infinitely many hypotheses can result in utility higher than N for any N.
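A toy illustration of the divergence (this is not de Blanc’s construction; the 2^-k prior and 3^k payoffs are invented purely to show the effect): a prior that shrinks geometrically cannot tame payoffs that grow faster, so the expected-utility partial sums never settle down.

```python
# Toy divergence: prior weights decay like 2^-k, payoffs grow like 3^k,
# so each term of the expected-utility sum is (3/2)^k and the sum blows up.
def partial_expected_utility(n_hypotheses):
    total = 0.0
    for k in range(1, n_hypotheses + 1):
        prior = 2.0 ** -k        # hypothetical complexity-style prior weight
        payoff = 3.0 ** k        # hypothetical unbounded payoff of hypothesis k
        total += prior * payoff  # contributes (3/2)^k
    return total

for n in (10, 20, 40):
    print(n, partial_expected_utility(n))  # grows without bound as n increases
```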
Unless you can actually do these things—actually reconstruct probability theory in a way that makes P(A|B) and P(~A|B) sum to less than 1, and prohibit uncountable measure spaces—claiming that you should do them anyway is to make the real insight of Eliezer’s article into an empty slogan.
How is this any different than saying “until you can actually make unbounded utility functions converge properly as discussed in Peter de Blanc’s paper, using expected utility maximization is an empty slogan”?
How is this any different than saying “until you can actually make unbounded utility functions converge properly as discussed in Peter de Blanc’s paper, using expected utility maximization is an empty slogan”?
I’m not convinced by expected utility maximization either, and I can see various possible ways around de Blanc’s argument besides bounding utility, but those are whole nother questions.
ETA: Also, if someone claims their utility function is bounded, does that mean they’re attaching probability zero to it being unbounded? If they attach non-zero probability, they run into de Blanc’s argument, and if they attach zero, they’ve just used zero as a probability. Or is having a probability distribution over what one’s utility function actually is too self-referential? But if you can’t do that, how can you model uncertainty about what your utility function is?
I’m not convinced by expected utility maximization either,
Do you reject the VNM axioms? I have my own quibbles with them—I don’t like the way they just assume that probability exists and is a real number, and I don’t like axiom 3 because it rules out unbounded utility functions—but they do apply in some contexts.
I can see various possible ways around de Blanc’s argument besides bounding utility, but those are whole nother questions.
Can you elaborate on these?
how can you model uncertainty about what your utility function is?
There is no good theory of this yet. One wild speculation is to model each possible utility function as a separate agent and have them come to an agreement. Unfortunately, there is no good theory of bargaining yet either.
I can see various possible ways around de Blanc’s argument besides bounding utility, but those are whole nother questions.
Can you elaborate on these?
Not with any great weight; it’s just a matter of looking at each hypothesis and thinking up a way of making it fail.
Maybe utility isn’t bounded below by a computable function (and a fortiori is not itself computable). That might be unfortunate for the would-be utility maximizer, but if that’s the way it is, too bad.
Or—this is a possibility that de Blanc himself mentions in the 2009 version—maybe the environment should not be allowed to range over all computable functions. That seems quite a strong possibility to me. Known physical bounds on the density of information processing would appear to require it. Of course, those bounds apply equally to the utility function, which might open the way for a complexity-bounded version of the proof of bounded utility.
Maybe utility isn’t bounded below by a computable function (and a fortiori is not itself computable). That might be unfortunate for the would-be utility maximizer, but if that’s the way it is, too bad.
Good point, but I find it unlikely.
Or—this is a possibility that de Blanc himself mentions in the 2009 version—maybe the environment should not be allowed to range over all computable functions. That seems quite a strong possibility to me. Known physical bounds on the density of information processing would appear to require it.
This requires assigning zero probability to the hypothesis that there is no limit on the density of information processing.
I’m not convinced by expected utility maximization either,
Do you reject the VNM axioms?
I don’t see any reason to dispute Axioms 2 (transitivity) and 4 (independence of alternatives), although I know some people dispute Axiom 4.
For Axiom 3 (continuity), I don’t have an argument against it, but it feels a bit dodgy to me. The lack of inferential distance between the construction of lotteries and the conclusion of the theorem gives me the impression of begging the question. But that isn’t my main problem with the axioms.
The sticking point for me is Axiom 1, the totality of the preference relation. Why should an ideal rational agent, whatever that is, have a preference—even one of indifference—between every possible pair of alternatives?
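For reference, the four axioms as numbered in this thread, stated for lotteries $L$, $M$, $N$:

Axiom 1 (totality): $L \preceq M$ or $M \preceq L$ for every pair of lotteries.
Axiom 2 (transitivity): if $L \preceq M$ and $M \preceq N$, then $L \preceq N$.
Axiom 3 (continuity): if $L \preceq M \preceq N$, then $pL + (1-p)N \sim M$ for some $p \in [0,1]$.
Axiom 4 (independence): $L \preceq M$ if and only if $pL + (1-p)N \preceq pM + (1-p)N$ for every $N$ and every $p \in (0,1]$.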
“An ideal rational agent, whatever that is.” Does the concept of an ideal rational agent make sense, even as an idealisation? An ideal rational agent, as described by the VNM axioms, cannot change its utility function. It cannot change its ultimate priors. These are simply what they are and define that agent. It is logically omniscient and can compute anything computable in constant time. What is this concept useful for?
It’s the small world/large world issue again. In small situations, such as industrial process control, that are readily posed as optimisation problems, the VNM axioms are trivially true. This is what gives them their plausibility. In large situations, constructing a universal utility function is as hard a problem as constructing a universal prior.
The sticking point for me is Axiom 1, the totality of the preference relation. Why should an ideal rational agent, whatever that is, have a preference—even one of indifference—between every possible pair of alternatives?
How would it act if asked to choose between two options that it does not have a preference between?
An ideal rational agent, as described by the VNM axioms, cannot change its utility function. It cannot change its ultimate priors.
It can; it just would not want to, ceteris paribus.
What is this concept useful for?
It is a starting point (well, a middle point). I see no reason to change my utility function or my priors; almost by definition, I do not desire such changes. Infinite computational ability is an approximation to be corrected in the future, as is, IMO, VNM axiom 3. This is what we have so far, and we are working on improving it.
How would it act if asked to choose between two options that it does not have a preference between?
The point is that there will be options that it could never be asked to choose between.
What is this concept useful for?
It is a starting point (well, a middle point).
I become less and less convinced that utility maximisation is a useful place to start. An ideal rational agent must be an idealisation of real, imperfectly rational agents—of us, that is. What can I do with a preference between steak and ice cream? Sometimes one of those will satisfy a purpose for me and sometimes the other; most of the time neither is in my awareness at all. I do not need to have a preference, even between such everyday things, because I will never be faced with a choice between them. So I find the idea of a universal preference uncompelling.
When faced with practical trolley problems, the practical rational first response is not to weigh the two offered courses of action, but to look for other alternatives. They don’t always exist, but they have to be looked for. Hard-core Bayesian utility maximisation requires a universal prior that automatically thinks of all possible alternatives. I am not yet persuaded (e.g. by AIXI) that a practical implementation of such a prior is possible.
How would it act if asked to choose between two options that it does not have a preference between?
The point is that there will be options that it could never be asked to choose between.
Does this involve probabilities of zero or just ignoring sufficiently unlikely events?
What can I do with a preference between steak and ice cream? Sometimes one of those will satisfy a purpose for me and sometimes the other; most of the time neither is in my awareness at all. I do not need to have a preference, even between such everyday things, because I will never be faced with a choice between them.
I’m not sure I understand this; is this a choice between objects or between outcomes? If it is between outcomes, it can occur. If it is between objects, it is not the kind of thing described by the frameworks that we are discussing, since it is not actually a choice that anyone makes; one may choose for an object to exist or to be possessed, but it is a category error to choose an object (though that phrase can be used as shorthand for a different type of choice, and I think it is clear what it means).
Does this involve probabilities of zero or just ignoring sufficiently unlikely events?
I don’t think there’s any way to avoid probabilities of zero. Even the Solomonoff universal prior assigns zero probability to uncomputable hypotheses. And you never have probabilities at the meta-level, which is always conducted in the language of plain old logic.
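For reference, the universal prior weights each hypothesis by the lengths of the programs that generate it, roughly

$$M(x) = \sum_{p \,:\, U(p) = x*} 2^{-|p|},$$

where $U$ is a universal prefix machine and the sum runs over programs whose output begins with $x$; anything that no program can generate receives no weight at all.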
What can I do with a preference between steak and ice cream? …
I’m not sure I understand this; is this a choice between objects or between outcomes? If it is between outcomes, it can occur.
Between outcomes. How is this choice going to occur?
More generally, what is an outcome? In large-world reasoning, it seems to me that an outcome cannot be anything less than the entire history of one’s forward light-cone, or in TDT something even larger. Those are the things you are choosing between, when you make a choice. Decision theory on that scale is very much a work in progress, which I’m not going to scoff at, but I have low expectations of AGI being developed on that basis.
There are people working on this. EY explained his position here.

However, that is somewhat tangential. Are you proposing that decision making should be significantly altered by ignoring certain computable hypotheses—since Solomonoff induction, despite its limits, does manifest this problem—in order to make utility functions converge? That sounds horribly ad-hoc (see second paragraph of this).
In large-world reasoning, it seems to me that an outcome cannot be anything less than the entire history of one’s forward light-cone, or in TDT something even larger. Those are the things you are choosing between, when you make a choice.
I agree.
Decision theory on that scale is very much a work in progress, which I’m not going to scoff at, but I have low expectations of AGI being developed on that basis.
Any decision process that does not explicitly mention outcomes is only useful insofar as its outputs are correlated with our actual desires, which are about outcomes. If outcomes are not part of an AGI’s decision process, they are therefore still necessary for the design of the AGI. They are probably also necessary for the AGI to know which self-modifications are justified, since we cannot foresee which modifications could at some point be considered.
Are you proposing that decision making should be significantly altered by ignoring certain computable hypotheses—since Solomonoff induction, despite its limits, does manifest this problem—in order to make utility functions converge? That sounds horribly ad-hoc (see second paragraph of this).
If I were working on that, I could say it was being worked on. I agree that an ad-hoc hack is not what’s called for. It needs to be a principled hack. :-)
Any decision process that does not explicitly mention outcomes is only useful insofar as its outputs are correlated with our actual desires, which are about outcomes.
Are they really? That is, about outcomes in the large-world sense we just agreed on. Ask people what they want, and few will talk about the entire future history of the universe, even if you press them to go farther than what they want right now. I’m sure Eliezer would, and others operating in that sphere of thought, including many on LessWrong, but that is a rather limited sense of “us”.
Can you come up with a historical example of a mathematical or scientific problem being solved—not made to work for some specific purpose, but solved completely—with a principled hack?
I’m sure Eliezer would, and others operating in that sphere of thought, including many on LessWrong, but that is a rather limited sense of “us”.
I don’t see your point. Other people don’t care about outcomes, but a) their extrapolated volitions probably do, and b) if people’s extrapolated volitions don’t care about outcomes, I don’t think I’d want to use them as the basis of an FAI.
Can you come up with a historical example of a mathematical or scientific problem being solved—not made to work for some specific purpose, but solved completely—with a principled hack?
Limited comprehension in ZF set theory is the example I had in mind in coining the term “principled hack”. Russell said to Frege, “What about the set of sets not members of themselves?”, whereupon Frege was embarrassed, and eventually a way was found of limiting self-reference enough to avoid the contradiction. There’s a principle there—unrestricted self-reference can’t be done—but all the methods of limiting self-reference that have yet been devised look like hacks. They work, though. ZF appears to be consistent, and all of mathematics can be expressed in it. As a universal language, it completely solves the problem of formalising mathematics.
(I am aware that there are mathematicians who would disagree with that triumphalist claim, but as far as I know none of them are mainstream.)
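The hack in question, for concreteness: unrestricted comprehension says that for any formula $\varphi$ there is a set $\{x : \varphi(x)\}$, and taking $\varphi(x)$ to be $x \notin x$ yields Russell’s paradox. ZF’s separation schema instead only lets you carve subsets out of sets you already have, one instance per formula $\varphi$:

$$\forall z\, \exists y\, \forall x\, \bigl(x \in y \leftrightarrow (x \in z \wedge \varphi(x))\bigr),$$

which blocks the paradoxical set while leaving ordinary constructions intact.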
Being a mathematician who at least considers himself mainstream, I would think that ZFC plus the existence of a large cardinal is probably the minimum one would need to express a reasonable fragment of mathematics.
If you can’t talk about the set of all subsets of the set of all subsets of the real numbers, I think analysis would become a bit… bondage and discipline.
Surely the power set axiom gets you that?

That it exists, yes. But what good is that without choice?

Ok, ZFC is a more convenient background theory than ZF (although I’m not sure where it becomes awkward to do without choice). That’s still short of needing large cardinal axioms.
The idea of programming ZF into an AGI horrifies my aesthetics, but that is no reason not to use it (well, it is an indication that it might not be a good idea, but in this specific case ZF does have the evidence on its side). If expected utility, or anything else necessary for an AGI, could benefit from a principled hack as well-tested as limited comprehension, I would accept it.
if we got what looked like a working Theory of Everything that implied that the universe is infinite, you’d probably assign a non-zero probability to the universe being infinite.
The hypothesis that the universe is infinite is equivalent to the hypothesis that no matter how far you travel (in a straight line through space), you can be infinitely certain that it won’t take you someplace you’ve been. Convincing you that the universe is infinite should be roughly as hard as convincing you that there’s zero probability that the universe is infinite, because they’re both claims of infinite certainty in something. (I think.)
I’d like to be able to boil that down to “infinite claims require infinite evidence”, but it seems to be not quite true.