I think this system may have the following problem: It implicitly assumes that you can take a kind of random sample that in fact you can’t.
You want to evaluate universes by “how would I feel about being in this universe?”, which I think means either something like “suppose I were a randomly chosen subject-of-experiences in this universe, what would my expected utility be?” or “suppose I were inserted into a random place in this universe, what would my expected utility be?”. (Where “utility” is shorthand for your notion of “life satisfaction”, and you are welcome to insist that it be bounded.)
But in a universe with infinitely many—countably infinitely many, presumably—subjects-of-experiences, the first involves an action equivalent to picking a random integer. And in a universe of infinite size (and with a notion of space at least a bit like ours), the second involves an action equivalent to picking a random real number.
And there’s no such thing as picking an integer, or a real number, uniformly at random.
This is essentially the same as the “infinitarian paralysis” problem. Consider two universes, each with a countable infinity of happy people and a countable infinity of unhappy people (and no other subjects of experience, somehow). In the first, all the people were generated with a biased coin-flip that picks “happy” 99.9% of the time. In the second, the same except that their coin picks “unhappy” 99.9% of the time. We’d like to be able to say that the first option is better than the second, but we can’t, because actually with probability 1 these two universes are equivalent in the sense that with probability 1 they both have infinitely many happy and infinitely many unhappy people, and we can simply rearrange them to turn one of those universes into the other. Which is one way of looking at why there’s no such operation as “pick a random integer”, because if there were then surely picking a random person from universe 1 gets you a happy person with probability 0.999 and picking a random person from universe 2 gets you a happy person with probability 0.001.
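(If it helps to see the “rearrangement” point concretely, here is a small toy sketch in Python. It only looks at a finite prefix of the two infinite universes, and the seed and sample size are made up purely for illustration.)

```python
import random

# Toy finite prefix of the two infinite universes: universe A's coin picks
# "happy" with probability 0.999, universe B's with probability 0.001.
random.seed(0)
N = 1_000_000
universe_a = ["happy" if random.random() < 0.999 else "unhappy" for _ in range(N)]
universe_b = ["happy" if random.random() < 0.001 else "unhappy" for _ in range(N)]

# Both prefixes already contain lots of people of each kind; in the infinite
# limit, both kinds occur infinitely often in both universes with probability 1.
for name, u in [("A", universe_a), ("B", universe_b)]:
    print(name, u.count("happy"), u.count("unhappy"))

# A "rearrangement" is just a bijection pairing the k-th happy person of A
# with the k-th happy person of B, and likewise for the unhappy people.
# Over the whole infinite population this pairing misses no one, which is
# why the two universes contain exactly the same experiences, rearranged.
happy_a = [i for i, x in enumerate(universe_a) if x == "happy"]
happy_b = [i for i, x in enumerate(universe_b) if x == "happy"]
print(list(zip(happy_a, happy_b))[:5])  # finite prefix of that bijection
```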
When you have infinitely many things, you may find yourself unable to say meaningfully whether there’s more positive or more negative there, and that isn’t dependent on adding up the positives and negatives and getting infinite amounts of goodness or badness. You are entirely welcome to say that in our hypothetical universe there are no infinite utilities anywhere, that we shouldn’t be trying to compute anything like “the total utility”, and that’s fine, but you still have the problem that e.g. you can’t say “it’s a bad thing to take 1000 of the happy people and make them unhappy” if what you mean by that is that it makes for a worse universe, because the modified universe is isomorphic to the one you started with.
It’s not a distribution over agents in the universe, it’s a distribution over possible agents in possible universes. The possible universes can be given usual credence-based weightings based on conditional probability given the moral agent’s observations and models, because what else are they going to base anything on?
If your actions make 1000 people unhappy, and presumably some margin “less satisfied” in some hypothetical post-mortem universe rating, the idea seems to be that you first estimate how much less satisfied they would be. Then the novel (to me) part of this idea is that you multiply this by the estimated fraction of all agents, in all possible universes weighted by credence, who would be in your position. Being a fraction, there is no unboundedness involved. The fraction may be extremely small, but should always be nonzero.
As I see it the exact fraction you estimate doesn’t actually matter, because all of your options have the same multiplier and you’re evaluating them relative to each other. However this multiplier is what gives ethical decisions nonzero effect even in an infinite universe, because there will only be finitely many ethical scenarios of any given complexity.
So it’s not just “make 1000 happy people unhappy”, it’s “the 1 in N people with similar incentives as me in a similar situation would each make 1000 happy people unhappy”, resulting in a net loss of 1000/N of universal satisfaction. N may be extremely large, but it’s not infinite.
How is it a distribution over possible agents in possible universes (plural) when the idea is to give a way of assessing the merit of one possible universe?
I do agree that an ideal consequentialist deciding between actions should consider for each action the whole distribution of possible universes after they do it. But unless I’m badly misreading the OP, I don’t see where it proposes anything like what you describe. It says—emphasis in all cases mine, to clarify what bits I think indicate that a single universe is in question—”… but you still knew you would be born into this universe”, and “Imagine hypothetically telling an agent everything significant about the universe”, and “a prior over situations in the universe you could be born into”, and “my ethical system provides a function mapping from possible worlds to their moral value”, and “maximize the expected value of your life satisfaction given you are in this universe”, and “The appeal of aggregate consequentialism is that its defines some measure of “goodness” of a universe”, and “the moral value of the world”, and plenty more.
Even if somehow this is what OP meant, though—or if OP decides to embrace it as an improvement—I don’t see that it helps at all with the problem I described; in typical cases I expect picking a random agent in a credence-weighted random universe-after-I-do-X to pose all the same difficulties as picking a random agent in a single universe-after-I-do-X. Am I missing some reason why the former would be easier?
(Assuming you’ve read my other response to this comment:)
I think it might help if I give a more general explanation of how my moral system can be used to determine what to do. This is mostly taken from the article, but it’s important enough that I think it should be restated.
Suppose you’re considering taking some action that would benefit our world or future life cone. You want to see what my ethical system recommends.
Well, for almost all of the possible circumstances an agent could end up in within this universe, I think your action would have effectively no causal or acausal effect on them. There’s nothing you can do about them, so don’t worry about them in your moral deliberation.
Instead, consider agents of the form, “some agent in an Earth-like world (or in the future light-cone of one) with someone just like <insert detailed description of yourself and circumstances>”. These are agents you can potentially (acausally) affect. If you take an action to make the world a better place, that means the other people in the universe who are very similar to you and in very similar circumstances would also take that action.
So if you take that action, then you’d improve the world, so the expected value of life satisfaction of an agent in the above circumstances would be higher. Such circumstances are of finite complexity and not ruled out by evidence, so the probability of an agent ending up in such a situation, conditioning only on being in this universe, is non-zero. Thus, taking that action would increase the moral value of the universe, and my ethical system would be liable to recommend taking it.
To see it another way, moral deliberation with my ethical system works as follows:
I’m trying to make the universe a better place. Most agents are in situations in which I can’t do anything to affect them, whether causally or acausally. But there are some agents in situations that I can (acausally) affect. So I’m going to focus on making the universe as satisfying as possible for those agents, using some impartial weighting over those possible circumstances.
Your comments are focusing on (so to speak) the decision-theoretic portion of your theory, the bit that would be different if you were using CDT or EDT rather than something FDT-like. That isn’t the part I’m whingeing about :-). (There surely are difficulties in formalizing any sort of FDT, but they are not my concern; I don’t think they have much to do with infinite ethics as such.)
My whingeing is about the part of your theory that seems specifically relevant to questions of infinite ethics, the part where you attempt to average over all experience-subjects. I think that one way or another this part runs into the usual average-of-things-that-don’t-have-an-average sort of problem which afflicts other attempts at infinite ethics.
As I describe in another comment, the approach I think you’re taking can move where that problem arises but not (so far as I can currently see) make it actually go away.
How is it a distribution over possible agents in possible universes (plural) when the idea is to give a way of assessing the merit of one possible universe?
I do think JBlack understands the idea of my ethical system and is using it appropriately.
My system provides a method of evaluating the moral value of a specific universe. The point for moral agents is to try to make the universe one that scores highly on this moral valuation. But we don’t know exactly what universe we’re in, so to make decisions, we need to consider all universes we could be in, and then take the action that maximizes the expected moral value of the universe we’re actually in.
For example, suppose I’m considering pressing a button that will either make everyone very slightly happier, or make everyone extremely unhappy. I don’t actually know which universe I’m in, but I’m 60% sure I’m in the one that would make everyone happy. Then if I press the button, there’s a 40% chance that the universe would end up with very low moral value. That means pressing the button would, in expectation, decrease the moral value of the universe, so my moral system would recommend not pressing it.
Even if somehow this is what OP meant, though—or if OP decides to embrace it as an improvement—I don’t see that it helps at all with the problem I described; in typical cases I expect picking a random agent in a credence-weighted random universe-after-I-do-X to pose all the same difficulties as picking a random agent in a single universe-after-I-do-X. Am I missing some reason why the former would be easier?
I think to some extent you may be over-thinking things. I agree that it’s not completely clear how to compute P(“I’m satisfied” | “I’m in this universe”). But to use my moral system, I don’t need a perfect, rigorous solution to this, nor am I trying to propose one.
I think the ethical system provides reasonably straightforward moral recommendations in the situations we could actually be in. I’ll give an example of such a situation that I hope is illuminating. It’s paraphrased from the article.
Suppose you have the ability to create safe AI and are considering whether my moral system recommends doing so. And suppose that if you create safe AI everyone in your world will be happy, and if you don’t then the world will be destroyed by evil rogue AI.
Consider an agent that knows it will be in this universe, but nothing else. Well, consider the circumstances, “I’m an agent in an Earth-like world that contains someone who is just like gjm and in a very similar situation who has the ability to create safe AI”. The above description has finite description length, and the agent has no evidence ruling it out. So it must have some non-zero probability of ending up in such a situation, conditioning on being somewhere in this universe.
All the gjms have the same knowledge and values and are in pretty much the same circumstances. So their actions are logically constrained to be the same as yours. Thus, if you decide to create the AI, you are acausally determining the outcome of arbitrary agents in the above circumstances, by making such an agent end up satisfied when they otherwise wouldn’t have been. Since an agent in this universe has non-zero probability of ending up in those circumstances, by choosing to make the safe AI you are increasing the moral value of the universe.
As I said to JBlack, so far as I can tell none of the problems I think I see with your proposal become any easier to solve if we switch from “evaluate one possible universe” to “evaluate all possible universes, weighted by credence”.
to use my moral system, I don’t need a perfect, rigorous solution to this
Why not?
Of course you can make moral decisions without going through such calculations. We all do that all the time. But the whole issue with infinite ethics—the thing that a purported system for handling infinite ethics needs to deal with—is that the usual ways of formalizing moral decision processes produce ill-defined results in many imaginable infinite universes. So when you propose a system of infinite ethics and I say “look, it produces ill-defined results in many imaginable infinite universes”, you don’t get to just say “bah, who cares about the details?” If you don’t deal with the details you aren’t addressing the problems of infinite ethics at all!
It’s nice that your system gives the expected result in a situation where the choices available are literally “make everyone in the world happy” and “destroy the world”. (Though I have to confess I don’t think I entirely understand your account of how your system actually produces that output.) We don’t really need a system of ethics to get to that conclusion!
What I would want to know is how your system performs in more difficult cases.
We’re concerned about infinitarian paralysis, where we somehow fail to deliver a definite answer because we’re trying to balance an infinite amount of good against an infinite amount of bad. So far as I can see, your system still has this problem. E.g., if I know there are infinitely many people with various degrees of (un)happiness, and I am wondering whether to torture 1000 of them, your system is trying to calculate the average utility in an infinite population, and that simply isn’t defined.
So, I think this is what you have in mind; my apologies if it was supposed to be obvious from the outset.
We are doing something like Solomonoff induction. The usual process there is that your prior says that your observations are generated by a computer program selected at random, using some sort of prefix-free code and generating a random program by generating a random bit-string. Then every observation updates your distribution over programs via Bayes, and once you’ve been observing for a while your predictions are made by looking at what all those programs would do, with probabilities given by your posterior. So far so good (aside from the fact that this is uncomputable).
But what you actually want (I think) isn’t quite a probability distribution over universes; you want a distribution over experiences-in-universes, and not your experiences but those of hypothetical other beings in the same universe as you. So now think of the programs you’re working with as describing not your experiences necessarily but those of some being in the universe, so that each update is weighted not by Pr(I have experience X | my experiences are generated by program P) but by Pr(some subject-of-experience has experience X | my experiences are generated by program P), with the constraint that it’s meant to be the same subject-of-experience for each update. Or maybe by Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P) with the same constraint.
So now after all your updates what you have is a probability distribution over generators of experience-streams for subjects in your universe.
When you consider a possible action, you want to condition on that in some suitable fashion, and exactly how you do that will depend on what sort of decision theory you’re using; I shall assume all the details of that handwaved away, though again I think they may be rather difficult. So now you have a revised probability distribution over experience-generating programs.
And now, if everything up to this point has worked, you can compute (well, you can’t because everything here is uncomputable, but never mind) an expected utility because each of our programs yields a being’s stream of experiences, and modulo some handwaving you can convert that into a utility, and you have a perfectly good probability distribution over the programs.
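(Here, for concreteness, is a tiny finite caricature of that last step in Python; the “programs”, likelihoods and utilities are all made up, and of course the real thing would be uncomputable.)

```python
# Tiny finite caricature of the (uncomputable) procedure: a handful of
# candidate "programs", each with a universal-prior-style weight
# 2^-(program length), a likelihood of producing the observations so far,
# and a utility for the experience-stream it generates.
programs = [
    # (program_length_bits, Pr(observations | program), utility of stream)
    (3, 0.50, 0.8),
    (4, 0.10, 0.3),
    (6, 0.90, 0.1),
]

# Bayes: posterior weight is proportional to prior weight times likelihood.
unnormalised = [(2.0 ** -length) * likelihood for length, likelihood, _ in programs]
z = sum(unnormalised)
posterior = [w / z for w in unnormalised]

# Expected utility over the posterior distribution of experience-generators.
expected_utility = sum(p * u for p, (_, _, u) in zip(posterior, programs))
print(posterior, expected_utility)
```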
And (I think) I agree that here if we consider either “torture 1000 people” or “don’t torture 1000 people” it is reasonable to expect that the latter will genuinely come out with a higher expected utility.
OK, so in this picture of things, what happens to my objections? They apply now to the process by which you are supposedly doing your Bayesian updates on experience. Because (I think) now you are doing one of two things, neither of which need make sense in a world with infinitely many beings in it.
If you take the “Pr(some subject-of-experience has experience X)” branch: here the problem is that in a universe with infinitely many beings, these probabilities are likely all 1 and therefore you never actually learn anything when you do your updating.
If you take the “Pr(a randomly chosen subject-of-experience has experience X)” branch: here the problem is that there’s no such thing as a randomly chosen subject-of-experience. (More precisely, there are any number of ways to choose one at random, and I see no grounds for preferring one over another, and in particular neither a uniform nor a maximum entropy distribution exists.)
The latter is basically the same problem as I’ve been complaining about before (well, it’s sort of dual to it, because now we’re looking at things from the perspective of some possibly-other experiencer in the universe, and you are the randomly chosen one). The former is a different problem but seems just as difficult to deal with.
Of course you can make moral decisions without going through such calculations. We all do that all the time. But the whole issue with infinite ethics—the thing that a purported system for handling infinite ethics needs to deal with—is that the usual ways of formalizing moral decision processes produce ill-defined results in many imaginable infinite universes. So when you propose a system of infinite ethics and I say “look, it produces ill-defined results in many imaginable infinite universes”, you don’t get to just say “bah, who cares about the details?” If you don’t deal with the details you aren’t addressing the problems of infinite ethics at all!
Well, I can’t say I exactly disagree with you here.
However, I want to note that this isn’t a problem specific to my ethical system. It’s true that in order to use my ethical system to make precise moral verdicts, you need to more fully formalize probability theory. However, the same is also true with effectively every other ethical theory.
For example, consider someone learning about classical utilitarianism and its applications in a finite world. Then they could argue:
Okay, I see your ethical system says to make the balance of happiness to unhappiness as high as possible. But how am I supposed to know what the world is actually like and what the effects of my actions are? Do other animals feel happiness and unhappiness? Is there actually a heaven and Hell that would influence moral choices? This ethical system doesn’t answer any of this. You can’t just handwave this away! If you don’t deal with the details you aren’t addressing the problems of ethics at all!
Also, I just want to note that my system as described seems to be unique among the infinite ethical systems I’ve seen in that it doesn’t make obviously ridiculous moral verdicts. Every other one I know of makes some recommendations that seem really silly. So, despite not providing a rigorous formalization of probability theory, I think my ethical system has value.
But what you actually want (I think) isn’t quite a probability distribution over universes; you want a distribution over experiences-in-universes, and not your experiences but those of hypothetical other beings in the same universe as you. So now think of the programs you’re working with as describing not your experiences necessarily but those of some being in the universe, so that each update is weighted not by Pr(I have experience X | my experiences are generated by program P) but by Pr(some subject-of-experience has experience X | my experiences are generated by program P), with the constraint that it’s meant to be the same subject-of-experience for each update. Or maybe by Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P) with the same constraint.
Actually, no, I really do want a probability distribution over what I would experience, or more generally, the situations I’d end up being in. The alternatives you mentioned,
Pr(some subject-of-experience has experience X | my experiences are generated by program P) and Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P), both lead to problems for the reasons you’ve already described.
I’m not sure what made you think I didn’t mean, P(I have experience x | …). Could you explain?
We’re concerned about infinitarian paralysis, where we somehow fail to deliver a definite answer because we’re trying to balance an infinite amount of good against an infinite amount of bad. So far as I can see, your system still has this problem. E.g., if I know there are infinitely many people with various degrees of (un)happiness, and I am wondering whether to torture 1000 of them, your system is trying to calculate the average utility in an infinite population, and that simply isn’t defined.
My system doesn’t compute the average utility of anything. Instead, it tries to compute the expected value of utility (or life satisfaction). I’m sorry if this was somehow unclear. I didn’t think I ever mentioned I was dealing with averages anywhere, though. I’m trying to get better at writing clearly, so if you remember what made you think this, I’d appreciate hearing.
I’ll begin at the end: What is “the expected value of utility” if it isn’t an average of utilities?
You originally wrote:
suppose you had no idea which agent in the universe it would be, what circumstances you would be in, or what your values would be, but you still knew you would be born into this universe. Consider having a bounded quantitative measure of your general satisfaction with life, for example, a utility function. Then try to make the universe such that the expected value of your life satisfaction is as high as possible if you conditioned on you being an agent in this universe, but didn’t condition on anything else.
What is “the expected value of your life satisfaction [] conditioned on you being an agent in this universe but [not] on anything else” if it is not the average of the life satisfactions (utilities) over the agents in this universe?
(The slightly complicated business with conditional probabilities that apparently weren’t what you had in mind were my attempt at figuring out what else you might mean. Rather than trying to figure it out, I’m just asking you.)
I’ll begin at the end: What is “the expected value of utility” if it isn’t an average of utilities?
I’m just using the regular notion of expected value. That is, let P(u) be the probability density of getting utility u. Then the expected value of utility is ∫_a^b u P(u) du, where the integral uses Lebesgue integration for greater generality. Here, I take utility to be in [a, b].
Also note that my system cares about a measure of satisfaction, rather than specifically utility. In this case, just replace P(u) to be that measure of life satisfaction instead of a utility.
Also, of course, P(u) is calculated conditioning on being an agent in this universe, and nothing else.
And how do you calculate P(u) given the above? Well, one way is to first start with some prior probability distribution over disjoint hypotheses about the universe and the situation you could be in, where the situations are concrete enough to determine your eventual life satisfaction. Then just do a Bayes update on “is an agent in this universe and gets utility u” by setting to zero the probabilities of the hypotheses in which the agent isn’t in this universe or doesn’t have preferences. Then just renormalize the probabilities so they sum to 1. After that, you can use this probability distribution over possible worlds W to calculate P(u) in a straightforward manner, e.g. ∫_W P(utility = u | W) dP(W).
(I know I pretty much mentioned the above calculation before, but I thought rephrasing it might help.)
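(Here’s a minimal discrete sketch of that calculation in Python, with made-up hypotheses and numbers purely for illustration; it isn’t meant as a rigorous formalization.)

```python
# Made-up concrete hypotheses: (prior probability, which universe I'm in,
# eventual life satisfaction in [0, 1]).
hypotheses = [
    (0.30, "U1", 0.9),
    (0.20, "U1", 0.2),
    (0.40, "U2", 0.8),
    (0.10, "U2", 0.1),
]

def expected_satisfaction(universe):
    # Bayes update on "I'm an agent in this universe": zero out the
    # hypotheses that place me elsewhere, renormalise, take the expectation.
    kept = [(p, s) for p, u, s in hypotheses if u == universe]
    total = sum(p for p, _ in kept)
    return sum(p * s for p, s in kept) / total

print(expected_satisfaction("U1"))  # (0.3*0.9 + 0.2*0.2) / 0.5 = 0.62
print(expected_satisfaction("U2"))  # (0.4*0.8 + 0.1*0.1) / 0.5 = 0.66
```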
If you are just using the regular notion of expected value then it is an average of utilities. (Weighted by probabilities.)
I understand that your measure of satisfaction need not be a utility as such, but “utility” is shorter than “measure of satisfaction which may or may not strictly speaking be utility”.
Oh, I’m sorry; I misunderstood you. When you said the average of utilities, I thought you meant the utility averaged among all the different agents in the world. Instead, it’s just, roughly, an average over the probability density function of utility. I say roughly because I guess integration isn’t exactly an average.
I think this system may have the following problem: It implicitly assumes that you can take a kind of random sample that in fact you can’t.
You want to evaluate universes by “how would I feel about being in this universe?”, which I think means either something like “suppose I were a randomly chosen subject-of-experiences in this universe, what would my expected utility be?” or “suppose I were inserted into a random place in this universe, what would my expected utility be?”. (Where “utility” is shorthand for your notion of “life satisfaction”, and you are welcome to insist that it be bounded.)
But in a universe with infinitely many—countably infinitely many, presumably—subjects-of-experiences, the first involves an action equivalent to picking a random integer. And in a universe of infinite size (and with a notion of space at least a bit like ours), the second involves an action equivalent to picking a random real number.
And there’s no such thing as picking an integer, or a real number, uniformly at random.
Thank you for the response.
You are correct that there’s no way to form a uniform distribution over the set of all integers or real numbers. And, similarly, you are also correct that there is no way of sampling from infinitely many agents uniformly at random.
Luckily, my system doesn’t require you to do any of these things.
Don’t think about my system as requiring you to pick out a specific random agent in the universe (because you can’t). It doesn’t try to come up with the probability of you being some single specific agent.
Instead, it picks out some description of circumstances an agent could be in, as well as a description of the agent itself. And this, you can do. I don’t think anyone’s completely formalized a way to compute prior probabilities over situations they could end up in. But the basic idea is to take, over the different circumstances, each of finite description length, some complexity-weighted or perhaps uniform distribution.
I’m not entirely sure how to form a probability distribution that include situations of infinite complexity. But it doesn’t seem like you really need to, because, in our universe at least, you can only be affected by a finite region. But I’ve thought about how to deal with infinite description lengths, too, and I can discuss it if you’re interested.
I’ll apply my moral system to the coin flip example. To make it more concrete, suppose there’s some AI that uses a pseudorandom number generator that outputs “heads” or “tails”, and then the AI, having precise control of the environment, makes the actual coin land on heads iff the pseudorandom number generator outputted “heads”. And it does so for each agent and makes them happy if it lands on heads and unhappy if it lands on tails.
Let’s consider the situation in which the pseudorandom number generator says “heads” 99.9% of the time. Well, pseudorandom number generators tend to work by having some (finite) internal seed, then using that seed to pick out a random number in, say, [0, 1]. Then, for the next number, it updates its (still finite) internal state from the initial seed in a very chaotic manner, and then again generates a new number in [0, 1]. And my understanding is that the internal state tends to be uniform in the sense that, on average, each internal state is just as common as any other. I’ll assume this in the following.
If the generator says “heads” 99.9% of the time, then that means that, among the different internal states, 99.9% of them result in the answer being “heads” and 0.1% result in the answer being “tails”.
Suppose you know you’re in this universe, but nothing else. Well, you know you will be in a circumstance in which there is some AI that uses a pseudorandom number generator to determine your life satisfaction, because that’s how it is for everyone in the universe. However, you have no way of knowing the specifics of the internal state of the pseudorandom number generator.
So, to compute the probability of life satisfaction, just take some very high-entropy probability distribution over the internal states, for example, a uniform distribution. Then 99.9% of the internal states would result in you being happy, and only 0.1% would result in you being unhappy. So, using a very high-entropy distribution over internal states would result in you assigning a probability of approximately 99.9% to ending up happy.
Similarly, suppose instead that the generator generates heads only 0.1% of the time. Then only 0.1% of internal states of the pseudorandom number generator would result in it outputting “heads”. Thus, if you use a high-entropy probability distribution over the internal state, you would assign a probability of approximately 0.1% to you being happy.
Thus, if I’m reasoning correctly, the probability of you being satisfied, conditioning only on being in the 99.9%-heads universe, is approximately 99.9%, and the probability of being satisfied in the 0.1%-heads universe is approximately 0.1%. Thus, the former universe would be seen as having more moral value than the latter universe according to my ethical system.
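(As a toy illustration of this calculation, here’s a sketch in Python with a made-up generator that has 1000 internal states, 999 of which lead to “heads”.)

```python
from fractions import Fraction

# Toy generator with a finite internal state space: states 0..999.
# By construction, 999 of the 1000 states make it output "heads",
# and the AI makes the corresponding agent happy on "heads".
states = range(1000)

def output(state):
    return "heads" if state < 999 else "tails"

# Maximum-entropy (here: uniform) distribution over the internal state,
# since we know nothing about which state the generator is actually in.
p_happy = sum(Fraction(1, 1000) for s in states if output(s) == "heads")
print(p_happy)  # 999/1000, i.e. roughly a 99.9% chance of ending up happy
```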
And I hope what I’m saying isn’t too controversial. I mean, in order to reason, there must be some way to assign a probability distribution over situations you end up in, even if you don’t yet have any idea what concrete situation you’ll be in. I mean, suppose you actually learned you were in the 99.9%-heads universe, but knew nothing else. Then it really shouldn’t seem unreasonable that you assign 99.9% probability to ending up happy. I mean, what else would you think?
I don’t think I understand why your system doesn’t require something along the lines of choosing a uniformly-random agent or place. Not necessarily exactly either of those things, but something of that kind. You said, in OP:
suppose you had no idea which agent in the universe it would be, what circumstances you would be in, or what your values would be, but you still knew you would be born into this universe.
How does that cash out if not in terms of picking a random agent, or random circumstances in the universe?
If I understand your comment correctly, you want to deal with that by picking a random description of a situation in the universe, which is just a random bit-string with some constraints on it, which you presumably do in something like the same way as choosing a random program when doing Solomonoff induction: cook up a prefix-free language for describing situations-in-the-universe, generate a random bit-string with each bit equally likely 0 or 1, and see what situation it describes.
But now everything depends on the details of how descriptions map to actual situations, and I don’t see any canonical way to do that or any anything-like-canonical way to do it. (Compare the analogous issue with Solomonoff induction. There, everything depends on the underlying machine, but one can argue at-least-kinda-plausibly that if we consider “reasonable” candidates, the differences between them will quickly be swamped by all the actual evidence we get. I don’t see anything like that happening here. What am I missing?)
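(A trivial sketch of the dependence I mean: two perfectly legitimate prefix-free description languages can assign the same “situation” very different probabilities. The codes here are made up, obviously.)

```python
# Two made-up prefix-free codes over the same three toy "situations".
# The induced probability of a situation is 2^-(length of its codeword),
# i.e. the chance that fair coin flips happen to spell out that codeword.
code_a = {"situation_1": "0",   "situation_2": "10", "situation_3": "11"}
code_b = {"situation_1": "111", "situation_2": "0",  "situation_3": "10"}

def induced_probability(code, situation):
    return 2.0 ** -len(code[situation])

for name, code in [("code A", code_a), ("code B", code_b)]:
    print(name, {s: induced_probability(code, s) for s in code})
# code A gives situation_1 probability 1/2; code B gives it 1/8.
```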
Your example with an AI generating people with a PRNG is, so far as it goes, fine. But the epistemic situation one needs to be in for that example to be relevant seems to me incredibly different from any epistemic situation anyone is ever really in. If our universe is running on a computer, we don’t know what computer or what program or what inputs produced it. We can’t do anything remotely like putting a uniform distribution on the internal states of the machine.
Further, your AI/PRNG example is importantly different from the infinitely-many-random-people example on which it’s based. You’re supposing that your AI’s PRNG has an internal state you can sample from uniformly at random! But that’s exactly the thing we can’t do in the randomly-generated-people example.
Further further, your prescription in this case is very much not the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here doesn’t actually produce an inhabitant of the universe in question. You’re replacing a difficult (indeed impossible) question—“how do things go, on average, for a random person in this universe?”—with an easier but different question—“how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?”. Maybe that’s a reasonable thing to do, but it is not what your theory as originally stated tells you to do and I don’t see any obvious reason why someone who accepted your theory as you originally stated it should behave as you’re now telling them they should.
Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there’s no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, …, then n! (un)happy people, then … . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can’t do. Picking a state at random seems like it means picking a random positive integer, which again you can’t do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn’t actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it’s less than. How does your system evaluate this universe?
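(To see why there’s no sensible “average” to be had here, a quick sketch of the running fraction of happy people as the counter runs; the fraction swings ever closer to 1 and then ever closer to 0, so it has no limit.)

```python
from math import factorial

# Block n of the counter produces n! people: happy when n is even,
# unhappy when n is odd (1 happy, 1 unhappy, 2 happy, 6 unhappy, ...).
happy = 0
total = 0
for n in range(12):
    block = factorial(n)
    if n % 2 == 0:
        happy += block
    total += block
    print(n, happy / total)  # running fraction of happy people so far
# The fraction climbs toward 1 after each even block and falls toward 0
# after each odd block, so "the fraction of happy people" has no limit.
```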
Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 the same universe up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing the same infinitely many people, just arranged differently, one of which is much better than the other. Really?
(Of course, in something that’s less of a toy model, the arrangement of people can matter a lot. It’s nice to be near to friends and far from enemies, for instance. But of course that isn’t what we’re talking about here; when we rearrange the people we do so in a way that preserves all their experiences and their level of happiness.)
It really should seem unreasonable to suppose that in the 99.9% universe there’s a 99.9% chance that you’ll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn’t fully grasped the fact that you can’t sample uniformly at random from an infinite population.
How does that cash out if not in terms of picking a random agent, or random circumstances in the universe?
So, remember, the moral value of the universe according to my ethical system depends on P(I’ll be satisfied | I’m some creature in this universe).
There must be some reasonable way to calculate this. And one that doesn’t rely on impossibly taking a uniform sample from a set that has no uniform distribution. Now, we haven’t fully formalized reasoning and priors yet. But there is some reasonable prior probability distribution over situations you could end up in. And after that you can just do a Bayesian update on the evidence “I’m in universe x”.
I mean, imagine you had some superintelligent AI that takes evidence and outputs probability distributions. And you provide the AI with evidence about what the universe it’s in is like, without letting it know anything about the specific circumstances it will end up in. There must be some reasonable probability for the AI to assign to outcomes. If there isn’t, then that means whatever probabilistic reasoning system the AI uses must be incomplete.
It really should seem unreasonable to suppose that in the 99.9% universe there’s a 99.9% chance that you’ll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn’t fully grasped the fact that you can’t sample uniformly at random from an infinite population.
I’m surprised you said this and interested in why. Could you explain what probability you would assign to being happy in that universe?
I mean, conditioning on being in that universe, I’m really not sure what else I would do. I know that I’ll end up with my happiness determined by some AI with a pseudorandom number generator. And I have no idea what the internal state of the random number generator will be. In Bayesian probability theory, the standard way to deal with this is to take a maximum entropy (i.e. uniform in this case) distribution over the possible states. And such a distribution would imply that I’d be happy with probability 99.9%. So that’s how I would reason about my probability of happiness using conventional probability theory.
Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there’s no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, …, then n! (un)happy people, then … . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can’t do. Picking a state at random seems like it means picking a random positive integer, which again you can’t do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn’t actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it’s less than. How does your system evaluate this universe?
I’m not entirely sure how my system would evaluate this universe, but that’s due to my own uncertainty about what specific prior to use and its implications.
But I’ll take a stab at it. I see the counter alternates through periods of making happy people and periods of making unhappy people. I have no idea which period I’d end up being in, so I think I’d use the principle of indifference to assign probability 0.5 to both. If I’m in the happy period, then I’d end up happy, and if I’m in the unhappy period, I’d end up unhappy. So I’d assign probability approximately 0.5 to ending up happy.
Further further, your prescription in this case is very much not the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here doesn’t actually produce an inhabitant of the universe in question. You’re replacing a difficult (indeed impossible) question—“how do things go, on average, for a random person in this universe?”—with an easier but different question—“how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?”. Maybe that’s a reasonable thing to do, but it is not what your theory as originally stated tells you to do and I don’t see any obvious reason why someone who accepted your theory as you originally stated it should behave as you’re now telling them they should.
Oh, I had in mind that the internal state of the pseudorandom number generator was finite, and that each pseudorandom number generator was only used finitely-many times. For example, maybe each AI on its world had its own pseudorandom number generator.
And I don’t see how else I could interpret this. I mean, if the pseudorandom number generator is used infinitely-many times, then it couldn’t have outputted “happy” 99.9% of the time and “unhappy” 0.1% of the time. With infinitely-many outputs, it would output “happy” infinitely-many times and output “unhappy” infinitely-many times, and thus the proportion it outputs “happy” or “unhappy” would be undefined.
Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 the same universe up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing the same infinitely many people, just arranged differently, one of which is much better than the other. Really?
Yep. And I don’t think there’s any way around this. When talking about infinite ethics, we’ve had in mind a canonically infinite universe: one in which, for every level of happiness, suffering, satisfaction, and dissatisfaction, there exist infinitely many agents with that level. It looks like this is the sort of universe we’re stuck in.
So then there’s no difference in terms of moral value of two canonically-infinite universes except the patterning of value. So if you want to compare the moral value of two canonically-infinite universes, there’s just nothing you can do except to consider the patterning of values. That is, unless you want to consider any two canonically-infinite universes to be of equivalent moral value, which doesn’t seem like an intuitively desirable idea.
The problem with some of the other infinite ethical systems I’ve seen is that they would morally recommend redistributing unhappy agents extremely thinly in the universe, rather than actually try to make them happy, provided this was easier. As discussed in my article, my ethical system provides some degree of defense against this, which seems to me like a very important benefit.
There must be some reasonable way to calculate this.
(where “this” is Pr(I’m satisfied | I’m some being in such-and-such a universe)) Why must there be? I agree that it would be nice if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.
Does whatever argument or intuition leads you to say that there must be a reasonable way to calculate Pr(X is satisfied | X is a being in universe U) also tell you that there must be a reasonable way to calculate Pr(X is even | X is a positive integer)? How about Pr(the smallest n with x ≤ n! is even | x is a positive integer)?
I should maybe be more explicit about my position here. Of course there are ways to give a meaning to such expressions. For instance, we can suppose that the integer n occurs with probability 2^-n, and then e.g. if I’ve done my calculations right then the second probability is the sum of 2^-0! + (2^-2!-2^-3!) + (2^-4!-2^-5!) + … which presumably doesn’t have a nice closed form (it’s transcendental for sure) but can be calculated to high precision very easily. But that doesn’t mean that there’s any such thing as the way to give meaning to such an expression. We could use some other sequence of weights adding up to 1 instead of the powers of 1⁄2, for instance, and we would get a substantially different answer. And if the objects of interest to us were beings in universe U rather than positive integers, they wouldn’t come equipped with a standard order to look at them in.
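(For concreteness, here’s a little sketch that computes Pr(the smallest n with x ≤ n! is even) directly from the definition under two different weightings of the positive integers. The point is just that the two answers differ; the particular second weighting is an arbitrary illustrative choice.)

```python
from math import factorial, pi

def smallest_n(x):
    # The smallest n with x <= n!.
    n = 0
    while factorial(n) < x:
        n += 1
    return n

def prob_smallest_n_is_even(weight, terms=2000):
    # weight(x) should sum (approximately) to 1 over x = 1, 2, 3, ...;
    # we truncate the sum at `terms`, which is plenty for a rough answer.
    return sum(weight(x) for x in range(1, terms + 1) if smallest_n(x) % 2 == 0)

geometric = lambda x: 2.0 ** -x                  # Pr(x) = 2^-x
inverse_square = lambda x: (6 / pi ** 2) / x**2  # Pr(x) proportional to 1/x^2

print(prob_smallest_n_is_even(geometric))       # one answer...
print(prob_smallest_n_is_even(inverse_square))  # ...and a noticeably different one
```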
Why should we expect there to be a well-defined answer to the question “what fraction of these beings are satisfied”?
Could you explain what probability you would assign to being happy in that universe?
No, because I do not assign any probability to being happy in that universe. I don’t know a good way to assign such probabilities and strongly suspect that there is none.
You suggest doing maximum entropy on the states of the pseudorandom random number generator being used by the AI making this universe. But when I was describing that universe I said nothing about AIs and nothing about pseudorandom number generators. If I am contemplating being in such a universe, then I don’t know how the universe is being generated, and I certainly don’t know the details of any pseudorandom number generator that might be being used.
Suppose there is a PRNG, but an infinite one somehow, and suppose its state is a positive integer (of arbitrary size). (Of course this means that the universe is not being generated by a computing device of finite capabilities. Perhaps you want to exclude such possibilities from consideration, but if so then you might equally well want to exclude infinite universes from consideration: a finite machine can’t e.g. generate a complete description of what happens in an infinite universe. If you’re bothering to consider infinite universes at all, I think you should also be considering universes that aren’t generated by finite computational processes.)
Well, in this case there is no uniform prior over the states of the PRNG. OK, you say, let’s take the maximum-entropy prior instead. That would mean (p_k) minimizing ∑ p_k log p_k subject to ∑ p_k = 1. Unfortunately there is no such (p_k). If we take p_k = 1/n for k = 1..n and 0 for larger k, the sum is log 1/n, which → −∞ as n → ∞. In other words, we can make the entropy of (p_k) as large as we please.
You might suppose (arbitrarily, it seems to me) that the integer that’s the state of our PRNG is held in an infinite sequence of bits, and choose each bit at random. But then with probability 1 you get an impossible state of the RNG, and for all we know the AI’s program might look like “if PRNG state is a finite positive integer, use it to generate a number between 0 and 1 and make our being happy if that number is ≤ 0.999; if PRNG state isn’t a finite positive integer, put our being in hell”.
I mean, if the pseudorandom number generator is used infinitely many times, then [...] it would output “happy” infinitely many times and output “unhappy” infinitely many times, and thus the proportion it outputs “happy” or “unhappy” would be undefined.
Yes, exactly! When I described this hypothetical world, I didn’t say “the probability that a being in it is happy is 99.9%”. I said “a biased coin-flip determines the happiness of each being in it, choosing ‘happy’ with probability 99.9%”. Or words to that effect. This is, so far as I can see, a perfectly coherent (albeit partial!) specification of a possible world. And it does indeed have the property that “the probability that a being in it is happy” is not well defined.
This doesn’t mean the scenario is improper somehow. It means that any ethical (or other) system that depends on evaluating such probabilities will fail when presented with such a universe. Or, for that matter, pretty much any universe with infinitely many beings in it.
there’s just nothing you can do except to consider the patterning of values.
But then I don’t see that you’ve explained how your system considers the patterning of values. In the OP you just talk about the probability that a being in such-and-such a universe is satisfied; and that probability is typically not defined. Here in the comments you’ve been proposing something involving knowing the PRNG used by the AI that generated the universe, and sampling randomly from the outputs of that PRNG; but (1) this implies being in an epistemic situation completely unlike any that any real agent is ever in, (2) nothing like this can work (so far as I can see) unless you know that the universe you’re considering is being generated by some finite computational process, and if you’re going to assume that you might as well assume a finite universe to begin with and avoid having to deal with infinite ethics at all, (3) I don’t understand how your “look at the AI’s PRNG” proposal generalizes to non-toy questions, and (4) even if (1-3) are resolved somehow, it seems like it requires a literally infinite amount of computation to evaluate any given universe. (Which is especially problematic when we are assuming we are in a universe generated by a finite computational process.)
You say, “There must be some reasonable way to calculate this.”
(where “this” is Pr(I’m satisfied | I’m some being in such-and-such a universe)) Why must there be? I agree that it would be nice if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.
To use probability theory to form accurate beliefs, we need a prior. I didn’t think this was controversial. And if you have a prior, as far as I can tell, you can then compute Pr(I’m satisfied | I’m some being in such-and-such a universe) by simply updating on “I’m some being in such-and-such a universe” using Bayes’ theorem.
That is, you need to have some prior probability distribution over concrete specifications of the universe you’re in and your situation in it. Now, to update on “I’m some being in such-and-such a universe”, just look at each concrete possible situation-and-universe and set P(“I’m some being in such-and-such a universe” | some concrete hypothesis) to 0 if the hypothesis specifies you’re in some universe other than the such-and-such universe, and set it to 1 if it does specify you are in such a universe. As long as the possible universes are specified sufficiently precisely, I don’t see why you couldn’t do this.
OK, so I think I now understand your proposal better than I did.
So if I’m contemplating making the world be a particular way, you then propose that I should do the following calculation (as always, of course I can’t do it because it’s uncomputable, but never mind that):
Consider all possible computable experience-streams that a subject-of-experiences could have.
Consider them, specifically, as being generated by programs drawn from a universal distribution.
Condition on being in the world that’s the particular way I’m contemplating making it—that is, discard experience-streams that are literally inconsistent with being in that world.
We now have a probability distribution over experience-streams. Compute a utility for each, and take its expectation.
And now we compare possible universes by comparing this expected utility.
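(Here is a toy, heavily hand-waved rendering of that procedure in Python, with a finite stand-in for the universal distribution and made-up utilities, just to fix ideas about what is being computed.)

```python
# Finite stand-in for "experience-streams drawn from a universal distribution":
# each candidate stream gets a prior weight 2^-(description length), a flag for
# whether it is consistent with the world I'm contemplating, and a utility.
streams = [
    # (description_length_bits, consistent_with_world, utility in [0, 1])
    (2, True,  0.9),
    (3, True,  0.4),
    (3, False, 0.1),   # inconsistent streams are simply discarded
    (5, True,  0.0),
]

def evaluate_world(streams):
    # Condition on the world: keep only consistent streams, renormalise the
    # universal-prior-style weights, and take the expected utility.
    kept = [(2.0 ** -length, u) for length, consistent, u in streams if consistent]
    total = sum(w for w, _ in kept)
    return sum(w * u for w, u in kept) / total

print(evaluate_world(streams))  # the number used to compare possible universes
```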
(Having failed to understand your proposal correctly before, I am not super-confident that I’ve got it right now. But let’s suppose I have and run with it. You can correct me if not. In that case, some or all of what follows may be irrelevant.)
I agree that this seems like it will (aside from concerns about uncomputability, and assuming our utilities are bounded) yield a definite value for every possible universe. However, it seems to me that it has other serious problems which stop me finding it credible.
SCENARIO ONE. So, for instance, consider once again a world in which there are exactly two sorts of experience-subject, happy and unhappy. Traditionally we suppose infinitely many of both, but actually let’s also consider possible worlds where there is just one happy experience-subject, or just one unhappy one. All these worlds come out exactly the same, so “infinitely many happy, one unhappy” is indistinguishable from “infinitely many unhappy, one happy”. That seems regrettable, but it’s a bullet I can imagine biting—perhaps we just don’t care at all about multiple instantiations of the exact same stream of experiences: it’s just the same person and it’s a mistake to think of them as contributing separately to the goodness of the universe.
So now let’s consider some variations on this theme.
SCENARIO TWO. Suppose I think up an infinite (or for that matter merely very large) number of highly improbable experience-streams that one might have, all of them unpleasant. And I find a single rather probable experience-stream, a pleasant one, whose probability (according to our universal prior) is greater than the sum of those other ones. If I am contemplating bringing into being a world containing exactly the experience-streams described in this paragraph, then it seems that I should, because the expected net utility is positive, at least if the pleasantness and unpleasantness of the experiences in question are all about equal.
To me, this seems obviously crazy. Perhaps there’s some reason why this scenario is incoherent (e.g., maybe somehow I shouldn’t be able to bring into being all those very unlikely beings, at least not with non-negligible probability, so it shouldn’t matter much what happens if I do, or something), but at present I don’t see how that would work out.
The problem in SCENARIO TWO seems to arise from paying too much attention to the prior probability of the experience-subjects. We can also get into trouble by not paying enough attention to their posterior probability, in some sense.
SCENARIO THREE. I have before me a switch with two positions, placed there by the Creator of the Universe. They are labelled “Nice” and “Nasty”. The CotU explains to me that the creation of future experience-subjects will be controlled by a source of True Randomness (whatever exactly that might be), in such a way that all possible computable experience-subjects have a real chance of being instantiated. The CotU has designed two different prefix-free codes mapping strings of bits to possible experience-subjects; then he has set a Truly Random coin to flip for ever, generating a new experience-subject every time a leaf of the code’s binary tree is reached, so that we get an infinite number of experience-subjects generated at random, with a distribution depending on the prefix-free code being used. The Nice and Nasty settings of the switch correspond to two different codes. The CotU has computed that with the switch in the “Nice” position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the “Nasty” position, it’s large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.
In this case, our conditioning doesn’t remove any possible experience-subjects from consideration, so we are indifferent between the “Nice” and “Nasty” settings of the switch.
This is another one where we might be right to bite the bullet. In the long run infinitely many of every possible experience-subject will be created in each version of the universe, so maybe these two universes are “anagrams” of one another and should be considered equal. So let’s tweak it.
SCENARIO FOUR. Same as in SCENARIO THREE, except that now the CotU’s generator will run until it has produced a trillion experience-subjects and then shut off for ever.
It is still the case that with the switch in either setting any experience-subject is possible, so we don’t get to throw any of them out. But it’s no longer the case that the universes generated in the “Nice” and “Nasty” versions are with probability 1 (or indeed with not-tiny probability) identical in any sense.
So far, these scenarios all suppose that somehow we are able to generate arbitrary sets of possible experience-subjects, and arrange for those to be all the experience-subjects there are, or at least all there are after we make whatever decision we’re making. That’s kinda artificial.
SCENARIO FIVE. Our universe, just as it is now. We assume, though, that our universe is in fact infinite. You are trying to decide whether to torture me to death.
So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don’t. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe. Perhaps in the version of the world where you torture me to death this makes you more likely to do other horrible things, or makes other people who care for me suffer more, but again none of this makes any experiences impossible that would otherwise have been possible, or vice versa. So our universe-evaluator is indifferent between these choices.
(The possibly-overcomplicated business in one of my other comments, where I tried to consider doing something Solomonoff-like using both my experiences and those of some hypothetical possibly-other experience-subject in the world, was intended to address these problems caused by considering only possibility and not anything stronger. I couldn’t see how to make it work, though.)
All these worlds come out exactly the same, so “infinitely many happy, one unhappy” is indistinguishable from “infinitely many unhappy, one happy”
It’s not clear to me how they are indistinguishable. As long as the agent that’s unhappy can have itself and its circumstances described with a finite description length, then there would be a non-zero probability of an agent ending up as that one. Thus, making the agent unhappy would decrease the moral value of the world.
I’m not sure what would happen if the single unhappy agent has infinite complexity and 0 probability. But I suspect that this could be dealt with if you expanded the system to also consider non-real probabilities. I’m no expert on non-real probabilities, but I’d bet the probability of being unhappy, given that there is an unhappy agent, would be infinitesimally higher than in the world in which there are no unhappy agents.
RE: scenario two:
It’s not clear to me how this is crazy. For example, consider this situation: when agents are born, an AI flips a biased coin to determine what will happen to them. Each coin has a 99.999% chance of landing on heads and a 0.001% chance of landing on tails. If the coin lands on heads, the AI will give the agent some very pleasant experience stream, and all such agents will get the same pleasant experience stream. But if it lands on tails, the AI will give the agent some unpleasant experience stream that is also very different from the other unpleasant ones.
This sounds like a pretty good situation to me. It’s not clear to me why it wouldn’t be. I mean, I don’t see why the diversity of the positive experiences matters. And if you do care about the diversity of positive experiences, this would have unintuitive results. For example, suppose all agents have identical preferences and their satisfaction is maximized by experience stream S. Well, if you have a problem with the satisfied agents having just one experience stream, then you would be incentivized to coerce the agents to instead have a variety of different experience streams, even if they didn’t like these experience streams as much.
RE: scenario three:
The CotU has computed that with the switch in the “Nice” position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the “Nasty” position, it’s large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.
I don’t follow your reasoning. You just said that in the “Nice” position the expected utility of an experience-subject is large and positive, and in the “Nasty” position it’s large and negative. And since my ethical system seeks to maximize the expected value of life satisfaction, it seems trivial to me that it would prefer the “Nice” position.
Whether or not you switch it to the “Nice” position won’t rule out any possible outcomes for an agent, but it seems pretty clear that it would change their probabilities.
RE: scenario four:
My ethical system would prefer the “Nice” position for the same reason described in scenario three.
RE: scenario five:
So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don’t. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe.
Though none of the experience streams are impossible, the probability of you getting tortured is still higher conditioned on me deciding to torture you. To see why, consider the situation, “someone just like Slider who is vulnerable to being tortured by demon lord Chantiel”. This has finite description length, and thus non-zero probability. And if I decide to torture you, then the probability of you getting tortured if you end up in this situation is high. Thus, the total expected value of life satisfaction would be lower if I decided to torture you. So my ethical system would recommend not torturing you.
In general, don’t worry about if an experience stream is possible or not. In an infinite universe with quantum noise, I think pretty much all experience streams would occur with non-zero probability. But you can still adjust the probabilities of an agent ending up with the different streams.
It sounds as if my latest attempt at interpreting what your system proposes doing is incorrect, because the things you’re disagreeing with seem to me to be straightforward consequences of that interpretation. Would you like to clarify how I’m misinterpreting now?
Here’s my best guess.
You wrote about specifications of an experience-subject’s universe and situation in it. I mentally translated that to their stream of experiences because I’m thinking in terms of Solomonoff induction. Maybe that’s a mistake.
So let’s try again. The key thing in your system is not a program that outputs a hypothetical being’s stream of experiences, it’s a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that’s probably OK.
So that ought to give a well-defined (modulo the usual stuff about uncomputability) probability distribution over experience-subjects-in-universes. And then you want to condition on “being in a universe with such-and-such characteristics” (which may or may not specify the universe itself completely) and look at the expected utility-or-utility-like-quantity of all those experience-subjects-in-universes after you rule out the universes without such-and-such characteristics.
It’s now stupid-o’-clock where I am and I need to get some sleep. I’m posting this even though I haven’t had time to think about whether my current understanding of your proposal seems like it might work, because on past form there’s an excellent chance that said understanding is wrong, so this gives you more time to tell me so if it is :-). If I don’t hear from you that I’m still getting it all wrong, I’ll doubtless have more to say later...
So let’s try again. The key thing in your system is not a program that outputs a hypothetical being’s stream of experiences, it’s a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that’s probably OK.
That’s closer to what I meant. By “experience-subject”, I think you mean a specific agent at a specific time. If so, my system doesn’t require an unambiguous specification of an experience-subject.
My system doesn’t require you to pinpoint the exact agent. Instead, it only requires you to specify a (reasonably-precise) description of an agent and its circumstances. This doesn’t mean picking out a single agent, as there may be infinitely-many agents that satisfy such a description.
As an example, a description could be something like, “Someone named gjm in a 2021-Earth-like world with personality <insert a description of your personality and thoughts> who has <insert description of your life experiences> and is currently <insert description of how your life currently is>”.
This doesn’t pick out a single individual. There are probably infinitely-many gjms out there. But as long as the description is precise enough, you can still infer your probable eventual life satisfaction.
But other than that, your description seems pretty much correct.
It’s now stupid-o’-clock where I am and I need to get some sleep.
I feel you. I also posted something at stupid-o’-clock and then woke up at 5am, realized I messed up, and then edited a comment and hoped no one saw the previous error.
No, I don’t intend “experience-subject” to pick out a specific time. (It’s not obvious to me whether a variant of your system that worked that way would be better or worse than your system as it is.) I’m using that term rather than “agent” because—as I think you point out in the OP—what matters for moral relevance is having experiences rather than performing actions.
So, anyway, I think I now agree that your system does indeed do approximately what you say it does, and many of my previous criticisms do not in fact apply to it; my apologies for the many misunderstandings.
The fact that it’s lavishly uncomputable is a problem for using it in practice, of course :-).
I have some other concerns, but haven’t given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe physically bigger experience subjects get treated as more important because they’re “easier to locate”, and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?
The fact that it’s lavishly uncomputable is a problem for using it in practice, of course :-).
Yep. To be fair, though, I suspect any ethical system that respects agents’ arbitrary preferences would also be incomputable. As a silly example, consider an agent whose terminal values are, “If Turing machine T halts, I want nothing more than to jump up and down. However, if it doesn’t halt, then it is of the utmost importance to me that I never jump up and down and instead sit down and frown.” Then any ethical system that cares about those preferences is incomputable.
Now this is a pretty silly example, but I wouldn’t be surprised if there were more realistic ones. For one, it’s important to respect other agents’ moral preferences, and I wouldn’t be surprised if their ideal moral-preferences-on-infinite-reflection would be incomputable. It seems to me that moral philosophers act as some approximation of, “Find the simplest model of morality that mostly agrees with my moral intuitions”. If these models include incomputable ones, or arbitrary Turing machines that may or may not halt, then the moral value of the world to them would in fact be incomputable, so any ethical system that cares about preferences-given-infinite-reflection would also be incomputable.
I have some other concerns, but haven’t given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe physically bigger experience subjects get treated as more important because they’re “easier to locate”, and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?
I’m not that worried about agents that are physically bigger, but it’s true that there may be some agents, or descriptions of agents in situations, that are easier to pick out (in terms of having a short description length) than others. Maybe there’s something really special about the agent that makes it easy to pin down.
I’m not entirely sure if this would be a bug or a feature. But if it’s a bug, I think it could be dealt with by just choosing the right prior over agent-situations. Specifically, for any description of an environment with finitely-many agents A, make the probability of ending up as a∈A, conditioned only on being one of the agents in that environment, the same for all a∈A. This way, the prior isn’t biased in favor of the agents that are easy to pick out.
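Here’s a minimal sketch of that adjustment, assuming a toy set of environments; the environment weights and agent names are placeholders I made up, not anything from the discussion.

```python
# Weight each environment (e.g. by a complexity-based prior), then split that
# weight evenly among the agents in it, so an agent that happens to have a short
# description gets no extra weight.
environments = {
    "env_A": {"weight": 0.5,  "agents": ["easy_to_describe_agent", "agent_2"]},
    "env_B": {"weight": 0.25, "agents": ["agent_3", "agent_4", "agent_5"]},
}

agent_prior = {}
total_weight = sum(e["weight"] for e in environments.values())
for env in environments.values():
    for agent in env["agents"]:
        # Conditional on being in this environment, every agent is equally likely.
        agent_prior[agent] = (env["weight"] / total_weight) / len(env["agents"])

print(agent_prior)  # within each environment, all agents get the same probability
```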
I think this system may have the following problem: It implicitly assumes that you can take a kind of random sample that in fact you can’t.
You want to evaluate universes by “how would I feel about being in this universe?”, which I think means either something like “suppose I were a randomly chosen subject-of-experiences in this universe, what would my expected utility be?” or “suppose I were inserted into a random place in this universe, what would my expected utility be?”. (Where “utility” is shorthand for your notion of “life satisfaction”, and you are welcome to insist that it be bounded.)
But in a universe with infinitely many—countably infinitely many, presumably—subjects-of-experiences, the first involves an action equivalent to picking a random integer. And in a universe of infinite size (and with a notion of space at least a bit like ours), the second involves an action equivalent to picking a random real number.
And there’s no such thing as picking an integer, or a real number, uniformly at random.
This is essentially the same as the “infinitarian paralysis” problem. Consider two universes, each with a countable infinity of happy people and a countable infinity of unhappy people (and no other subjects of experience, somehow). In the first, all the people were generated with a biased coin-flip that picks “happy” 99.9% of the time. In the second, the same except that their coin picks “unhappy” 99.9% of the time. We’d like to be able to say that the first option is better than the second, but we can’t, because actually with probability 1 these two universes are equivalent in the sense that with probability 1 they both have infinitely many happy and infinitely many unhappy people, and we can simply rearrange them to turn one of those universes into the other. Which is one way of looking at why there’s no such operation as “pick a random integer”, because if there were then surely picking a random person from universe 1 gets you a happy person with probability 0.999 and picking a random person from universe 1 gets you a happy person with probability 0.001.
When you have infinitely many things, you may find yourself unable to say meaningfully whether there’s more positive or more negative there, and that isn’t dependent on adding up the positives and negatives and getting infinite amounts of goodness or badness. You are entirely welcome to say that in our hypothetical universe there are no infinite utilities anywhere, that we shouldn’t be trying to compute anything like “the total utility”, and that’s fine, but you still have the problem that e.g. you can’t say “it’s a bad thing to take 1000 of the happy people and make them unhappy” if what you mean by that is that it makes for a worse universe, because the modified universe is isomorphic to the one you started with.
It’s not a distribution over agents in the universe, it’s a distribution over possible agents in possible universes. The possible universes can be given usual credence-based weightings based on conditional probability given the moral agent’s observations and models, because what else are they going to base anything on?
If your actions make 1000 people unhappy, and presumably some margin “less satisfied” in some hypothetical post-mortem universe rating, the idea seems to be that you first estimate how much less satisfied they would be. Then the novel (to me) part of this idea is that you multiply this by the estimated fraction of all agents, in all possible universes weighted by credence, who would be in your position. Being a fraction, there is no unboundedness involved. The fraction may be extremely small, but should always be nonzero.
As I see it the exact fraction you estimate doesn’t actually matter, because all of your options have the same multiplier and you’re evaluating them relative to each other. However this multiplier is what gives ethical decisions nonzero effect even in an infinite universe, because there will only be finitely many ethical scenarios of any given complexity.
So it’s not just “make 1000 happy people unhappy”, it’s “the 1 in N people with similar incentives as me in a similar situation would each make 1000 happy people unhappy”, resulting in a net loss of 1000/N of universal satisfaction. N may be extremely large, but it’s not infinite.
How is it a distribution over possible agents in possible universes (plural) when the idea is to give a way of assessing the merit of one possible universe?
I do agree that an ideal consequentialist deciding between actions should consider for each action the whole distribution of possible universes after they do it. But unless I’m badly misreading the OP, I don’t see where it proposes anything like what you describe. It says—emphasis in all cases mine, to clarify what bits I think indicate that a single universe is in question—”… but you still knew you would be born into this universe”, and “Imagine hypothetically telling an agent everything significant about the universe”, and “a prior over situations in the universe you could be born into”, and “my ethical system provides a function mapping from possible worlds to their moral value”, and “maximize the expected value of your life satisfaction given you are in this universe”, and “The appeal of aggregate consequentialism is that its defines some measure of “goodness” of a universe”, and “the moral value of the world”, and plenty more.
Even if somehow this is what OP meant, though—or if OP decides to embrace it as an improvement—I don’t see that it helps at all with the problem I described; in typical cases I expect picking a random agent in a credence-weighted random universe-after-I-do-X to pose all the same difficulties as picking a random agent in a single universe-after-I-do-X. Am I missing some reason why the former would be easier?
(Assuming you’ve read my other response to this comment:)
I think it might help if I give a more general explanation of how my moral system can be used to determine what to do. This is mostly taken from the article, but it’s important enough that I think it should be restated.
Suppose you’re considering taking some action that would benefit our world or future life cone. You want to see what my ethical system recommends.
Well, for almost all possible circumstances an agent could end up in within this universe, I think your action would have effectively no causal or acausal effect on them. There’s nothing you can do about those agents, so don’t worry about them in your moral deliberation.
Instead, consider agents of the form, “some agent in an Earth-like world (or in the future light-cone of one) with someone just like <insert detailed description of yourself and circumstances>”. These are agents you can potentially (acausally) affect. If you take an action to make the world a better place, that means the other people in the universe who are very similar to you and in very similar circumstances would also take that action.
So if you take that action, then you’d improve the world, so the expected value of life satisfaction of an agent in the above circumstances would be higher. Such circumstances are of finite complexity and not ruled out by evidence, so the probability of an agent ending up in such a situation, conditioning only on being in this universe, is non-zero. Thus, taking that action would increase the moral value of the universe, and my ethical system would be liable to recommend taking it.
To see it another way, moral deliberation with my ethical system works as follows:
Your comments are focusing on (so to speak) the decision-theoretic portion of your theory, the bit that would be different if you were using CDT or EDT rather than something FDT-like. That isn’t the part I’m whingeing about :-). (There surely are difficulties in formalizing any sort of FDT, but they are not my concern; I don’t think they have much to do with infinite ethics as such.)
My whingeing is about the part of your theory that seems specifically relevant to questions of infinite ethics, the part where you attempt to average over all experience-subjects. I think that one way or another this part runs into the usual average-of-things-that-don’t-have-an-average sort of problem which afflicts other attempts at infinite ethics.
As I describe in another comment, the approach I think you’re taking can move where that problem arises but not (so far as I can currently see) make it actually go away.
I do think JBlack understands the idea of my ethical system and is using it appropriately.
My system provides a method of evaluating the moral value of a specific universe. The point for moral agents is to try to make the universe one that scores highly on this moral valuation. But we don’t know exactly which universe we’re in, so to make decisions, we need to consider all universes we could be in, and then take the action that maximizes the expected moral value of the universe we’re actually in.
For example, suppose I’m considering pressing a button that will either make everyone very slightly happier, or make everyone extremely unhappy. I don’t actually know which universe I’m in, but I’m 60% sure I’m in the one that would make everyone happy. Then if I press the button, there’s a 40% chance that the universe would end up with very low moral value. That means pressing the button would, in expectation, decrease the moral value of the universe, so my moral system would recommend not pressing it.
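To make that concrete, here’s a tiny worked version of the expected-value comparison, with placeholder utility numbers of my own (the +0.01 and -10 figures are illustrative assumptions, not from the example):

```python
# Toy expected-moral-value comparison for the button example.
p_happy_universe = 0.6      # credence that the button makes everyone slightly happier
delta_if_happy = +0.01      # small gain in moral value (assumed)
delta_if_unhappy = -10.0    # large loss in moral value (assumed)

expected_change_if_pressed = (p_happy_universe * delta_if_happy
                              + (1 - p_happy_universe) * delta_if_unhappy)
expected_change_if_not_pressed = 0.0

print(expected_change_if_pressed)                                    # -3.994
print(expected_change_if_pressed > expected_change_if_not_pressed)   # False -> don't press
```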
I think to some extent you may be over-thinking things. I agree that it’s not completely clear how to compute P(“I’m satisfied” | “I’m in this universe”). But to use my moral system, I don’t need a perfect, rigorous solution to this, nor am I trying to propose one.
I think the ethical system provides reasonably straightforward moral recommendations in the situations we could actually be in. I’ll give an example of such a situation that I hope is illuminating. It’s paraphrased from the article.
Suppose you have the ability to create safe AI and are considering whether my moral system recommends doing so. And suppose that if you create safe AI, everyone in your world will be happy, and if you don’t, then the world will be destroyed by evil rogue AI.
Consider an agent that knows it will be in this universe, but nothing else. Well, consider the circumstances, “I’m an agent in an Earth-like world that contains someone who is just like gjm and in a very similar situation who has the ability to create safe AI”. The above description has finite description length, and the agent has no evidence ruling it out. So it must have some non-zero probability of ending up in such a situation, conditioning on being somewhere in this universe.
All the gjms have the same knowledge and values and are in pretty much the same circumstances. So their actions are logically constrained to be the same as yours. Thus, if you decide to create the AI, you are acausally determining the outcome of arbitrary agents in the above circumstances, by making such an agent end up satisfied when they otherwise wouldn’t have been. Since an agent in this universe has non-zero probability of ending up in those circumstances, by choosing to make the safe AI you are increasing the moral value of the universe.
As I said to JBlack, so far as I can tell none of the problems I think I see with your proposal become any easier to solve if we switch from “evaluate one possible universe” to “evaluate all possible universes, weighted by credence”.
Why not?
Of course you can make moral decisions without going through such calculations. We all do that all the time. But the whole issue with infinite ethics—the thing that a purported system for handling infinite ethics needs to deal with—is that the usual ways of formalizing moral decision processes produce ill-defined results in many imaginable infinite universes. So when you propose a system of infinite ethics and I say “look, it produces ill-defined results in many imaginable infinite universes”, you don’t get to just say “bah, who cares about the details?” If you don’t deal with the details you aren’t addressing the problems of infinite ethics at all!
It’s nice that your system gives the expected result in a situation where the choices available are literally “make everyone in the world happy” and “destroy the world”. (Though I have to confess I don’t think I entirely understand your account of how your system actually produces that output.) We don’t really need a system of ethics to get to that conclusion!
What I would want to know is how your system performs in more difficult cases.
We’re concerned about infinitarian paralysis, where we somehow fail to deliver a definite answer because we’re trying to balance an infinite amount of good against an infinite amount of bad. So far as I can see, your system still has this problem. E.g., if I know there are infinitely many people with various degrees of (un)happiness, and I am wondering whether to torture 1000 of them, your system is trying to calculate the average utility in an infinite population, and that simply isn’t defined.
So, I think this is what you have in mind; my apologies if it was supposed to be obvious from the outset.
We are doing something like Solomonoff induction. The usual process there is that your prior says that your observations are generated by a computer program selected at random, using some sort of prefix-free code and generating a random program by generating a random bit-string. Then every observation updates your distribution over programs via Bayes, and once you’ve been observing for a while your predictions are made by looking at what all those programs would do, with probabilities given by your posterior. So far so good (aside from the fact that this is uncomputable).
But what you actually want (I think) isn’t quite a probability distribution over universes; you want a distribution over experiences-in-universes, and not your experiences but those of hypothetical other beings in the same universe as you. So now think of the programs you’re working with as describing not your experiences necessarily but those of some being in the universe, so that each update is weighted not by Pr(I have experience X | my experiences are generated by program P) but by Pr(some subject-of-experience has experience X | my experiences are generated by program P), with the constraint that it’s meant to be the same subject-of-experience for each update. Or maybe by Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P) with the same constraint.
So now after all your updates what you have is a probability distribution over generators of experience-streams for subjects in your universe.
When you consider a possible action, you want to condition on that in some suitable fashion, and exactly how you do that will depend on what sort of decision theory you’re using; I shall assume all the details of that handwaved away, though again I think they may be rather difficult. So now you have a revised probability distribution over experience-generating programs.
And now, if everything up to this point has worked, you can compute (well, you can’t because everything here is uncomputable, but never mind) an expected utility because each of our programs yields a being’s stream of experiences, and modulo some handwaving you can convert that into a utility, and you have a perfectly good probability distribution over the programs.
And (I think) I agree that here if we consider either “torture 1000 people” or “don’t torture 1000 people” it is reasonable to expect that the latter will genuinely come out with a higher expected utility.
OK, so in this picture of things, what happens to my objections? They apply now to the process by which you are supposedly doing your Bayesian updates on experience. Because (I think) now you are doing one of two things, neither of which need make sense in a world with infinitely many beings in it.
If you take the “Pr(some subject-of-experience has experience X)” branch: here the problem is that in a universe with infinitely many beings, these probabilities are likely all 1 and therefore you never actually learn anything when you do your updating.
If you take the “Pr(a randomly chosen subject-of-experience has experience X)” branch: here the problem is that there’s no such thing as a randomly chosen subject-of-experience. (More precisely, there are any number of ways to choose one at random, and I see no grounds for preferring one over another, and in particular neither a uniform nor a maximum entropy distribution exists.)
The latter is basically the same problem as I’ve been complaining about before (well, it’s sort of dual to it, because now we’re looking at things from the perspective of some possibly-other experiencer in the universe, and you are the randomly chosen one). The former is a different problem but seems just as difficult to deal with.
Well, I can’t say I exactly disagree with you here.
However, I want to note that this isn’t a problem specific to my ethical system. It’s true that in order to use my ethical system to make precise moral verdicts, you need to more fully formalize probability theory. However, the same is also true with effectively every other ethical theory.
For example, consider someone learning about classical utilitarianism and its applications in a finite world. Then they could argue:
Also, I just want to note that my system as described seems to be unique among the infinite ethical systems I’ve seen in that it doesn’t make obviously ridiculous moral verdicts. Every other one I know of makes some recommendations that seem really silly. So, despite not providing a rigorous formalization of probability theory, I think my ethical system has value.
Actually, no, I really do want a probability distribution over what I would experience, or more generally, the situations I’d end up being in. The alternatives you mentioned, Pr(some subject-of-experience has experience X | my experiences are generated by program P) and Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P), both lead to problems for the reasons you’ve already described.
I’m not sure what made you think I didn’t mean, P(I have experience x | …). Could you explain?
My system doesn’t compute the average utility of anything. Instead, it tries to compute the expected value of utility (or life satisfaction). I’m sorry if this was somehow unclear. I didn’t think I ever mentioned I was dealing with averages anywhere, though. I’m trying to get better at writing clearly, so if you remember what made you think this, I’d appreciate hearing.
I’ll begin at the end: What is “the expected value of utility” if it isn’t an average of utilities?
You originally wrote:
What is “the expected value of your life satisfaction [] conditioned on you being an agent in this universe but [not] on anything else” if it is not the average of the life satisfactions (utilities) over the agents in this universe?
(The slightly complicated business with conditional probabilities that apparently weren’t what you had in mind were my attempt at figuring out what else you might mean. Rather than trying to figure it out, I’m just asking you.)
I’m just using the regular notion of expected value. That is, let P(u) be the probability density of getting utility u. Then the expected value of utility is ∫_[a,b] u P(u) du, where the integral is a Lebesgue integral for greater generality. Here I take utility to be bounded in [a,b].
Also note that my system cares about a measure of satisfaction, rather than specifically utility. In that case, just let u be that measure of life satisfaction instead of a utility, with P(u) its density.
Also, of course, P(u) is calculated conditioning on being an agent in this universe, and nothing else.
And how do you calculate P(u) given the above? Well, one way is to first start with some disjoint prior probability distribution over universes and situations you could be in, where the situations are concrete enough to determine your eventual life satisfaction. Then do a Bayes update on “I’m an agent in this universe” by setting to 0 the probabilities of hypotheses in which the agent isn’t in this universe or doesn’t have preferences. Then renormalize the probabilities so they sum to 1. After that, you can use this probability distribution over possible worlds W to calculate P(u) in a straightforward manner, e.g. ∫_W P(utility = u | W) dP(W).
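Here’s a minimal discrete sketch of that calculation, with toy hypotheses, weights, and utilities that I made up for illustration:

```python
# Each hypothesis describes a universe plus your situation in it:
# (prior probability, whether it places you in this universe, eventual life satisfaction).
prior = {
    "U1: happy-world, situation A": (0.3, True,  0.9),
    "U1: happy-world, situation B": (0.2, True,  0.4),
    "U2: other-world, situation C": (0.5, False, 0.1),   # ruled out by "I'm in this universe"
}

# Bayes update on "I'm an agent in this universe": zero out excluded hypotheses, renormalize.
posterior = {h: p for h, (p, in_universe, _) in prior.items() if in_universe}
total = sum(posterior.values())
posterior = {h: p / total for h, p in posterior.items()}

# Expected life satisfaction under the posterior (discrete analogue of the integral above).
expected_satisfaction = sum(posterior[h] * prior[h][2] for h in posterior)
print(expected_satisfaction)  # ≈ 0.7 = 0.6*0.9 + 0.4*0.4
```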
(I know I pretty much mentioned the above calculation before, but I thought rephrasing it might help.)
If you are just using the regular notion of expected value then it is an average of utilities. (Weighted by probabilities.)
I understand that your measure of satisfaction need not be a utility as such, but “utility” is shorter than “measure of satisfaction which may or may not strictly speaking be utility”.
Oh, I’m sorry; I misunderstood you. When you said the average of utilities, I thought you meant the utility averaged among all the different agents in the world. Instead, it’s just, roughly, an average over the probability density function of utility. I say “roughly” because I guess integration isn’t exactly an average.
Thank you for the response.
You are correct that there’s no way to form a uniform distribution over the set of all integers or real numbers. And, similarly, you are also correct that there is no way of sampling from infinitely many agents uniformly at random.
Luckily, my system doesn’t require you to do any of these things.
Don’t think about my system as requiring you to pick out a specific random agent in the universe (because you can’t). It doesn’t try to come up with the probability of you being some single specific agent.
Instead, it picks out some description of circumstances an agent could be in, as well as a description of the agent itself. And this, you can do. I don’t think anyone’s completely formalized a way to compute prior probabilities over situations they could end up in. But the basic idea is to take some complexity-weighted (or perhaps uniform) distribution over the different circumstances, each of finite description length.
I’m not entirely sure how to form a probability distribution that include situations of infinite complexity. But it doesn’t seem like you really need to, because, in our universe at least, you can only be affected by a finite region. But I’ve thought about how to deal with infinite description lengths, too, and I can discuss it if you’re interested.
I’ll apply my moral system to the coin flip example. To make it more concrete, suppose there’s some AI that uses a pseudorandom number generator that outputs “heads” or “tails”, and then the AI, having precise control of the environment, makes the actual coin land on heads iff the pseudorandom number generator outputted “heads”. And it does so for each agent and makes them happy if it lands on heads and unhappy if it lands on tails.
Let’s consider the situation in which the pseudorandom number generator says “heads” 99.9% of the time. Well, pseudorandom number generators tend to work by having some (finite) internal seed, then using that seed to pick out a random number in, say, [0, 1]. Then, for the next number, it updates its (still finite) internal state from the initial seed in a very chaotic manner, and then again generates a new number in [0, 1]. And my understanding is that the internal state tends to be uniform in the sense that on average each internal state is just as common as each other internal state. I’ll assume this in the following.
If the generator says “heads” 99.9% of the time, then that means that, among the different internal states, 99.9% of them result in the answer being “heads” and 0.1% result in the answer being “tails”.
Suppose you know you’re in this universe, but nothing else. Well, you know you will be in a circumstance in which there is some AI that uses a pseudorandom number generator to determine your life satisfaction, because that’s how it is for everyone in the universe. However, you have no way of knowing the specifics of the internal state of the pseudorandom number generator.
So, to compute the probability of life satisfaction, just take some very high-entropy probability distribution over the internal states, for example, a uniform distribution. Then 99.9% of the internal states would result in you being happy, and only 0.1% would result in you being unhappy. So, using a very high-entropy distribution over internal states would result in you assigning a probability of approximately 99.9% to ending up happy.
Similarly, suppose instead that the generator generates heads only 0.1% of the time. Then only 0.1% of internal states of the pseudorandom number generator would result in it outputting “heads”. Thus, if you use a high-entropy probability distribution over the internal state, you would assign a probability of approximately 0.1% to you being happy.
Thus, if I’m reasoning correctly, the probability of you being satisfied, conditioning only on being in the 99.9%-heads universe, is approximately 99.9%, and the probability of being satisfied in the 0.1%-heads universe is approximately 0.1%. Thus, the former universe would be seen as having more moral value than the latter universe according to my ethical system.
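Here’s a small simulation of that reasoning, under my assumption (for illustration only) that the generator’s internal state is one of finitely many equally likely values, 99.9% of which map to a happy life:

```python
import random

# Toy version of the PRNG argument: assume a finite internal state space and that
# 99.9% of states lead to "heads" (a happy life). The numbers are illustrative.
NUM_STATES = 100_000
HEADS_FRACTION = 0.999

def outcome(state: int) -> str:
    # By assumption, the first 99.9% of internal states produce "heads".
    return "happy" if state < HEADS_FRACTION * NUM_STATES else "unhappy"

# Knowing only that you're in this universe, take a maximum-entropy (uniform)
# distribution over the internal states and estimate P(happy).
samples = [outcome(random.randrange(NUM_STATES)) for _ in range(200_000)]
print(samples.count("happy") / len(samples))  # ≈ 0.999
```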
And I hope what I’m saying isn’t too controversial. I mean, in order to reason, there must be some way to assign a probability distribution over situations you end up in, even if you don’t yet have any idea what concrete situation you’ll be in. I mean, suppose you actually learned you were in the 99.9%-heads universe, but knew nothing else. Then it really shouldn’t seem unreasonable to assign 99.9% probability to ending up happy. I mean, what else would you think?
Does this clear things up?
I don’t think I understand why your system doesn’t require something along the lines of choosing a uniformly-random agent or place. Not necessarily exactly either of those things, but something of that kind. You said, in OP:
How does that cash out if not in terms of picking a random agent, or random circumstances in the universe?
If I understand your comment correctly, you want to deal with that by picking a random description of a situation in the universe, which is just a random bit-string with some constraints on it, which you presumably do in something like the same way as choosing a random program when doing Solomonoff induction: cook up a prefix-free language for describing situations-in-the-universe, generate a random bit-string with each bit equally likely 0 or 1, and see what situation it describes.
But now everything depends on the details of how descriptions map to actual situations, and I don’t see any canonical way to do that or any anything-like-canonical way to do it. (Compare the analogous issue with Solomonoff induction. There, everything depends on the underlying machine, but one can argue at-least-kinda-plausibly that if we consider “reasonable” candidates, the differences between them will quickly be swamped by all the actual evidence we get. I don’t see anything like that happening here. What am I missing?)
Your example with an AI generating people with a PRNG is, so far as it goes, fine. But the epistemic situation one needs to be in for that example to be relevant seems to me incredibly different from any epistemic situation anyone is ever really in. If our universe is running on a computer, we don’t know what computer or what program or what inputs produced it. We can’t do anything remotely like putting a uniform distribution on the internal states of the machine.
Further, your AI/PRNG example is importantly different from the infinitely-many-random-people example on which it’s based. You’re supposing that your AI’s PRNG has an internal state you can sample from uniformly at random! But that’s exactly the thing we can’t do in the randomly-generated-people example.
Further further, your prescription in this case is very much not the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here doesn’t actually produce an inhabitant of the universe in question. You’re replacing a difficult (indeed impossible) question—“how do things go, on average, for a random person in this universe?”—with an easier but different question—“how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?”. Maybe that’s a reasonable thing to do, but it is not what your theory as originally stated tells you to do and I don’t see any obvious reason why someone who accepted your theory as you originally stated it should behave as you’re now telling them they should.
Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there’s no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, …, then n! (un)happy people, then … . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can’t do. Picking a state at random seems like it means picking a random positive integer, which again you can’t do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn’t actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it’s less than. How does your system evaluate this universe?
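For what it’s worth, here’s a quick numerical sketch of why this universe resists that kind of evaluation: the running fraction of happy people oscillates for ever instead of converging. (The code is just my illustration of the point, not part of the scenario.)

```python
from math import factorial

# Blocks of n! people alternate between happy (even n) and unhappy (odd n).
happy = unhappy = 0
for n in range(12):
    block = factorial(n)
    if n % 2 == 0:
        happy += block
    else:
        unhappy += block
    print(n, round(happy / (happy + unhappy), 4))
# The printed fraction swings toward 1 after even-n blocks and toward 0 after
# odd-n blocks, so "the average happiness of a random person" has no limit.
```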
Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 the same universe up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing the same infinitely many people, just arranged differently, one of which is much better than the other. Really?
(Of course, in something that’s less of a toy model, the arrangement of people can matter a lot. It’s nice to be near to friends and far from enemies, for instance. But of course that isn’t what we’re talking about here; when we rearrange the people we do so in a way that preserves all their experiences and their level of happiness.)
It really should seem unreasonable to suppose that in the 99.9% universe there’s a 99.9% chance that you’ll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn’t fully grasped the fact that you can’t sample uniformly at random from an infinite population.
There must be some reasonable way to calculate this. And one that doesn’t rely on impossibly taking a uniform sample from a set that has no uniform distribution. Now, we haven’t fully formalized reasoning and priors yet. But there is some reasonable prior probability distribution over situations you could end up in. And after that you can just do a Bayesian update on the evidence “I’m in universe x”.
I mean, imagine you had some superintelligent AI that takes evidence and outputs probability distributions. And you provide the AI with evidence about what the universe it’s in is like, without letting it know anything about the specific circumstances it will end up in. There must be some reasonable probability for the AI to assign to outcomes. If there isn’t, then that means whatever probabilistic reasoning system the AI uses must be incomplete.
I’m surprised you said this and interested in why. Could you explain what probability you would assign to being happy in that universe?
I mean, conditioning on being in that universe, I’m really not sure what else I would do. I know that I’ll end up with my happiness determined by some AI with a pseudorandom number generator. And I have no idea what the internal state of the random number generator will be. In Bayesian probability theory, the standard way to deal with this is to take a maximum entropy (i.e. uniform in this case) distribution over the possible states. And such a distribution would imply that I’d be happy with probability 99.9%. So that’s how I would reason about my probability of happiness using conventional probability theory.
I’m not entirely sure how my system would evaluate this universe, but that’s due to my own uncertainty about what specific prior to use and its implications.
But I’ll take a stab at it. I see the counter alternates through periods of making happy people and periods of making unhappy people. I have no idea which period I’d end up being in, so I think I’d use the principle of indifference to assign probability 0.5 to both. If I’m in the happy period, then I’d end up happy, and if I’m in the unhappy period, I’d end up unhappy. So I’d assign probability approximately 0.5 to ending up happy.
Oh, I had in mind that the internal state of the pseudorandom number generator was finite, and that each pseudorandom number generator was only used finitely-many times. For example, maybe each AI on its world had its own pseudorandom number generator.
And I don’t see how else I could interpret this. I mean, if the pseudorandom number generator is used infinitely-many times, then it couldn’t have outputted “happy” 99.9% of the time and “unhappy” 0.1% of the time. With infinitely-many outputs, it would output “happy” infinitely-many times and output “unhappy” infinitely-many times, and thus the proportion it outputs “happy” or “unhappy” would be undefined.
Yep. And I don’t think there’s any way around this. When talking about infinite ethics, we’ve had in mind a canonically infinite universe: one in which, for every level of happiness, suffering, satisfaction, and dissatisfaction, there exist infinitely many agents with that level. It looks like this is the sort of universe we’re stuck in.
So then there’s no difference in terms of moral value of two canonically-infinite universes except the patterning of value. So if you want to compare the moral value of two canonically-infinite universes, there’s just nothing you can do except to consider the patterning of values. That is, unless you want to consider any two canonically-infinite universes to be of equivalent moral value, which doesn’t seem like an intuitively desirable idea.
The problem with some of the other infinite ethical systems I’ve seen is that they would morally recommend redistributing unhappy agents extremely thinly in the universe, rather than actually try to make them happy, provided this was easier. As discussed in my article, my ethical system provides some degree of defense against this, which seems to me like a very important benefit.
You say, “There must be some reasonable way to calculate this.”
(where “this” is Pr(I’m satisfied | I’m some being in such-and-such a universe)) Why must there be? I agree that it would be nice if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.
Does whatever argument or intuition leads you to say that there must be a reasonable way to calculate Pr(X is satisfied | X is a being in universe U) also tell you that there must be a reasonable way to calculate Pr(X is even | X is a positive integer)? How about Pr(the smallest n with x ≤ n! is even | x is a positive integer)?
I should maybe be more explicit about my position here. Of course there are ways to give a meaning to such expressions. For instance, we can suppose that the integer n occurs with probability 2^-n, and then e.g. if I’ve done my calculations right then the second probability is the sum of 2^-0! + (2^-2!-2^-3!) + (2^-4!-2^-5!) + … which presumably doesn’t have a nice closed form (it’s transcendental for sure) but can be calculated to high precision very easily. But that doesn’t mean that there’s any such thing as the way to give meaning to such an expression. We could use some other sequence of weights adding up to 1 instead of the powers of 1/2, for instance, and we would get a substantially different answer. And if the objects of interest to us were beings in universe U rather than positive integers, they wouldn’t come equipped with a standard order to look at them in.
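To make the weight-dependence concrete, here’s a rough numerical sketch that computes that probability under two different normalizable weightings. (The exact value under the geometric weighting may differ slightly from the expression above depending on boundary conventions; the point is only that the answer changes with the choice of weights.)

```python
from math import factorial

def smallest_n_is_even(x: int) -> bool:
    # Smallest n with x <= n!, and whether that n is even.
    n = 0
    while factorial(n) < x:
        n += 1
    return n % 2 == 0

def prob(weight, n_terms: int = 60) -> float:
    total = sum(weight(x) for x in range(1, n_terms + 1))
    return sum(weight(x) for x in range(1, n_terms + 1) if smallest_n_is_even(x)) / total

print(prob(lambda x: 2.0 ** -x))     # geometric weights
print(prob(lambda x: 1.0 / x ** 2))  # a different normalizable weighting
# The two answers differ noticeably: the "probability" depends entirely on the
# arbitrary choice of weights over the integers.
```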
Why should we expect there to be a well-defined answer to the question “what fraction of these beings are satisfied”?
No, because I do not assign any probability to being happy in that universe. I don’t know a good way to assign such probabilities and strongly suspect that there is none.
You suggest doing maximum entropy on the states of the pseudorandom number generator being used by the AI making this universe. But when I was describing that universe I said nothing about AIs and nothing about pseudorandom number generators. If I am contemplating being in such a universe, then I don’t know how the universe is being generated, and I certainly don’t know the details of any pseudorandom number generator that might be being used.
Suppose there is a PRNG, but an infinite one somehow, and suppose its state is a positive integer (of arbitrary size). (Of course this means that the universe is not being generated by a computing device of finite capabilities. Perhaps you want to exclude such possibilities from consideration, but if so then you might equally well want to exclude infinite universes from consideration: a finite machine can’t e.g. generate a complete description of what happens in an infinite universe. If you’re bothering to consider infinite universes at all, I think you should also be considering universes that aren’t generated by finite computational processes.)
Well, in this case there is no uniform prior over the states of the PRNG. OK, you say, let’s take the maximum-entropy prior instead. That would mean (p_k) minimizing sum p_k log p_k subject to the sum of p_k being 1. Unfortunately there is no such (p_k). If we take p_k = 1/n for k=1..n and 0 for larger k, the sum is log 1/n which → -oo as n → oo. In other words, we can make the entropy of (p_k) as large as we please.
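A quick toy computation of the divergence, for concreteness:

```python
from math import log

# For the uniform distribution on {1, ..., n}, sum p_k log p_k equals log(1/n),
# which decreases without bound as n grows; so the entropy grows without bound
# and no maximum-entropy distribution over all positive integers exists.
for n in [10, 1000, 10**6, 10**9]:
    p = 1.0 / n
    print(n, n * p * log(p))   # equals log(1/n) = -log(n)
```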
You might suppose (arbitrarily, it seems to me) that the integer that’s the state of our PRNG is held in an infinite sequence of bits, and choose each bit at random. But then with probability 1 you get an impossible state of the RNG, and for all we know the AI’s program might look like “if PRNG state is a finite positive integer, use it to generate a number between 0 and 1 and make our being happy if that number is ≤ 0.999; if PRNG state isn’t a finite positive integer, put our being in hell”.
Yes, exactly! When I described this hypothetical world, I didn’t say “the probability that a being in it is happy is 99.9%”. I said “a biased coin-flip determines the happiness of each being in it, choosing ‘happy’ with probability 99.9%”. Or words to that effect. This is, so far as I can see, a perfectly coherent (albeit partial!) specification of a possible world. And it does indeed have the property that “the probability that a being in it is happy” is not well defined.
This doesn’t mean the scenario is improper somehow. It means that any ethical (or other) system that depends on evaluating such probabilities will fail when presented with such a universe. Or, for that matter, pretty much any universe with infinitely many beings in it.
But then I don’t see that you’ve explained how your system considers the patterning of values. In the OP you just talk about the probability that a being in such-and-such a universe is satisfied; and that probability is typically not defined. Here in the comments you’ve been proposing something involving knowing the PRNG used by the AI that generated the universe, and sampling randomly from the outputs of that PRNG; but (1) this implies being in an epistemic situation completely unlike any that any real agent is ever in, (2) nothing like this can work (so far as I can see) unless you know that the universe you’re considering is being generated by some finite computational process, and if you’re going to assume that you might as well assume a finite universe to begin with and avoid having to deal with infinite ethics at all, (3) I don’t understand how your “look at the AI’s PRNG” proposal generalizes to non-toy questions, and (4) even if (1-3) are resolved somehow, it seems like it requires a literally infinite amount of computation to evaluate any given universe. (Which is especially problematic when we are assuming we are in a universe generated by a finite computational process.)
To use probability theory to form accurate beliefs, we need a prior. I didn’t think this was controversial. And if you have a prior, as far as I can tell, you can then compute Pr(I’m satisfied | I’m some being in such-and-such a universe) by simply updating on “I’m some being in such-and-such a universe” using Bayes’ theorem.
That is, you need to have some prior probability distribution over concrete specifications of the universe you’re in and your situation in it. Now, to update on “I’m some being in such-and-such a universe”, just look at each concrete possible situation-and-universe and set P(“I’m some being in such-and-such a universe” | some concrete hypothesis) to 0 if the hypothesis specifies you’re in some universe other than the such-and-such universe. And set this probability to 1 if it does specify you are in such a universe. As long as the possible universes are specified sufficiently precisely, then I don’t see why you couldn’t do this.
OK, so I think I now understand your proposal better than I did.
So if I’m contemplating making the world be a particular way, you then propose that I should do the following calculation (as always, of course I can’t do it because it’s uncomputable, but never mind that):
Consider all possible computable experience-streams that a subject-of-experiences could have.
Consider them, specifically, as being generated by programs drawn from a universal distribution.
Condition on being in the world that’s the particular way I’m contemplating making it—that is, discard experience-streams that are literally inconsistent with being in that world.
We now have a probability distribution over experience-streams. Compute a utility for each, and take its expectation.
And now we compare possible universes by comparing this expected utility.
(Having failed to understand your proposal correctly before, I am not super-confident that I’ve got it right now. But let’s suppose I have and run with it. You can correct me if not. In that case, some or all of what follows may be irrelevant.)
I agree that this seems like it will (aside from concerns about uncomputability, and assuming our utilities are bounded) yield a definite value for every possible universe. However, it seems to me that it has other serious problems which stop me finding it credible.
SCENARIO ONE. So, for instance, consider once again a world in which there are exactly two sorts of experience-subject, happy and unhappy. Traditionally we suppose infinitely many of both, but actually let’s also consider possible worlds where there is just one happy experience-subject, or just one unhappy one. All these worlds come out exactly the same, so “infinitely many happy, one unhappy” is indistinguishable from “infinitely many unhappy, one happy”. That seems regrettable, but it’s a bullet I can imagine biting—perhaps we just don’t care at all about multiple instantiations of the exact same stream of experiences: it’s just the same person and it’s a mistake to think of them as contributing separately to the goodness of the universe.
So now let’s consider some variations on this theme.
SCENARIO TWO. Suppose I think up an infinite (or for that matter merely very large) number of highly improbable experience-streams that one might have, all of them unpleasant. And I find a single rather probable experience-stream, a pleasant one, whose probability (according to our universal prior) is greater than the sum of the probabilities of those other ones. If I am contemplating bringing into being a world containing exactly the experience-streams described in this paragraph, then it seems that I should, because the expected net utility is positive, at least if the pleasantness and unpleasantness of the experiences in question are all about equal in magnitude.
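To make the arithmetic concrete (these numbers are purely illustrative and not part of the scenario): say the pleasant stream has prior weight 0.010 and utility +1, and the unpleasant streams have total prior weight 0.006 and utility −1 each. Conditioning on this world then gives

$$ \frac{0.010\cdot(+1) + 0.006\cdot(-1)}{0.010 + 0.006} \;=\; \frac{0.004}{0.016} \;=\; +0.25 \;>\; 0, $$

so the world counts as good no matter how many distinct unpleasant streams are packed into that 0.006 of prior weight.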
To me, this seems obviously crazy. Perhaps there’s some reason why this scenario is incoherent (e.g., maybe somehow I shouldn’t be able to bring into being all those very unlikely beings, at least not with non-negligible probability, so it shouldn’t matter much what happens if I do, or something), but at present I don’t see how that would work out.
The problem in SCENARIO TWO seems to arise from paying too much attention to the prior probability of the experience-subjects. We can also get into trouble by not paying enough attention to their posterior probability, in some sense.
SCENARIO THREE. I have before me a switch with two positions, placed there by the Creator of the Universe. They are labelled “Nice” and “Nasty”. The CotU explains to me that the creation of future experience-subjects will be controlled by a source of True Randomness (whatever exactly that might be), in such a way that all possible computable experience-subjects have a real chance of being instantiated. The CotU has designed two different prefix-free codes mapping strings of bits to possible experience-subjects; then he has set a Truly Random coin to flip for ever, generating a new experience-subject every time a leaf of the code’s binary tree is reached, so that we get an infinite number of experience-subjects generated at random, with a distribution depending on the prefix-free code being used. The Nice and Nasty settings of the switch correspond to two different codes. The CotU has computed that with the switch in the “Nice” position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the “Nasty” position, it’s large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.
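For concreteness, here’s a toy version of the generator I have in mind (the two codes and the three subjects are stand-ins I’ve invented; the real construction would assign a codeword to every possible computable experience-subject):

```python
import random

# A toy prefix-free code: no codeword is a prefix of another.
# Each leaf of the implied binary tree names an experience-subject.
# Which subjects get short codewords (high probability) is what differs
# between the "Nice" and "Nasty" settings.
NICE_CODE = {"0": "pleasant_subject", "10": "mixed_subject", "11": "unpleasant_subject"}
NASTY_CODE = {"0": "unpleasant_subject", "10": "mixed_subject", "11": "pleasant_subject"}

def generate_subjects(code, n, rng=random.Random(0)):
    """Flip a fair coin repeatedly; every time the bit string read so far
    matches a codeword, emit that subject and start a fresh string."""
    subjects = []
    bits = ""
    while len(subjects) < n:
        bits += rng.choice("01")
        if bits in code:
            subjects.append(code[bits])
            bits = ""
    return subjects

# With the toy codes above, "Nice" makes pleasant subjects twice as likely as
# unpleasant ones (probability 1/2 vs 1/4), and "Nasty" reverses that.
print(generate_subjects(NICE_CODE, 10))
print(generate_subjects(NASTY_CODE, 10))
```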
In this case, our conditioning doesn’t remove any possible experience-subjects from consideration, so we are indifferent between the “Nice” and “Nasty” settings of the switch.
This is another one where we might be right to bite the bullet. In the long run infinitely many of every possible experience-subject will be created in each version of the universe, so maybe these two universes are “anagrams” of one another and should be considered equal. So let’s tweak it.
SCENARIO FOUR. Same as in SCENARIO THREE, except that now the CotU’s generator will run until it has produced a trillion experience-subjects and then shut off for ever.
It is still the case that with the switch in either setting any experience-subject is possible, so we don’t get to throw any of them out. But it’s no longer the case that the universes generated in the “Nice” and “Nasty” versions are with probability 1 (or indeed with not-tiny probability) identical in any sense.
So far, these scenarios all suppose that somehow we are able to generate arbitrary sets of possible experience-subjects, and arrange for those to be all the experience-subjects there are, or at least all there are after we make whatever decision we’re making. That’s kinda artificial.
SCENARIO FIVE. Our universe, just as it is now. We assume, though, that our universe is in fact infinite. You are trying to decide whether to torture me to death.
So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don’t. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe. Perhaps in the version of the world where you torture me to death this makes you more likely to do other horrible things, or makes other people who care for me suffer more, but again none of this makes any experiences impossible that would otherwise have been possible, or vice versa. So our universe-evaluator is indifferent between these choices.
(The possibly-overcomplicated business in one of my other comments, where I tried to consider doing something Solomonoff-like using both my experiences and those of some hypothetical possibly-other experience-subject in the world, was intended to address these problems caused by considering only possibility and not anything stronger. I couldn’t see how to make it work, though.)
RE: scenario one:
It’s not clear to me how they are indistinguishable. As long as the unhappy agent and its circumstances can be described with a finite description length, there would be a non-zero probability of an agent ending up as that one. Thus, making the agent unhappy would decrease the moral value of the world.
I’m not sure what would happen if the single unhappy agent has infinite complexity and probability 0. But I suspect that this could be dealt with if you expanded the system to also consider non-real probabilities. I’m no expert on non-real probabilities, but I’d bet that the probability of being unhappy, given that there is an unhappy agent, would be infinitesimally higher than in the world in which there are no unhappy agents.
RE: scenario two: It’s not clear to me how this is crazy. For example, consider this situation: when agents are born, an AI flips a biased coin to determine what will happen to them. The coin has a 99.999% chance of landing on heads and a 0.001% chance of landing on tails. If the coin lands on heads, the AI will give the agent some very pleasant experience stream, and all such agents will get the same pleasant experience stream. But if it lands on tails, the AI will give the agent some unpleasant experience stream that is also very different from the other unpleasant ones.
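Just to spell out the expected value in that setup (assuming, say, a satisfaction of +1 for the pleasant stream and −1 for each of the unpleasant ones; these utilities are mine, not part of the scenario):

$$ 0.99999\cdot(+1) + 0.00001\cdot(-1) \;=\; 0.99998, $$

which is nearly the maximum possible, however varied or un-varied the streams involved are.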
This sounds like a pretty good situation to me. It’s not clear to me why it wouldn’t be. I mean, I don’t see why the diversity of the positive experiences matters. And if you do care about the diversity of positive experiences, this would have unintuitive results. For example, suppose all agents have identical preferences and their satisfaction is maximized by experience stream S. If you have a problem with the satisfied agents all having just one experience stream, then you would be incentivized to coerce the agents to instead have a variety of different experience streams, even if they didn’t like these experience streams as much.
RE: scenario three:
I don’t follow your reasoning. You just said that in the “Nice” position the expected utility of an experience-subject is large and positive, and in the “Nasty” position it’s large and negative. And since my ethical system seeks to maximize the expected value of life satisfaction, it seems trivial to me that it would prefer the “Nice” setting.
Whether or not you switch it to the “Nice” position won’t rule out any possible outcomes for an agent, but it seems pretty clear that it would change their probabilities.
RE: scenario four: My ethical system would prefer the “Nice” position for the same reason described in scenario three.
RE: scenario five:
Though none of the experience streams are impossible, the probability of you getting tortured is still higher conditional on me deciding to torture you. To see why, consider the situation, “being someone just like Slider who is vulnerable to being tortured by demon lord Chantiel”. This has finite description length, and thus non-zero probability. And if I decide to torture you, then the probability of you getting tortured if you end up in this situation is high. Thus, the total expected value of life satisfaction would be lower if I decided to torture you. So my ethical system would recommend not torturing you.
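As a toy illustration of the size of the shift (all numbers invented): suppose that description has prior weight q, and that deciding to torture you lowers the expected satisfaction of agents matching it from +0.9 to −0.9. Then the world’s overall expected life satisfaction changes by

$$ \Delta \;=\; q\,\bigl[(-0.9) - (+0.9)\bigr] \;=\; -1.8\,q \;<\; 0, $$

which may be tiny but is strictly negative, so the system still counts the torture as making the world worse.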
In general, don’t worry about if an experience stream is possible or not. In an infinite universe with quantum noise, I think pretty much all experience streams would occur with non-zero probability. But you can still adjust the probabilities of an agent ending up with the different streams.
It sounds as if my latest attempt at interpreting what your system proposes doing is incorrect, because the things you’re disagreeing with seem to me to be straightforward consequences of that interpretation. Would you like to clarify how I’m misinterpreting now?
Here’s my best guess.
You wrote about specifications of an experience-subject’s universe and situation in it. I mentally translated that to their stream of experiences because I’m thinking in terms of Solomonoff induction. Maybe that’s a mistake.
So let’s try again. The key thing in your system is not a program that outputs a hypothetical being’s stream of experiences, it’s a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that’s probably OK.
So that ought to give a well-defined (modulo the usual stuff about uncomputability) probability distribution over experience-subjects-in-universes. And then you want to condition on “being in a universe with such-and-such characteristics” (which may or may not specify the universe itself completely) and look at the expected utility-or-utility-like-quantity of all those experience-subjects-in-universes after you rule out the universes without such-and-such characteristics.
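In symbols, my revised reading (again only a sketch of my understanding; here M is now a universal-style prior over programs that output a universe u together with a designated experience-subject s in it, C is the set of such pairs whose universe has the such-and-such characteristics, and U is the bounded utility):

$$ V(C) \;=\; \mathbb{E}_{(u,s)\sim M}\bigl[\,U(u,s)\mid (u,s)\in C\,\bigr] \;=\; \frac{\sum_{(u,s)\in C} M(u,s)\,U(u,s)}{\sum_{(u,s)\in C} M(u,s)}. $$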
It’s now stupid-o’-clock where I am and I need to get some sleep. I’m posting this even though I haven’t had time to think about whether my current understanding of your proposal seems like it might work, because on past form there’s an excellent chance that said understanding is wrong, so this gives you more time to tell me so if it is :-). If I don’t hear from you that I’m still getting it all wrong, I’ll doubtless have more to say later...
That’s closer to what I meant. By “experience-subject”, I think you mean a specific agent at a specific time. If so, my system doesn’t require an unambiguous specification of an experience-subject.
My system doesn’t require you to pinpoint the exact agent. Instead, it only requires you to specify a (reasonably-precise) description of an agent and its circumstances. This doesn’t mean picking out a single agent, as there may be infinitely-many agents that satisfy such a description.
As an example, a description could be something like, “Someone named gjm in a 2021-Earth-like world with personality <insert a description of your personality and thoughts> who has <insert description of your life experiences> and is currently <insert description of how your life is currently going>”.
This doesn’t pick out a single individual. There are probably infinitely-many gjms out there. But as long as the description is precise enough, you can still infer your probable eventual life satisfaction.
But other than that, your description seems pretty much correct.
I feel you. I also posted something at stupid-o’-clock, then woke up at 5am, realized I’d messed up, edited the comment, and hoped no one had seen the previous error.
No, I don’t intend “experience-subject” to pick out a specific time. (It’s not obvious to me whether a variant of your system that worked that way would be better or worse than your system as it is.) I’m using that term rather than “agent” because—as I think you point out in the OP—what matters for moral relevance is having experiences rather than performing actions.
So, anyway, I think I now agree that your system does indeed do approximately what you say it does, and many of my previous criticisms do not in fact apply to it; my apologies for the many misunderstandings.
The fact that it’s lavishly uncomputable is a problem for using it in practice, of course :-).
I have some other concerns, but haven’t given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe physically bigger experience subjects get treated as more important because they’re “easier to locate”, and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?
Yep. To be fair, though, I suspect any ethical system that respects agents’ arbitrary preferences would also be incomputable. As a silly example, consider an agent whose terminal values are, “If Turing machine T halts, I want nothing more than to jump up and down. However, if it doesn’t halt, then it is of the utmost importance to me that I never jump up and down and instead sit down and frown.” Then any ethical system that cares about those preferences is incomputable.
Now this is a pretty silly example, but I wouldn’t be surprised if there were more realistic ones. For one, it’s important to respect other agents’ moral preferences, and I wouldn’t be surprised if their ideal moral-preferences-on-infinite-reflection turned out to be incomputable. It seems to me that moral philosophers act as some approximation of, “Find the simplest model of morality that mostly agrees with my moral intuitions”. If they include incomputable models, or arbitrary Turing machines that may or may not halt, then the moral value of the world to them would in fact be incomputable, so any ethical system that cares about preferences-given-infinite-reflection would also be incomputable.
I’m not that worried about agents that are physically bigger, but it’s true that there may be some agents, or descriptions of agents in situations, that are easier to pick out (in terms of having a short description length) than others. Maybe there’s something really special about an agent that makes it easy to pin down.
I’m not entirely sure if this would be a bug or a feature. But if it’s a bug, I think it could be dealt with by just choosing the right prior over agent-situations. Specifically, for any description of an environment containing a finite set of agents A, make the probability of ending up as a∈A, conditioned only on being one of the agents in that environment, constant across all a∈A. This way, the prior isn’t biased in favor of the agents that are easy to pick out.
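Here’s a minimal sketch of the kind of reweighting I mean (the environment names and weights are made up; the point is just that each agent in a given finite-agent environment gets an equal share of that environment’s prior weight):

```python
# Build a prior over (environment, agent) pairs from a prior over environments,
# splitting each environment's weight equally among its agents so that no agent
# is favoured merely for being easy to describe. Names and numbers are invented.

environment_prior = {
    "env_with_2_agents": 0.6,
    "env_with_3_agents": 0.4,
}
agents_in = {
    "env_with_2_agents": ["easy_to_describe_agent", "hard_to_describe_agent"],
    "env_with_3_agents": ["a1", "a2", "a3"],
}

agent_situation_prior = {
    (env, agent): p_env / len(agents_in[env])
    for env, p_env in environment_prior.items()
    for agent in agents_in[env]
}

# Conditioned on being in "env_with_2_agents", each agent gets probability 1/2,
# regardless of how short its description is.
for (env, agent), p in agent_situation_prior.items():
    print(env, agent, round(p, 3))
```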