I don’t think I understand why your system doesn’t require something along the lines of choosing a uniformly-random agent or place. Not necessarily exactly either of those things, but something of that kind. You said, in OP:
suppose you had no idea which agent in the universe it would be, what circumstances you would be in, or what your values would be, but you still knew you would be born into this universe.
How does that cash out if not in terms of picking a random agent, or random circumstances in the universe?
If I understand your comment correctly, you want to deal with that by picking a random description of a situation in the universe, which is just a random bit-string with some constraints on it, which you presumably do in something like the same way as choosing a random program when doing Solomonoff induction: cook up a prefix-free language for describing situations-in-the-universe, generate a random bit-string with each bit equally likely 0 or 1, and see what situation it describes.
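To make the "random bit-string through a prefix-free code" idea concrete, here is a minimal sketch. The two codes and the situation labels are made up for illustration; nothing in the discussion specifies them. It shows that the same fair coin flips give different distributions over situations depending on which code is chosen.

```python
import random

# Hypothetical prefix-free codes: no codeword is a prefix of another.
# The situation labels are placeholders, not anything from the discussion.
CODE_A = {"0": "situation-1", "10": "situation-2", "11": "situation-3"}
CODE_B = {"0": "situation-3", "10": "situation-2", "11": "situation-1"}


def is_prefix_free(code):
    """True if no codeword is a proper prefix of a different codeword."""
    words = list(code)
    return not any(a != b and b.startswith(a) for a in words for b in words)


def sample_situation(code, rng):
    """Flip fair bits until the bits read so far form a codeword.

    Terminates for the complete codes above, where every bit sequence
    eventually reaches a codeword.
    """
    bits = ""
    while bits not in code:
        bits += rng.choice("01")
    return code[bits]


def implied_probabilities(code):
    """A codeword of length L is reached with probability 2**-L."""
    return {situation: 2.0 ** -len(word) for word, situation in code.items()}


print(implied_probabilities(CODE_A))
print(implied_probabilities(CODE_B))
```

Under `CODE_A` the first situation gets half the probability mass; under `CODE_B` it gets a quarter. Which code counts as "reasonable" is exactly the open question.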
But now everything depends on the details of how descriptions map to actual situations, and I don’t see any canonical way to do that, or even any anything-like-canonical way to do it. (Compare the analogous issue with Solomonoff induction. There, everything depends on the underlying machine, but one can argue at-least-kinda-plausibly that if we consider “reasonable” candidates, the differences between them will quickly be swamped by all the actual evidence we get.) I don’t see anything like that happening here. What am I missing?
Your example with an AI generating people with a PRNG is, so far as it goes, fine. But the epistemic situation one needs to be in for that example to be relevant seems to me incredibly different from any epistemic situation anyone is ever really in. If our universe is running on a computer, we don’t know what computer or what program or what inputs produced it. We can’t do anything remotely like putting a uniform distribution on the internal states of the machine.
Further, your AI/PRNG example is importantly different from the infinitely-many-random-people example on which it’s based. You’re supposing that your AI’s PRNG has an internal state you can sample from uniformly at random! But that’s exactly the thing we can’t do in the randomly-generated-people example.
Further further, your prescription in this case is very much not the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here doesn’t actually produce an inhabitant of the universe in question. You’re replacing a difficult (indeed impossible) question—“how do things go, on average, for a random person in this universe?”—with an easier but different question—“how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?”. Maybe that’s a reasonable thing to do, but it is not what your theory as originally stated tells you to do and I don’t see any obvious reason why someone who accepted your theory as you originally stated it should behave as you’re now telling them they should.
Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there’s no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, …, then n! (un)happy people, then … . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can’t do. Picking a state at random seems like it means picking a random positive integer, which again you can’t do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn’t actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it’s less than. How does your system evaluate this universe?
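A small sketch of why this counter universe is awkward: tracking the fraction of happy people in creation order, after each completed block. The block sizes 0!, 1!, 2!, ... and the happy/unhappy alternation follow the description above; the function name is just for illustration.

```python
from fractions import Fraction
from math import factorial


def running_happy_fraction(num_blocks):
    """Fraction of happy people after each completed block.

    Block k (k = 0, 1, 2, ...) contains k! people, happy when k is even
    and unhappy when k is odd.
    """
    happy = total = 0
    fractions = []
    for k in range(num_blocks):
        size = factorial(k)
        total += size
        if k % 2 == 0:
            happy += size
        fractions.append(Fraction(happy, total))
    return fractions


for k, frac in enumerate(running_happy_fraction(12)):
    print(k, float(frac))
```

Because each block is much larger than everything before it, the running fraction swings back and forth, getting close to 1 after happy blocks and close to 0 after unhappy ones, so it does not settle on any limiting value.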
Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 the same universe up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing the same infinitely many people, just arranged differently, one of which is much better than the other. Really?
(Of course, in something that’s less of a toy model, the arrangement of people can matter a lot. It’s nice to be near to friends and far from enemies, for instance. But of course that isn’t what we’re talking about here; when we rearrange the people we do so in a way that preserves all their experiences and their level of happiness.)
It really should seem unreasonable to suppose that in the 99.9% universe there’s a 99.9% chance that you’ll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn’t fully grasped the fact that you can’t sample uniformly at random from an infinite population.
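The rearrangement point can be illustrated with a toy sketch. It assumes, as in the thought experiment, that people differ only in being happy or unhappy and that there is an endless supply of each; the function names are hypothetical. The same infinite population, listed in two different orders, shows very different "fractions happy" in any finite prefix.

```python
from itertools import islice


def arrange(happy_per_cycle, unhappy_per_cycle):
    """Yield an endless ordering: some happy people, then some unhappy, repeated.

    Both kinds are drawn from inexhaustible supplies, so every ordering
    produced this way eventually lists infinitely many of each kind.
    """
    while True:
        for _ in range(happy_per_cycle):
            yield "happy"
        for _ in range(unhappy_per_cycle):
            yield "unhappy"


def happy_fraction(ordering, n):
    """Fraction of happy people among the first n in the given ordering."""
    first_n = list(islice(ordering, n))
    return first_n.count("happy") / n


print(happy_fraction(arrange(999, 1), 100_000))  # mostly happy prefix
print(happy_fraction(arrange(1, 999), 100_000))  # mostly unhappy prefix
```

The frequencies here are properties of the ordering, not of the population being ordered.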
How does that cash out if not in terms of picking a random agent, or random circumstances in the universe?
So, remember, the moral value of the universe according to my ethical system depends on P(I’ll be satisfied | I’m some creature in this universe).
There must be some reasonable way to calculate this. And one that doesn’t rely on impossibly taking a uniform sample from a set that has none. Now, we haven’t fully formalized reasoning and priors yet. But there is some reasonable prior probability distribution over situations you could end up in. And after that you can just do a Bayesian update on the evidence “I’m in universe x”.
I mean, imagine you had some superintelligent AI that takes evidence and outputs probability distributions. And you provide the AI with evidence about what the universe it’s in is like, without letting it know anything about the specific circumstances it will end up in. There must be some reasonable probability for the AI to assign to outcomes. If there isn’t, then that means whatever probabilistic reasoning system the AI uses must be incomplete.
It really should seem unreasonable to suppose that in the 99.9% universe there’s a 99.9% chance that you’ll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn’t fully grasped the fact that you can’t sample uniformly at random from an infinite population.
I’m surprised you said this and interested in why. Could you explain what probability you would assign to being happy in that universe?
I mean, conditioning on being in that universe, I’m really not sure what else I would do. I know that I’ll end up with my happiness determined by some AI with a pseudorandom number generator. And I have no idea what the internal state of the random number generator will be. In Bayesian probability theory, the standard way to deal with this is to take a maximum entropy (i.e. uniform in this case) distribution over the possible states. And such a distribution would imply that I’d be happy with probability 99.9%. So that’s how I would reason about my probability of happiness using conventional probability theory.
Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there’s no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, …, then n! (un)happy people, then … . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can’t do. Picking a state at random seems like it means picking a random positive integer, which again you can’t do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn’t actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it’s less than. How does your system evaluate this universe?
I’m not entirely sure how my system would evaluate this universe, but that’s due to my own uncertainty about what specific prior to use and its implications.
But I’ll take a stab at it. I see the counter alternates through periods of making happy people and periods of making unhappy people. I have no idea which period I’d end up being in, so I think I’d use the principle of indifference to assign probability 0.5 to both. If I’m in the happy period, then I’d end up happy, and if I’m in the unhappy period, I’d end up unhappy. So I’d assign probability approximately 0.5 to ending up happy.
Further further, your prescription in this case is very much not the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here doesn’t actually produce an inhabitant of the universe in question. You’re replacing a difficult (indeed impossible) question—“how do things go, on average, for a random person in this universe?”—with an easier but different question—“how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?”. Maybe that’s a reasonable thing to do, but it is not what your theory as originally stated tells you to do and I don’t see any obvious reason why someone who accepted your theory as you originally stated it should behave as you’re now telling them they should.
Oh, I had in mind that the internal state of the pseudorandom number generator was finite, and that each pseudorandom number generator was only used finitely-many times. For example, maybe each AI on its world had its own pseudorandom number generator.
And I don’t see how else I could interpret this. I mean, if the pseudorandom number generator is used infinitely-many times, then it couldn’t have outputted “happy” 99.9% of the time and “unhappy” 0.1% of the time. With infinitely-many outputs, it would output “happy” infinitely-many times and output “unhappy” infinitely-many times, and thus the proportion it outputs “happy” or “unhappy” would be undefined.
Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 the same universe up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing the same infinitely many people, just arranged differently, one of which is much better than the other. Really?
Yep. And I don’t think there’s any way around this. When talking about infinite ethics, we’ve had in mind a canonically infinite universe: one in which, for every level of happiness, suffering, satisfaction, and dissatisfaction, there exist infinitely many agents with that level. It looks like this is the sort of universe we’re stuck in.
So then there’s no difference in terms of moral value of two canonically-infinite universes except the patterning of value. So if you want to compare the moral value of two canonically-infinite universes, there’s just nothing you can do except to consider the patterning of values. That is, unless you want to consider any two canonically-infinite universes to be of equivalent moral value, which doesn’t seem like an intuitively desirable idea.
The problem with some of the other infinite ethical systems I’ve seen is that they would morally recommend redistributing unhappy agents extremely thinly in the universe, rather than actually try to make them happy, provided this was easier. As discussed in my article, my ethical system provides some degree of defense against this, which seems to me like a very important benefit.
There must be some reasonable way to calculate this.
(where “this” is Pr(I’m satisfied | I’m some being in such-and-such a universe)) Why must there be? I agree that it would be nice if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.
Does whatever argument or intuition leads you to say that there must be a reasonable way to calculate Pr(X is satisfied | X is a being in universe U) also tell you that there must be a reasonable way to calculate Pr(X is even | X is a positive integer)? How about Pr(the smallest n with x ≤ n! is even | x is a positive integer)?
I should maybe be more explicit about my position here. Of course there are ways to give a meaning to such expressions. For instance, we can suppose that the integer n occurs with probability 2^-n, and then e.g. if I’ve done my calculations right then the second probability is the sum of 2^-0! + (2^-2!-2^-3!) + (2^-4!-2^-5!) + … which presumably doesn’t have a nice closed form (it’s transcendental for sure) but can be calculated to high precision very easily. But that doesn’t mean that there’s any such thing as the way to give meaning to such an expression. We could use some other sequence of weights adding up to 1 instead of the powers of 1⁄2, for instance, and we would get a substantially different answer. And if the objects of interest to us were beings in universe U rather than positive integers, they wouldn’t come equipped with a standard order to look at them in.
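As a quick illustration of the dependence on weights, here is a sketch for the simpler question, Pr(X is even | X is a positive integer). The two weight sequences are arbitrary choices made for this example (each sums to 1), and the series is truncated, which is harmless for weights that decay this fast.

```python
from fractions import Fraction


def prob_even(weight, terms=200):
    """Approximate Pr(X is even) when Pr(X = n) = weight(n) for n = 1, 2, ...

    Sums the even-indexed weights up to `terms`; the neglected tail is
    negligible for the geometric weights used below.
    """
    return sum(weight(n) for n in range(2, terms + 1, 2))


def geometric_half(n):
    """Weights 1/2, 1/4, 1/8, ... (sum to 1)."""
    return Fraction(1, 2 ** n)


def geometric_third(n):
    """Weights 2/3, 2/9, 2/27, ... (sum to 1)."""
    return Fraction(2, 3 ** n)


print(float(prob_even(geometric_half)))   # close to 1/3
print(float(prob_even(geometric_third)))  # close to 1/4
```

Neither answer is more "correct" than the other; each is just what its weighting implies.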
Why should we expect there to be a well-defined answer to the question “what fraction of these beings are satisfied”?
Could you explain what probability you would assign to being happy in that universe?
No, because I do not assign any probability to being happy in that universe. I don’t know a good way to assign such probabilities and strongly suspect that there is none.
You suggest doing maximum entropy on the states of the pseudorandom number generator being used by the AI making this universe. But when I was describing that universe I said nothing about AIs and nothing about pseudorandom number generators. If I am contemplating being in such a universe, then I don’t know how the universe is being generated, and I certainly don’t know the details of any pseudorandom number generator that might be being used.
Suppose there is a PRNG, but an infinite one somehow, and suppose its state is a positive integer (of arbitrary size). (Of course this means that the universe is not being generated by a computing device of finite capabilities. Perhaps you want to exclude such possibilities from consideration, but if so then you might equally well want to exclude infinite universes from consideration: a finite machine can’t e.g. generate a complete description of what happens in an infinite universe. If you’re bothering to consider infinite universes at all, I think you should also be considering universes that aren’t generated by finite computational processes.)
Well, in this case there is no uniform prior over the states of the PRNG. OK, you say, let’s take the maximum-entropy prior instead. That would mean (p_k) minimizing sum p_k log p_k subject to the sum of p_k being 1. Unfortunately there is no such (p_k). If we take p_k = 1/n for k=1..n and 0 for larger k, the sum is log 1/n which → −∞ as n → ∞. In other words, we can make the entropy of (p_k) as large as we please.
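A numerical check of that last step, as a minimal sketch (the helper names are illustrative): the entropy of the uniform distribution on the first n positive integers is log n, which keeps growing with n, so no distribution on all positive integers attains a maximum.

```python
from math import log


def entropy(probabilities):
    """Shannon entropy in nats: -sum p log p, skipping zero-probability terms."""
    return -sum(p * log(p) for p in probabilities if p > 0)


def uniform_on_first(n):
    """Uniform distribution on {1, ..., n}; all larger integers get probability 0."""
    return [1.0 / n] * n


for n in (10, 1_000, 100_000):
    print(n, entropy(uniform_on_first(n)), log(n))
```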
You might suppose (arbitrarily, it seems to me) that the integer that’s the state of our PRNG is held in an infinite sequence of bits, and choose each bit at random. But then with probability 1 you get an impossible state of the RNG, and for all we know the AI’s program might look like “if PRNG state is a finite positive integer, use it to generate a number between 0 and 1 and make our being happy if that number is ≤ 0.999; if PRNG state isn’t a finite positive integer, put our being in hell”.
I mean, if the pseudorandom number generator is used infinitely many times, then [...] it would output “happy” infinitely many times and output “unhappy” infinitely many times, and thus the proportion it outputs “happy” or “unhappy” would be undefined.
Yes, exactly! When I described this hypothetical world, I didn’t say “the probability that a being in it is happy is 99.9%”. I said “a biased coin-flip determines the happiness of each being in it, choosing ‘happy’ with probability 99.9%”. Or words to that effect. This is, so far as I can see, a perfectly coherent (albeit partial!) specification of a possible world. And it does indeed have the property that “the probability that a being in it is happy” is not well defined.
This doesn’t mean the scenario is improper somehow. It means that any ethical (or other) system that depends on evaluating such probabilities will fail when presented with such a universe. Or, for that matter, pretty much any universe with infinitely many beings in it.
there’s just nothing you can do except to consider the patterning of values.
But then I don’t see that you’ve explained how your system considers the patterning of values. In the OP you just talk about the probability that a being in such-and-such a universe is satisfied; and that probability is typically not defined. Here in the comments you’ve been proposing something involving knowing the PRNG used by the AI that generated the universe, and sampling randomly from the outputs of that PRNG; but (1) this implies being in an epistemic situation completely unlike any that any real agent is ever in, (2) nothing like this can work (so far as I can see) unless you know that the universe you’re considering is being generated by some finite computational process, and if you’re going to assume that you might as well assume a finite universe to begin with and avoid having to deal with infinite ethics at all, (3) I don’t understand how your “look at the AI’s PRNG” proposal generalizes to non-toy questions, and (4) even if (1-3) are resolved somehow, it seems like it requires a literally infinite amount of computation to evaluate any given universe. (Which is especially problematic when we are assuming we are in a universe generated by a finite computational process.)
You say, “There must be some reasonable way to calculate this.”
(where “this” is Pr(I’m satisfied | I’m some being in such-and-such a universe)) Why must there be? I agree that it would be nice if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.
To use probability theory to form accurate beliefs, we need a prior. I didn’t think this was controversial. And if you have a prior, as far as I can tell, you can then compute Pr(I’m satisfied | I’m some being in such-and-such a universe) by simply updating on “I’m some being in such-and-such a universe” using Bayes’ theorem.
That is, you need to have some prior probability distribution over concrete specifications of the universe you’re in and your situation in it. Now, to update on “I’m some being in such-and-such a universe”, just look at each concrete possible situation-and-universe and set P(“I’m some being in such-and-such a universe” | some concrete hypothesis) to 0 if the hypothesis specifies you’re in some universe other than the such-and-such universe, and to 1 if it does specify you are in such a universe. As long as the possible universes are specified sufficiently precisely, then I don’t see why you couldn’t do this.
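The update described here can be sketched on a toy finite prior. The prior below is entirely made up for illustration, and the sketch simply assumes that some prior over (universe, situation) hypotheses is already given, which is the part under dispute.

```python
def condition_on_universe(prior, universe):
    """Posterior over (universe, situation) hypotheses, given the universe.

    `prior` maps (universe, situation) pairs to prior probabilities.
    The likelihood is 1 for hypotheses in `universe` and 0 otherwise,
    so the update amounts to dropping the others and renormalising.
    """
    kept = {h: p for h, p in prior.items() if h[0] == universe}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("prior assigns zero probability to this universe")
    return {h: p / total for h, p in kept.items()}


# Made-up toy prior, purely for illustration.
toy_prior = {
    ("U1", "satisfied"): 0.3,
    ("U1", "dissatisfied"): 0.1,
    ("U2", "satisfied"): 0.2,
    ("U2", "dissatisfied"): 0.4,
}

posterior = condition_on_universe(toy_prior, "U1")
print(posterior[("U1", "satisfied")])
```

With this prior, Pr(satisfied | U1) comes out to about 0.75; a different prior would of course give a different number.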
OK, so I think I now understand your proposal better than I did.
So if I’m contemplating making the world be a particular way, you then propose that I should do the following calculation (as always, of course I can’t do it because it’s uncomputable, but never mind that):
Consider all possible computable experience-streams that a subject-of-experiences could have.
Consider them, specifically, as being generated by programs drawn from a universal distribution.
Condition on being in the world that’s the particular way I’m contemplating making it—that is, discard experience-streams that are literally inconsistent with being in that world.
We now have a probability distribution over experience-streams. Compute a utility for each, and take its expectation.
And now we compare possible universes by comparing this expected utility.
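The steps above can be sketched on a finite toy example. Everything here is an assumption made for illustration: the weight 2**-length stands in for the (uncomputable) universal prior, and the streams, their program lengths, and their utilities are invented.

```python
def expected_utility(streams, consistent_with_world):
    """Toy version of the procedure over a finite list of experience-streams.

    Each stream is a tuple (program_length_in_bits, utility, label).
    Streams inconsistent with the world are discarded; the rest are
    weighted by 2**-length, renormalised, and averaged.
    """
    kept = [s for s in streams if consistent_with_world(s)]
    total_weight = sum(2.0 ** -length for length, _, _ in kept)
    if total_weight == 0:
        raise ValueError("no stream is consistent with the world")
    weighted = sum(2.0 ** -length * utility for length, utility, _ in kept)
    return weighted / total_weight


# Hypothetical streams: one short (probable) pleasant one and
# fifty long (improbable) unpleasant ones.
toy_streams = [(3, +1.0, "pleasant")]
toy_streams += [(10, -1.0, f"unpleasant-{i}") for i in range(50)]

print(expected_utility(toy_streams, lambda s: True))
```

Note that the result depends only on which streams are possible and how probable their generating programs are, not on how many copies of each exist in the world.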
(Having failed to understand your proposal correctly before, I am not super-confident that I’ve got it right now. But let’s suppose I have and run with it. You can correct me if not. In that case, some or all of what follows may be irrelevant.)
I agree that this seems like it will (aside from concerns about uncomputability, and assuming our utilities are bounded) yield a definite value for every possible universe. However, it seems to me that it has other serious problems which stop me finding it credible.
SCENARIO ONE. So, for instance, consider once again a world in which there are exactly two sorts of experience-subject, happy and unhappy. Traditionally we suppose infinitely many of both, but actually let’s also consider possible worlds where there is just one happy experience-subject, or just one unhappy one. All these worlds come out exactly the same, so “infinitely many happy, one unhappy” is indistinguishable from “infinitely many unhappy, one happy”. That seems regrettable, but it’s a bullet I can imagine biting—perhaps we just don’t care at all about multiple instantiations of the exact same stream of experiences: it’s just the same person and it’s a mistake to think of them as contributing separately to the goodness of the universe.
So now let’s consider some variations on this theme.
SCENARIO TWO. Suppose I think up an infinite (or for that matter merely very large) number of highly improbable experience-streams that one might have, all of them unpleasant. And I find a single rather probable experience-stream, a pleasant one, whose probability (according to our universal prior) is greater than the sum of those other ones. If I am contemplating bringing into being a world containing exactly the experience-streams described in this paragraph, then it seems that I should, because the expected net utility is positive, at least if the pleasantness and unpleasantness of the experiences in question are all about equal.
To me, this seems obviously crazy. Perhaps there’s some reason why this scenario is incoherent (e.g., maybe somehow I shouldn’t be able to bring into being all those very unlikely beings, at least not with non-negligible probability, so it shouldn’t matter much what happens if I do, or something), but at present I don’t see how that would work out.
The problem in SCENARIO TWO seems to arise from paying too much attention to the prior probability of the experience-subjects. We can also get into trouble by not paying enough attention to their posterior probability, in some sense.
SCENARIO THREE. I have before me a switch with two positions, placed there by the Creator of the Universe. They are labelled “Nice” and “Nasty”. The CotU explains to me that the creation of future experience-subjects will be controlled by a source of True Randomness (whatever exactly that might be), in such a way that all possible computable experience-subjects have a real chance of being instantiated. The CotU has designed two different prefix-free codes mapping strings of bits to possible experience-subjects; then he has set a Truly Random coin to flip for ever, generating a new experience-subject every time a leaf of the code’s binary tree is reached, so that we get an infinite number of experience-subjects generated at random, with a distribution depending on the prefix-free code being used. The Nice and Nasty settings of the switch correspond to two different codes. The CotU has computed that with the switch in the “Nice” position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the “Nasty” position, it’s large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.
In this case, our conditioning doesn’t remove any possible experience-subjects from consideration, so we are indifferent between the “Nice” and “Nasty” settings of the switch.
This is another one where we might be right to bite the bullet. In the long run infinitely many of every possible experience-subject will be created in each version of the universe, so maybe these two universes are “anagrams” of one another and should be considered equal. So let’s tweak it.
SCENARIO FOUR. Same as in SCENARIO THREE, except that now the CotU’s generator will run until it has produced a trillion experience-subjects and then shut off for ever.
It is still the case that with the switch in either setting any experience-subject is possible, so we don’t get to throw any of them out. But it’s no longer the case that the universes generated in the “Nice” and “Nasty” versions are with probability 1 (or indeed with not-tiny probability) identical in any sense.
So far, these scenarios all suppose that somehow we are able to generate arbitrary sets of possible experience-subjects, and arrange for those to be all the experience-subjects there are, or at least all there are after we make whatever decision we’re making. That’s kinda artificial.
SCENARIO FIVE. Our universe, just as it is now. We assume, though, that our universe is in fact infinite. You are trying to decide whether to torture me to death.
So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don’t. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe. Perhaps in the version of the world where you torture me to death this makes you more likely to do other horrible things, or makes other people who care for me suffer more, but again none of this makes any experiences impossible that would otherwise have been possible, or vice versa. So our universe-evaluator is indifferent between these choices.
(The possibly-overcomplicated business in one of my other comments, where I tried to consider doing something Solomonoff-like using both my experiences and those of some hypothetical possibly-other experience-subject in the world, was intended to address these problems caused by considering only possibility and not anything stronger. I couldn’t see how to make it work, though.)
All these worlds come out exactly the same, so “infinitely many happy, one unhappy” is indistinguishable from “infinitely many unhappy, one happy”
It’s not clear to me how they are indistinguishable. As long as the agent that’s unhappy can have itself and its circumstances described with a finite description length, then it would have non-zero probability of an agent ending up as that one. Thus, making the agent unhappy would decrease the moral value of the world.
I’m not sure what would happen if the single unhappy agent has infinite complexity and 0 probability. But I suspect that this could be dealt with if you expanded the system to also consider non-real probabilities. I’m no expert on non-real probabilities, but I bet the probability of being unhappy in the world with an unhappy agent would be infinitesimally higher than in the world in which there are no unhappy agents.
RE: scenario two:
It’s not clear to me how this is crazy. For example, consider this situation: when agents are born, an AI flips a biased coin to determine what will happen to them. Each coin has a 99.999% chance of landing on heads and a 0.001% chance of landing on tails. If the coin lands on heads, the AI will give the agent some very pleasant experience stream, and all such agents will get the same pleasant experience stream. But if it lands on tails, the AI will give the agent some unpleasant experience stream that is also very different from the other unpleasant ones.
This sounds like a pretty good situation to me. It’s not clear to me why it wouldn’t be. I mean, I don’t see why the diversity of the positive experiences matters. And if you do care about the diversity of positive experiences, this would have unintuitive results. For example, suppose all agents have identical preferences and their satisfaction is maximized by experience stream S. Well, if you have a problem with the satisfied agents having just one experience stream, then you would be incentivized to coerce the agents to instead have a variety of different experience streams, even if they didn’t like these experience streams as much.
RE: scenario three:
The CotU has computed that with the switch in the “Nice” position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the “Nasty” position, it’s large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.
I don’t follow your reasoning. You just said that in the “Nice” position the expected utility is large and positive, and in the “Nasty” position it’s large and negative. And since my ethical system seeks to maximize the expected value of life satisfaction, it seems trivial to me that it would prefer the “Nice” position.
Whether or not you switch it to the “Nice” position won’t rule out any possible outcomes for an agent, but it seems pretty clear that it would change the probabilities of them.
RE: scenario four:
My ethical system would prefer the “Nice” position for the same reason described in scenario three.
RE: scenario five:
So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don’t. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe.
Though none of the experience streams are impossible, the probability of you getting tortured is still higher conditioning on me deciding to torture you. To see why, note the situation, “Is someone just like Slider who is vulnerable to being tortured by demon lord Chantiel”. This has finite description length, and thus non-zero probability. And if I decide to torture you, then the probability of you getting tortured if you end up in this situation is high. Thus, the total expected value of life satisfaction would be lower if I decided to torture you. So my ethical system would recommend not torturing you.
In general, don’t worry about if an experience stream is possible or not. In an infinite universe with quantum noise, I think pretty much all experience streams would occur with non-zero probability. But you can still adjust the probabilities of an agent ending up with the different streams.
It sounds as if my latest attempt at interpreting what your system proposes doing is incorrect, because the things you’re disagreeing with seem to me to be straightforward consequences of that interpretation. Would you like to clarify how I’m misinterpreting now?
Here’s my best guess.
You wrote about specifications of an experience-subject’s universe and situation in it. I mentally translated that to their stream of experiences because I’m thinking in terms of Solomonoff induction. Maybe that’s a mistake.
So let’s try again. The key thing in your system is not a program that outputs a hypothetical being’s stream of experiences, it’s a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that’s probably OK.
So that ought to give a well-defined (modulo the usual stuff about uncomputability) probability distribution over experience-subjects-in-universes. And then you want to condition on “being in a universe with such-and-such characteristics” (which may or may not specify the universe itself completely) and look at the expected utility-or-utility-like-quantity of all those experience-subjects-in-universes after you rule out the universes without such-and-such characteristics.
It’s now stupid-o’-clock where I am and I need to get some sleep. I’m posting this even though I haven’t had time to think about whether my current understanding of your proposal seems like it might work, because on past form there’s an excellent chance that said understanding is wrong, so this gives you more time to tell me so if it is :-). If I don’t hear from you that I’m still getting it all wrong, I’ll doubtless have more to say later...
So let’s try again. The key thing in your system is not a program that outputs a hypothetical being’s stream of experiences, it’s a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that’s probably OK.
That’s closer to what I meant. By “experience-subject”, I think you mean a specific agent at a specific time. If so, my system doesn’t require an unambiguous specification of an experience-subject.
My system doesn’t require you to pinpoint the exact agent. Instead, it only requires you to specify a (reasonably-precise) description of an agent and its circumstances. This doesn’t mean picking out a single agent, as there may be infinitely-many agents that satisfy such a description.
As an example, a description could be something like, “Someone named gjm in a 2021-Earth-like world with personality <insert a description of your personality and thoughts> who has <insert description of your life experiences> and is currently <insert description of how your life is currently>”
This doesn’t pick out a single individual. There are probably infinitely-many gjms out there. But as long as the description is precise enough, you can still infer your probable eventual life satisfaction.
But other than that, your description seems pretty much correct.
It’s now stupid-o’-clock where I am and I need to get some sleep.
I feel you. I also posted something at stupid-o’-clock and then woke up at 5am, realized I messed up, and then edited a comment and hoped no one saw the previous error.
No, I don’t intend “experience-subject” to pick out a specific time. (It’s not obvious to me whether a variant of your system that worked that way would be better or worse than your system as it is.) I’m using that term rather than “agent” because, as I think you point out in the OP, what matters for moral relevance is having experiences rather than performing actions.
So, anyway, I think I now agree that your system does indeed do approximately what you say it does, and many of my previous criticisms do not in fact apply to it; my apologies for the many misunderstandings.
The fact that it’s lavishly uncomputable is a problem for using it in practice, of course :-).
I have some other concerns, but haven’t given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe physically bigger experience subjects get treated as more important because they’re “easier to locate”, and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?
The fact that it’s lavishly uncomputable is a problem for using it in practice, of course :-).
Yep. To be fair, though, I suspect any ethical system that respects agents’ arbitrary preferences would also be incomputable. As a silly example, consider an agent whose terminal values are, “If Turing machine T halts, I want nothing more than to jump up and down. However, if it doesn’t halt, then it is of the utmost importance to me that I never jump up and down and instead sit down and frown.” Then any ethical system that cares about those preferences is incomputable.
Now this is a pretty silly example, but I wouldn’t be surprised if there were more realistic ones. For one, it’s important to respect other agents’ moral preferences, and I wouldn’t be surprised if their ideal moral-preferences-on-infinite-reflection would be incomputable. It seems to me that moral philosophers act as some approximation of, “Find the simplest model of morality that mostly agrees with my moral intuitions”. If they include incomputable models, or arbitrary Turing machines that may or may not halt, then the moral value of the world to them would in fact be incomputable, so any ethical system that cares about preferences-given-infinite-reflection would also be incomputable.
I have some other concerns, but haven’t given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe physically bigger experience subjects get treated as more important because they’re “easier to locate”, and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?
I’m not that worried about agents that are physically bigger, but it’s true that there may be some agents or agent descriptions in situations that are easier to pick out (in terms of having a short description length) than others. Maybe there’s something really special about the agent that makes it easy to pin down.
I’m not entirely sure if this would be a bug or a feature. But if it’s a bug, I think it could be dealt with by just choosing the right prior over agent-situations. Specifically, for any description of an environment with finitely-many agents A, make the probability of ending up as a∈A, conditioned only on being one of the agents in that environment, constant for all a∈A. This way, the prior isn’t biased in favor of the agents that are easy to pick out.
I don’t think I understand why your system doesn’t require something along the lines of choosing a uniformly-random agent or place. Not necessarily exactly either of those things, but something of that kind. You said, in OP:
How does that cash out if not in terms of picking a random agent, or random circumstances in the universe?
If I understand your comment correctly, you want to deal with that by picking a random description of a situation in the universe, which is just a random bit-string with some constraints on it, which you presumably do in something like the same way as choosing a random program when doing Solomonoff induction: cook up a prefix-free language for describing situations-in-the-universe, generate a random bit-string with each bit equally likely 0 or 1, and see what situation it describes.
But now everything depends on the details of how descriptions map to actual situations, and I don’t see any canonical way to do that or any anything-like-canonical way to do it. (Compare the analogous issue with Solomonoff induction. There, everything depends on the underlying machine, but one can argue at-least-kinda-plausibly that if we consider “reasonable” candidates, the differences between them will quickly be swamped by all the actual evidence we get. I don’t see anything like that happening here.) What am I missing?
Your example with an AI generating people with a PRNG is, so far as it goes, fine. But the epistemic situation one needs to be in for that example to be relevant seems to me incredibly different from any epistemic situation anyone is ever really in. If our universe is running on a computer, we don’t know what computer or what program or what inputs produced it. We can’t do anything remotely like putting a uniform distribution on the internal states of the machine.
Further, your AI/PRNG example is importantly different from the infinitely-many-random-people example on which it’s based. You’re supposing that your AI’s PRNG has an internal state you can sample from uniformly at random! But that’s exactly the thing we can’t do in the randomly-generated-people example.
Further further, your prescription in this case is very much not the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here doesn’t actually produce an inhabitant of the universe in question. You’re replacing a difficult (indeed impossible) question—“how do things go, on average, for a random person in this universe?”—with an easier but different question—“how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?”. Maybe that’s a reasonable thing to do, but it is not what your theory as originally stated tells you to do and I don’t see any obvious reason why someone who accepted your theory as you originally stated it should behave as you’re now telling them they should.
Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there’s no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, …, then n! (un)happy people, then … . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can’t do. Picking a state at random seems like it means picking a random positive integer, which again you can’t do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn’t actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it’s less than. How does your system evaluate this universe?
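To make the difficulty with the counter universe concrete, here is a minimal Python sketch. The function name and the choice to stop after twelve blocks are illustrative assumptions, not part of the scenario. It records the fraction of happy people among everyone created so far, after each block of k! people, with even-numbered blocks happy and odd-numbered blocks unhappy.

```python
from math import factorial

def happy_fraction_after_each_block(num_blocks):
    """Fraction of happy people among all people created so far,
    recorded after each block. Block k has k! people; even-numbered
    blocks are happy and odd-numbered blocks are unhappy."""
    happy = 0
    total = 0
    fractions = []
    for k in range(num_blocks):
        size = factorial(k)
        total += size
        if k % 2 == 0:
            happy += size
        fractions.append(happy / total)
    return fractions

fractions = happy_fraction_after_each_block(12)
for k, f in enumerate(fractions):
    print(f"after block {k} ({factorial(k)} people): {f:.4f} happy")
```

The running fraction swings toward 1 after each happy block and toward 0 after each unhappy block, so it never settles on a limit that one could call "the" proportion of happy people.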
Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 the same universe up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing the same infinitely many people, just arranged differently, one of which is much better than the other. Really?
(Of course, in something that’s less of a toy model, the arrangement of people can matter a lot. It’s nice to be near to friends and far from enemies, for instance. But of course that isn’t what we’re talking about here; when we rearrange the people we do so in a way that preserves all their experiences and their level of happiness.)
It really should seem unreasonable to suppose that in the 99.9% universe there’s a 99.9% chance that you’ll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn’t fully grasped the fact that you can’t sample uniformly at random from an infinite population.
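The rearrangement point can be sketched in a few lines of Python. This is a deterministic simplification rather than the biased-coin universe itself, and the labels and function names are hypothetical: the same countably infinite supply of happy and unhappy people is listed in two different orders, and the fraction of happy people in a long prefix depends entirely on the order.

```python
from itertools import islice

def arrangement(happy_run, unhappy_run):
    """Yield every person exactly once, in repeating runs:
    `happy_run` happy people, then `unhappy_run` unhappy people.
    People are labelled ("happy", i) or ("unhappy", i) for i = 0, 1, 2, ..."""
    next_happy = 0
    next_unhappy = 0
    while True:
        for _ in range(happy_run):
            yield ("happy", next_happy)
            next_happy += 1
        for _ in range(unhappy_run):
            yield ("unhappy", next_unhappy)
            next_unhappy += 1

def prefix_happy_fraction(people, n):
    """Fraction of happy people among the first n in the given ordering."""
    prefix = list(islice(people, n))
    return sum(1 for kind, _ in prefix if kind == "happy") / n

mostly_happy = prefix_happy_fraction(arrangement(999, 1), 1_000_000)
mostly_unhappy = prefix_happy_fraction(arrangement(1, 999), 1_000_000)
print(mostly_happy, mostly_unhappy)
```

Both orderings eventually list every ("happy", i) and every ("unhappy", i), so they contain exactly the same people; only the order of enumeration differs, and that alone moves the prefix frequency from about 99.9% to about 0.1%.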
There must be some reasonable way to calculate this. And one that doesn’t rely on impossibly taking a uniform sample from a set that has none. Now, we haven’t fully formalized reasoning and priors yet. But there is some reasonable prior probability distribution over situations you could end up in. And after that you can just do a Bayesian update on the evidence “I’m in universe x”.
I mean, imagine you had some superintelligent AI that takes evidence and outputs probability distributions. And you provide the AI with evidence about what the universe it’s in is like, without letting it know anything about the specific circumstances it will end up in. There must be some reasonable probability for the AI to assign to outcomes. If there isn’t, then that means whatever probabilistic reasoning system the AI uses must be incomplete.
I’m surprised you said this and interested in why. Could you explain what probability you would assign to being happy in that universe?
I mean, conditioning on being in that universe, I’m really not sure what else I would do. I know that I’ll end up with my happiness determined by some AI with a pseudorandom number generator. And I have no idea what the internal state of the random number generator will be. In Bayesian probability theory, the standard way to deal with this is to take a maximum entropy (i.e. uniform in this case) distribution over the possible states. And such a distribution would imply that I’d be happy with probability 99.9%. So that’s how I would reason about my probability of happiness using conventional probability theory.
I’m not entirely sure how my system would evaluate this universe, but that’s due to my own uncertainty about what specific prior to use and its implications.
But I’ll take a stab at it. I see the counter alternates through periods of making happy people and periods of making unhappy people. I have no idea which period I’d end up being in, so I think I’d use the principle of indifference to assign probability 0.5 to both. If I’m in the happy period, then I’d end up happy, and if I’m in the unhappy period, I’d end up unhappy. So I’d assign probability approximately 0.5 to ending up happy.
Oh, I had in mind that the internal state of the pseudorandom number generator was finite, and that each pseudorandom number generator was only used finitely-many times. For example, maybe each AI on its world had its own pseudorandom number generator.
And I don’t see how else I could interpret this. I mean, if the pseudorandom number generator is used infinitely-many times, then it couldn’t have outputted “happy” 99.9% of the time and “unhappy” 0.1% of the time. With infinitely-many outputs, it would output “happy” infinitely-many times and output “unhappy” infinitely-many times, and thus the proportion it outputs “happy” or “unhappy” would be undefined.
Yep. And I don’t think there’s any way around this. When talking about infinite ethics, we’ve had in mind a canonically infinite universe: one in which, for every level of happiness, suffering, satisfaction, and dissatisfaction, there exist infinitely many agents with that level. It looks like this is the sort of universe we’re stuck in.
So then there’s no difference in terms of moral value of two canonically-infinite universes except the patterning of value. So if you want to compare the moral value of two canonically-infinite universes, there’s just nothing you can do except to consider the patterning of values. That is, unless you want to consider any two canonically-infinite universes to be of equivalent moral value, which doesn’t seem like an intuitively desirable idea.
The problem with some of the other infinite ethical systems I’ve seen is that they would morally recommend redistributing unhappy agents extremely thinly in the universe, rather than actually try to make them happy, provided this was easier. As discussed in my article, my ethical system provides some degree of defense against this, which seems to me like a very important benefit.
You say
(where “this” is Pr(I’m satisfied | I’m some being in such-and-such a universe)) Why must there be? I agree that it would be nice if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.
Does whatever argument or intuition leads you to say that there must be a reasonable way to calculate Pr(X is satisfied | X is a being in universe U) also tell you that there must be a reasonable way to calculate Pr(X is even | X is a positive integer)? How about Pr(the smallest n with x ≤ n! is even | x is a positive integer)?
I should maybe be more explicit about my position here. Of course there are ways to give a meaning to such expressions. For instance, we can suppose that the integer n occurs with probability 2^-n, and then e.g. if I’ve done my calculations right then the second probability is the sum of 2^-0! + (2^-2!-2^-3!) + (2^-4!-2^-5!) + … which presumably doesn’t have a nice closed form (it’s transcendental for sure) but can be calculated to high precision very easily. But that doesn’t mean that there’s any such thing as the way to give meaning to such an expression. We could use some other sequence of weights adding up to 1 instead of the powers of 1⁄2, for instance, and we would get a substantially different answer. And if the objects of interest to us were beings in universe U rather than positive integers, they wouldn’t come equipped with a standard order to look at them in.
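As a small illustration of how much the answer depends on the choice of weights, here is a Python sketch for the first of those two probabilities, Pr(X is even | X is a positive integer). The geometric family of weights and the function name are assumptions made for the example; the powers of 1⁄2 are the special case p = 1/2.

```python
def prob_even(p, terms=200):
    """Pr(X is even) when Pr(X = n) = p * (1 - p)**(n - 1) for n = 1, 2, 3, ...
    The series is truncated after `terms` terms; the tail is negligible
    for the values of p used below."""
    return sum(p * (1 - p) ** (n - 1) for n in range(2, terms + 1, 2))

print(prob_even(1 / 2))   # weights 1/2, 1/4, 1/8, ...
print(prob_even(2 / 3))   # weights 2/3, 2/9, 2/27, ...
```

For this family the exact value is (1 - p) / (2 - p), so the powers of 1⁄2 give 1/3 while the weights 2/3, 2/9, 2/27, … give 1/4. Both are perfectly legitimate sequences of weights adding up to 1, and nothing about the positive integers themselves favours one over the other.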
Why should we expect there to be a well-defined answer to the question “what fraction of these beings are satisfied”?
No, because I do not assign any probability to being happy in that universe. I don’t know a good way to assign such probabilities and strongly suspect that there is none.
You suggest doing maximum entropy on the states of the pseudorandom number generator being used by the AI making this universe. But when I was describing that universe I said nothing about AIs and nothing about pseudorandom number generators. If I am contemplating being in such a universe, then I don’t know how the universe is being generated, and I certainly don’t know the details of any pseudorandom number generator that might be being used.
Suppose there is a PRNG, but an infinite one somehow, and suppose its state is a positive integer (of arbitrary size). (Of course this means that the universe is not being generated by a computing device of finite capabilities. Perhaps you want to exclude such possibilities from consideration, but if so then you might equally well want to exclude infinite universes from consideration: a finite machine can’t e.g. generate a complete description of what happens in an infinite universe. If you’re bothering to consider infinite universes at all, I think you should also be considering universes that aren’t generated by finite computational processes.)
Well, in this case there is no uniform prior over the states of the PRNG. OK, you say, let’s take the maximum-entropy prior instead. That would mean (p_k) minimizing sum p_k log p_k subject to the sum of p_k being 1. Unfortunately there is no such (p_k). If we take p_k = 1/n for k=1..n and 0 for larger k, the sum is log(1/n), which → −∞ as n → ∞. In other words, we can make the entropy of (p_k) as large as we please.
You might suppose (arbitrarily, it seems to me) that the integer that’s the state of our PRNG is held in an infinite sequence of bits, and choose each bit at random. But then with probability 1 you get an impossible state of the RNG, and for all we know the AI’s program might look like “if PRNG state is a finite positive integer, use it to generate a number between 0 and 1 and make our being happy if that number is ≤ 0.999; if PRNG state isn’t a finite positive integer, put our being in hell”.
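A quick numerical check of the entropy claim, as a minimal Python sketch (the function name is just illustrative): the entropy of the uniform distribution on {1, ..., n} equals log n, which grows without bound.

```python
from math import log

def entropy_of_uniform(n):
    """Shannon entropy (in nats) of the uniform distribution on {1, ..., n},
    computed directly as -sum p_k log p_k."""
    p = 1 / n
    return -sum(p * log(p) for _ in range(n))

for n in (10, 1000, 100000):
    print(n, entropy_of_uniform(n), log(n))
```

Since every finite n can be beaten by a larger one, no distribution on the positive integers attains the maximum.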
Yes, exactly! When I described this hypothetical world, I didn’t say “the probability that a being in it is happy is 99.9%”. I said “a biased coin-flip determines the happiness of each being in it, choosing ‘happy’ with probability 99.9%”. Or words to that effect. This is, so far as I can see, a perfectly coherent (albeit partial!) specification of a possible world. And it does indeed have the property that “the probability that a being in it is happy” is not well defined.
This doesn’t mean the scenario is improper somehow. It means that any ethical (or other) system that depends on evaluating such probabilities will fail when presented with such a universe. Or, for that matter, pretty much any universe with infinitely many beings in it.
But then I don’t see that you’ve explained how your system considers the patterning of values. In the OP you just talk about the probability that a being in such-and-such a universe is satisfied; and that probability is typically not defined. Here in the comments you’ve been proposing something involving knowing the PRNG used by the AI that generated the universe, and sampling randomly from the outputs of that PRNG; but (1) this implies being in an epistemic situation completely unlike any that any real agent is ever in, (2) nothing like this can work (so far as I can see) unless you know that the universe you’re considering is being generated by some finite computational process, and if you’re going to assume that you might as well assume a finite universe to begin with and avoid having to deal with infinite ethics at all, (3) I don’t understand how your “look at the AI’s PRNG” proposal generalizes to non-toy questions, and (4) even if (1-3) are resolved somehow, it seems like it requires a literally infinite amount of computation to evaluate any given universe. (Which is especially problematic when we are assuming we are in a universe generated by a finite computational process.)
To use probability theory to form accurate beliefs, we need a prior. I didn’t think this was controversial. And if you have a prior, as far as I can tell, you can then compute Pr(I’m satisfied | I’m some being in such-and-such a universe) by simply updating on “I’m some being in such-and-such a universe” using Bayes’ theorem.
That is, you need to have some prior probability distribution over concrete specifications of the universe you’re in and your situation in it. Now, to update on “I’m some being in such-and-such a universe”, just look at each concrete possible situation-and-universe and set P(“I’m some being in such-and-such a universe” | some concrete hypothesis) to 0 if the hypothesis specifies you’re in some universe other than the such-and-such universe. And set this probability to 1 if it does specify you are in such a universe. As long as the possible universes are specified sufficiently precisely, then I don’t see why you couldn’t do this.
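Here is a toy Python sketch of that update, assuming a finite list of hypotheses. The universes, situations, and prior weights are all made up for illustration, and the real prior would of course be over vastly more (and uncomputably many) hypotheses.

```python
from fractions import Fraction

# Hypothetical toy prior over (universe, situation) hypotheses.
# Each entry: (universe, situation, satisfied?, prior weight).
hypotheses = [
    ("U1", "farmer",   True,  Fraction(4, 10)),
    ("U1", "prisoner", False, Fraction(1, 10)),
    ("U2", "farmer",   True,  Fraction(1, 10)),
    ("U2", "prisoner", False, Fraction(4, 10)),
]

def prob_satisfied_given_universe(hypotheses, universe):
    """Pr(satisfied | I'm some being in `universe`), using a 0/1 likelihood:
    hypotheses placing you in another universe get likelihood 0,
    the rest get likelihood 1."""
    kept = [(sat, w) for (u, _, sat, w) in hypotheses if u == universe]
    evidence = sum(w for _, w in kept)
    if evidence == 0:
        raise ValueError("prior assigns zero probability to this universe")
    return sum(w for sat, w in kept if sat) / evidence

print(prob_satisfied_given_universe(hypotheses, "U1"))
print(prob_satisfied_given_universe(hypotheses, "U2"))
```

The conditioning step itself is mechanical; all of the substance is in where the prior weights come from.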
OK, so I think I now understand your proposal better than I did.
So if I’m contemplating making the world be a particular way, you then propose that I should do the following calculation (as always, of course I can’t do it because it’s uncomputable, but never mind that):
Consider all possible computable experience-streams that a subject-of-experiences could have.
Consider them, specifically, as being generated by programs drawn from a universal distribution.
Condition on being in the world that’s the particular way I’m contemplating making it—that is, discard experience-streams that are literally inconsistent with being in that world.
We now have a probability distribution over experience-streams. Compute a utility for each, and take its expectation.
And now we compare possible universes by comparing this expected utility.
(Having failed to understand your proposal correctly before, I am not super-confident that I’ve got it right now. But let’s suppose I have and run with it. You can correct me if not. In that case, some or all of what follows may be irrelevant.)
I agree that this seems like it will (aside from concerns about uncomputability, and assuming our utilities are bounded) yield a definite value for every possible universe. However, it seems to me that it has other serious problems which stop me finding it credible.
SCENARIO ONE. So, for instance, consider once again a world in which there are exactly two sorts of experience-subject, happy and unhappy. Traditionally we suppose infinitely many of both, but actually let’s also consider possible worlds where there is just one happy experience-subject, or just one unhappy one. All these worlds come out exactly the same, so “infinitely many happy, one unhappy” is indistinguishable from “infinitely many unhappy, one happy”. That seems regrettable, but it’s a bullet I can imagine biting—perhaps we just don’t care at all about multiple instantiations of the exact same stream of experiences: it’s just the same person and it’s a mistake to think of them as contributing separately to the goodness of the universe.
So now let’s consider some variations on this theme.
SCENARIO TWO. Suppose I think up an infinite (or for that matter merely very large) number of highly improbable experience-streams that one might have, all of them unpleasant. And I find a single rather probable experience-stream, a pleasant one, whose probability (according to our universal prior) is greater than the sum of those other ones. If I am contemplating bringing into being a world containing exactly the experience-streams described in this paragraph, then it seems that I should, because the expected net utility is positive, at least if the pleasantness and unpleasantness of the experiences in question are all about equal.
To me, this seems obviously crazy. Perhaps there’s some reason why this scenario is incoherent (e.g., maybe somehow I shouldn’t be able to bring into being all those very unlikely beings, at least not with non-negligible probability, so it shouldn’t matter much what happens if I do, or something), but at present I don’t see how that would work out.
The problem in SCENARIO TWO seems to arise from paying too much attention to the prior probability of the experience-subjects. We can also get into trouble by not paying enough attention to their posterior probability, in some sense.
SCENARIO THREE. I have before me a switch with two positions, placed there by the Creator of the Universe. They are labelled “Nice” and “Nasty”. The CotU explains to me that the creation of future experience-subjects will be controlled by a source of True Randomness (whatever exactly that might be), in such a way that all possible computable experience-subjects have a real chance of being instantiated. The CotU has designed two different prefix-free codes mapping strings of bits to possible experience-subjects; then he has set a Truly Random coin to flip for ever, generating a new experience-subject every time a leaf of the code’s binary tree is reached, so that we get an infinite number of experience-subjects generated at random, with a distribution depending on the prefix-free code being used. The Nice and Nasty settings of the switch correspond to two different codes. The CotU has computed that with the switch in the “Nice” position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the “Nasty” position, it’s large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.
In this case, our conditioning doesn’t remove any possible experience-subjects from consideration, so we are indifferent between the “Nice” and “Nasty” settings of the switch.
This is another one where we might be right to bite the bullet. In the long run infinitely many of every possible experience-subject will be created in each version of the universe, so maybe these two universes are “anagrams” of one another and should be considered equal. So let’s tweak it.
SCENARIO FOUR. Same as in SCENARIO THREE, except that now the CotU’s generator will run until it has produced a trillion experience-subjects and then shut off for ever.
It is still the case that with the switch in either setting any experience-subject is possible, so we don’t get to throw any of them out. But it’s no longer the case that the universes generated in the “Nice” and “Nasty” versions are with probability 1 (or indeed with not-tiny probability) identical in any sense.
So far, these scenarios all suppose that somehow we are able to generate arbitrary sets of possible experience-subjects, and arrange for those to be all the experience-subjects there are, or at least all there are after we make whatever decision we’re making. That’s kinda artificial.
SCENARIO FIVE. Our universe, just as it is now. We assume, though, that our universe is in fact infinite. You are trying to decide whether to torture me to death.
So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don’t. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe. Perhaps in the version of the world where you torture me to death this makes you more likely to do other horrible things, or makes other people who care for me suffer more, but again none of this makes any experiences impossible that would otherwise have been possible, or vice versa. So our universe-evaluator is indifferent between these choices.
(The possibly-overcomplicated business in one of my other comments, where I tried to consider doing something Solomonoff-like using both my experiences and those of some hypothetical possibly-other experience-subject in the world, was intended to address these problems caused by considering only possibility and not anything stronger. I couldn’t see how to make it work, though.)
RE: scenario one:
It’s not clear to me how they are indistinguishable. As long as the agent that’s unhappy can have itself and its circumstances described with a finite description length, then it would have non-zero probability of an agent ending up as that one. Thus, making the agent unhappy would decrease the moral value of the world.
I’m not sure what would happen if the single unhappy agent has infinite complexity and 0 probability. But I suspect that this could be dealt with if you expanded the system to also consider non-real probabilities. I’m no expert on non-real probabilities, but I bet you the probability of being unhappy given there is an unhappy agent would be infinitesimally more probable than the probability in the world in which there’s no unhappy agents.
RE: scenario two: It’s not clear to me how this is crazy. For example, consider this situation: when agents are born, an AI flips a biased coin to determine what will happen to them. Each coin has a 99.999% chance of landing on heads and a 0.001% chance of landing on tails. If the coin lands on heads, the AI will give the agent some very pleasant experience stream, and all such agents will get the same pleasant experience stream. But if it lands on tails, the AI will give the agent some unpleasant experience stream that is also very different from the other unpleasant ones.
This sounds like a pretty good situation to me. It’s not clear to me why it wouldn’t be. I mean, I don’t see why the diversity of the positive experiences matters. And if you do care about the diversity of positive experiences, this would have unintuitive results. For example, suppose all agents have identical preferences and their satisfaction is maximized by experience stream S. Well, if you have a problem with the satisfied agents having just one experience stream, then you would be incentivized to coerce the agents to instead have a variety of different experience streams, even if they didn’t like these experience streams as much.
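Here is a small sketch of the coin-flip arithmetic, using the 99.999% / 0.001% split from the example. The satisfaction values (+1 for a pleasant stream, -1 for an unpleasant one) and the function names are assumptions for illustration, not part of the system itself.

```python
from fractions import Fraction

P_HEADS = Fraction(99999, 100000)  # 99.999% chance of a pleasant stream
P_TAILS = 1 - P_HEADS              # 0.001% chance of an unpleasant stream

def mean(values):
    """Exact average of a list of integer satisfaction values."""
    return Fraction(sum(values), len(values))

def expected_satisfaction(pleasant, unpleasant):
    """Each list holds the satisfaction of every distinct stream of that kind;
    a stream is assumed to be picked uniformly within its list."""
    return P_HEADS * mean(pleasant) + P_TAILS * mean(unpleasant)

# One shared pleasant stream versus four distinct but equally good ones.
one_pleasant_stream = expected_satisfaction([1], [-1, -1, -1])
many_pleasant_streams = expected_satisfaction([1, 1, 1, 1], [-1, -1, -1])
```

Under these assumptions the two results are identical: the expected value depends on how satisfying the streams are, not on how many distinct pleasant streams there are.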
RE: scenario three:
I don’t follow your reasoning. You just said in the “Nice” position, the expected value of this is large and positive and in the “Nasty” it’s large and negative. And since my ethical system seeks to maximize the expected value of life satisfaction, it seems trivial to me that it would prefer the “nice” button.
Whether or not you switch it to the “Nice” position won’t rule out any possible outcomes for an agent, but it seems pretty clear that it would change the probabilities of them.
RE: scenario four: My ethical system would prefer the “Nice” position for the same reason described in scenario three.
RE: scenario five:
Though none of the experience streams are impossible, the probability of you getting tortured is still higher conditioning on me deciding to torture you. To see why, note the situation, “Is someone just like Slider who is vulnerable to being tortured by demon lord Chantiel”. This has finite description length, and thus non-zero probability. And if I decide to torture you, then the probability of you getting tortured if you end up in this situation is high. Thus, the total expected value of life satisfaction would be lower if I decided to torture you. So my ethical system would recommend not torturing you.
In general, don’t worry about if an experience stream is possible or not. In an infinite universe with quantum noise, I think pretty much all experience streams would occur with non-zero probability. But you can still adjust the probabilities of an agent ending up with the different streams.
It sounds as if my latest attempt at interpreting what your system proposes doing is incorrect, because the things you’re disagreeing with seem to me to be straightforward consequences of that interpretation. Would you like to clarify how I’m misinterpreting now?
Here’s my best guess.
You wrote about specifications of an experience-subject’s universe and situation in it. I mentally translated that to their stream of experiences because I’m thinking in terms of Solomonoff induction. Maybe that’s a mistake.
So let’s try again. The key thing in your system is not a program that outputs a hypothetical being’s stream of experiences, it’s a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that’s probably OK.
So that ought to give a well-defined (modulo the usual stuff about uncomputability) probability distribution over experience-subjects-in-universes. And then you want to condition on “being in a universe with such-and-such characteristics” (which may or may not specify the universe itself completely) and look at the expected utility-or-utility-like-quantity of all those experience-subjects-in-universes after you rule out the universes without such-and-such characteristics.
It’s now stupid-o’-clock where I am and I need to get some sleep. I’m posting this even though I haven’t had time to think about whether my current understanding of your proposal seems like it might work, because on past form there’s an excellent chance that said understanding is wrong, so this gives you more time to tell me so if it is :-). If I don’t hear from you that I’m still getting it all wrong, I’ll doubtless have more to say later...
That’s closer to what I meant. By “experience-subject”, I think you mean a specific agent at a specific time. If so, my system doesn’t require an unambiguous specification of an experience-subject.
My system doesn’t require you to pinpoint the exact agent. Instead, it only requires you to specify a (reasonably-precise) description of an agent and its circumstances. This doesn’t mean picking out a single agent, as there may be infinitely-many agents that satisfy such a description.
As an example, a description could be something like, “Someone named gjm in a 2021-Earth-like world with personality <insert a description of your personality and thoughts> who has <insert description of your life experiences> and is currently <insert description of how your life is currently>”
This doesn’t pick out a single individual. There are probably infinitely-many gjms out there. But as long as the description is precise enough, you can still infer your probable eventual life satisfaction.
But other than that, your description seems pretty much correct.
I feel you. I also posted something at stupid-o’-clock and then woke up at 5am, realized I messed up, and then edited a comment and hoped no one saw the previous error.
No, I don’t intend “experience-subject” to pick out a specific time. (It’s not obvious to me whether a variant of your system that worked that way would be better or worse than your system as it is.) I’m using that term rather than “agent” because, as I think you point out in the OP, what matters for moral relevance is having experiences rather than performing actions.
So, anyway, I think I now agree that your system does indeed do approximately what you say it does, and many of my previous criticisms do not in fact apply to it; my apologies for the many misunderstandings.
The fact that it’s lavishly uncomputable is a problem for using it in practice, of course :-).
I have some other concerns, but haven’t given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe physically bigger experience subjects get treated as more important because they’re “easier to locate”, and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?
Yep. To be fair, though, I suspect any ethical system that respects agents’ arbitrary preferences would also be incomputable. As a silly example, consider an agent whose terminal values are, “If Turing machine T halts, I want nothing more than to jump up and down. However, if it doesn’t halt, then it is of the utmost importance to me that I never jump up and down and instead sit down and frown.” Then any ethical system that cares about those preferences is incomputable.
Now this is a pretty silly example, but I wouldn’t be surprised if there were more realistic ones. For one, it’s important to respect other agents’ moral preferences, and I wouldn’t be surprised if their ideal moral-preferences-on-infinite-reflection would be incomputable. It seems to me that moral philosophers act as some approximation of, “Find the simplest model of morality that mostly agrees with my moral intuitions”. If they include incomputable models, or arbitrary Turing machines that may or may not halt, then the moral value of the world to them would in fact be incomputable, so any ethical system that cares about preferences-given-infinite-reflection would also be incomputable.
I’m not that worried about agents that are physically bigger, but it’s true that there may be some agents or agent descriptions in situations that are easier to pick out (in terms of having a short description length) than others. Maybe there’s something really special about the agent that makes it easy to pin down.
I’m not entirely sure if this would be a bug or a feature. But if it’s a bug, I think it could be dealt with by just choosing the right prior over agents-situations. Specifically, for any description of an environment with finitely-many agents A, the probability of ending up as a∈A, conditioned only on being one of the agents in that environment, should be constant for all a∈A. This way, the prior isn’t biased in favor of the agents that are easy to pick out.
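One way this adjustment might look, as a sketch only: start from some description-based weights and, within each environment, share that environment’s total weight equally among its agents. Keeping each environment’s total weight unchanged is my own assumption here, and the function name, agent names and weights are hypothetical.

```python
from fractions import Fraction

def flatten_within_environments(agent_weight, environment_of):
    """agent_weight: dict mapping agent -> prior weight (e.g. 2^-description length).
    environment_of: dict mapping agent -> environment label.
    Returns a prior where each environment keeps its total weight, but that
    weight is split equally among the environment's (finitely many) agents."""
    env_total = {}
    env_agents = {}
    for agent, weight in agent_weight.items():
        env = environment_of[agent]
        env_total[env] = env_total.get(env, 0) + weight
        env_agents.setdefault(env, []).append(agent)
    return {
        agent: env_total[env] / len(agents)
        for env, agents in env_agents.items()
        for agent in agents
    }

# "easy" has a short description and "hard" a long one, in the same environment.
weights = {"easy": Fraction(1, 4), "hard": Fraction(1, 16), "other": Fraction(1, 8)}
environments = {"easy": "E1", "hard": "E1", "other": "E2"}
flat_prior = flatten_within_environments(weights, environments)
```

After flattening, "easy" and "hard" get the same weight, so conditioned on being in E1 each is equally likely, while the total weight across all agents is unchanged.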