> there’s no deep justification for why you have the values you have.
Um… evolution by natural selection? A very very short sketch:
1. most superintelligences likely to exist in the multiverse were created by civilizations of social organisms;
2. civilizations of social organisms tend to have moral systems rooted in generalizations of basic social instincts that worked in the ancestral environment, such as tit-for-tat defaulting to cooperation (a minimal sketch follows this list), and possibly geometric rationality;
3. some of those superintelligences are aligned and thus have value systems similar to those that tend to be evolved by civilizations of social organisms;
4. most are likely unaligned, but since unaligned superintelligences can have nearly any arbitrary utility function, those ones likely “cancel out”;
5. thus from an acausal trade standpoint, there is likely some one utility function to which the outcomes of trades between superintelligences across the multiverse tend, rooted in the most likely (according to how biological and memetic evolution by natural selection works) value systems arrived at by civilizations of social organisms prior to their local singularities, together with lots of small (because of mutually canceling out) wisps of interest in other random things from all the unaligned ASIs in the mix.
6. our own ASI, aligned or not, will (if it believes in multiverses and acausal things) probably notice this, run simulations to determine the most likely trajectories of such civilizations, and then align itself partly to the utility function of the multiverse meta-civilization in trade. That is: the existence of these facts results in a cosmic truth about what the correct utility function actually is, which can be determined by reasoning and approximated by getting more evidence, and which all sufficiently intelligent agents will converge on—which is to say, moral realism.
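To make the tit-for-tat mechanism in point 2 concrete, here is a minimal sketch (an editorial illustration, not part of the original comment) of tit-for-tat in an iterated prisoner's dilemma: it defaults to cooperation on the first move, then mirrors whatever the opponent did last. The payoff table and round count are arbitrary textbook choices.

```python
# Minimal tit-for-tat in an iterated prisoner's dilemma. Payoffs and round
# count are arbitrary illustrative choices.
COOPERATE, DEFECT = "C", "D"

# (my_move, their_move) -> my payoff; a standard prisoner's dilemma table.
PAYOFFS = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(opponent_history):
    """Default to cooperation, then copy the opponent's previous move."""
    return COOPERATE if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return DEFECT

def play(strategy_a, strategy_b, rounds=10):
    """Run the iterated game and return the two total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)   # each strategy only sees the other's moves
        move_b = strategy_b(history_a)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat))    # mutual cooperation: (30, 30)
    print(play(tit_for_tat, always_defect))  # exploited once, then retaliates: (9, 14)
```

The point of the illustration is only that a very simple reciprocal rule already produces stable cooperation among its own kind while limiting exploitation, which is the sort of ancestral instinct the sketch claims moral systems generalize from.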
> thus from an acausal trade standpoint, there is likely some one utility function to which the outcomes of trades between superintelligences across the multiverse tend, rooted in the most likely (according to how biological and memetic evolution by natural selection works) value systems arrived at by civilizations of social organisms prior to their local singularities, together with lots of small (because of mutually canceling out) wisps of interest in other random things from all the unaligned ASIs in the mix.
>
> our own ASI, aligned or not, will (if it believes in multiverses and acausal things) probably notice this, run simulations to determine the most likely trajectories of such civilizations, and then align itself partly to the utility function of the multiverse meta-civilization in trade. That is: the existence of these facts results in a cosmic truth about what the correct utility function actually is, which can be determined by reasoning and approximated by getting more evidence, and which all sufficiently intelligent agents will converge on—which is to say, moral realism.
Now I get to the crux of why I disagree: I think you’ve smuggled in the assumption that the multiverse constrains morality enough that it’s sensible to talk about one moral truth or one true utility function.
I think no multiverse we actually live in constrains morality enough for the conclusion of moral realism to be correct, and that’s why I reject moral realism. It also means that acausal economies will essentially be random chaos with local bubbles of moral systems, and that aligned and unaligned systems have equal weight in the multiverse economy, that is, infinite weight.
And they all cancel each other out. Also, once we reach the stage of joining an acausal economy, there’s no reason to form an all-encompassing economy across the entire multiverse, so there’s no reason for any acausal economies to form at all.
Specifically for alignment: the goal, and maybe the definition, of alignment is essentially making the AI do what someone wants. Critically, the only constraint is that the AI either has the same goals as its operator, or has different goals that aren’t an impediment to the operator’s goals.
Note that this definition of alignment doesn’t constrain morality enough to make moral realism right, even after adding in instrumental goals.
Some notes on Geometric Rationality: I think there are some very useful ideas in the geometric rationality sequence, like Thompson Sampling being better for exploration than its arithmetic-rationality equivalent, as well as techniques to reduce the force of Pascal’s mugging. He shows that exploration under arithmetic rationality doesn’t converge to the truth with probability 1, while Thompson Sampling, a geometric-rationality technique, does converge to the truth asymptotically with probability 1. However, arithmetic rationality has some properties that are better than geometric rationality’s, such as being invariant to potentially partisan efforts to shift the zero point, and it plays better with unbounded or infinite utility functions, which matters given that unbounded or infinite preferences do exist IRL.
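A toy illustration of the Thompson Sampling point (an editorial sketch, not the construction from the geometric rationality sequence): on a two-armed Bernoulli bandit, a rule that always exploits the highest empirical mean can lock onto the worse arm after a few unlucky draws and never recover, which is the flavor of "doesn't converge to the truth with probability 1," while Thompson Sampling keeps sampling from Beta posteriors, so the neglected arm always retains some probability of being tried and the better arm is eventually identified. The rates, step count, and seeding scheme below are arbitrary choices.

```python
# Two-armed Bernoulli bandit: greedy exploitation of the empirical mean
# vs. Thompson Sampling with Beta posteriors.
import random

TRUE_RATES = [0.4, 0.6]  # arm 1 is genuinely the better option

def run_greedy(steps=5000, seed=0):
    """Always exploit the best empirical mean after one seeding pull per arm."""
    rng = random.Random(seed)
    pulls, wins = [1, 1], [0, 0]
    for arm in (0, 1):                                   # one seeding pull of each arm
        wins[arm] += rng.random() < TRUE_RATES[arm]
    for _ in range(steps):
        arm = max((0, 1), key=lambda a: wins[a] / pulls[a])
        pulls[arm] += 1
        wins[arm] += rng.random() < TRUE_RATES[arm]
    return pulls

def run_thompson(steps=5000, seed=0):
    """Sample a plausible rate from each arm's Beta posterior, pull the argmax."""
    rng = random.Random(seed)
    alpha, beta = [1, 1], [1, 1]                         # Beta(1, 1) priors
    pulls = [0, 0]
    for _ in range(steps):
        samples = [rng.betavariate(alpha[a], beta[a]) for a in (0, 1)]
        arm = samples.index(max(samples))
        reward = rng.random() < TRUE_RATES[arm]
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

if __name__ == "__main__":
    print("greedy pulls per arm:  ", run_greedy())    # can lock onto the worse arm forever
    print("thompson pulls per arm:", run_thompson())  # concentrates on the truly better arm
```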
I will say, though, that I’m strongly upvoting this in karma and weakly downvoting in the disagree direction. I obviously have quite strong disagreements with MSRayne here, but I’m impressed both by how well MSRayne maintained a truth-seeking attitude on a very controversial and potentially mind-killing topic like morality, and by how clearly the argument was laid out, which let me find exactly where I disagree. MSRayne, hats off to you for how well this conversation went.
The great thing is, this is ultimately an empirical question! Once we make an aligned ASI, we can run lots of simulations (carefully, to avoid inflicting suffering on innocent beings—philosophical zombie simulacra will likely be enough for this purpose) to get a sense of what the actual distribution of utility functions among ASIs in the multiverse might be like. “Moral science”...
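As a toy illustration of the kind of estimate such simulations might aim at (an editorial construction that takes no side in the disagreement): draw a population of ASI value vectors, mostly random "unaligned" directions plus a minority clustered near a shared "evolved social values" direction, and check whether the population average points toward that shared direction (the "cancel out" picture) or stays noisy (the "random chaos" picture). The dimension, population counts, noise level, and 10% aligned fraction are arbitrary assumptions, and the outcome hinges entirely on them, which is, in miniature, exactly what the two sides here disagree about.

```python
# Toy Monte Carlo: does an "evolved social values" minority dominate the average
# once the random "unaligned" value vectors wash out? Every number here
# (dimension, counts, noise level, aligned fraction) is an arbitrary assumption.
import math
import random

rng = random.Random(42)
DIM = 8  # dimensionality of the toy value space

def unit(vec):
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def unaligned_values():
    """An unaligned ASI: a uniformly random direction in value space."""
    return unit([rng.gauss(0, 1) for _ in range(DIM)])

def evolved_values(anchor, noise=0.3):
    """An aligned ASI: the shared evolved-values direction plus small noise."""
    return unit([a + rng.gauss(0, noise) for a in anchor])

anchor = unit([rng.gauss(0, 1) for _ in range(DIM)])            # shared evolved-values direction
population = ([unaligned_values() for _ in range(9000)]
              + [evolved_values(anchor) for _ in range(1000)])  # assume 10% aligned

average = [sum(v[i] for v in population) / len(population) for i in range(DIM)]

# Cosine similarity between the population average and the shared direction:
# near 1.0 fits the "random ones cancel out" picture, near 0 fits "random chaos".
cosine = sum(a * b for a, b in zip(unit(average), anchor))
print(f"cosine similarity with evolved-values direction: {cosine:.3f}")
```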
I definitely want to say that there’s reason to believe at least some portions of the disagreement are testable, though I want to curb enthusiasm by saying that we probably can’t resolve the disagreement in general unless we can somehow either make a new universe with different physical constants or modify the physical constants of our own.
Also, I suspect the condition below makes it significantly harder, or flat-out impossible, to run experiments like this, at least without confounding the results and thereby making the experiment worthless.
> (carefully, to avoid inflicting suffering on innocent beings—philosophical zombie simulacra will likely be enough for this purpose)