What do you see as the factors holding back people from cooperating with modern analogues of FAI projects? Do you think those modern analogues could derive improved cooperation by broadcasting a specific enfranchisement policy?
As a practical matter, it looks to me like the majority of wealthy, intelligent, rational modern folks an FAI project might want to cooperate with lean towards egalitarianism and humanism, not blues-versus-greens-type sectarianism.
If you don’t think someone has enough political clout to bother with, they’ll be incentivized to prove you wrong. Even if you’re right most of the time, you’ll be giving yourself trouble.
I agree that very young humans are a potentially difficult gray area. One possible solution is to simulate their growth into adults before computing their CEV. Presumably the age up to which their growth should be simulated is less controversial than the question of who should be included.
FAI team trustworthiness is a different subject than optimal enfranchisement structure.
I’m not sure what those modern analogues are, but in general here are a few factors I see preventing people from cooperating on projects where both mutual cooperation and unilateral cooperation would be beneficial (a toy sketch of this decision calculus follows the list):
Simple error in calculating the expected value of cooperating.
Perceiving more value in obtaining higher status within my group by defending my group’s wrong beliefs about the project’s value than in defecting from my group by cooperating in the project.
Perceiving more value in continuing to defend my previously articulated position against the project (e.g., in being seen as consistent or as capable of discharging earlier commitments) than in changing my position and cooperating in the project.
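A minimal toy sketch of that decision calculus, purely for illustration: the function names and payoff numbers below are hypothetical and not part of the original exchange. The point is only that a miscalibrated estimate of the project’s value, or large perceived status and consistency stakes, can make defection come out ahead even when cooperation would in fact be beneficial.

```python
# Toy model of the three factors listed above; all names and numbers are
# hypothetical, chosen only to illustrate the comparison.

def perceived_value_of_cooperating(project_value_estimate: float,
                                   status_loss_in_group: float,
                                   consistency_cost: float) -> float:
    """Net value the agent perceives in cooperating with the project."""
    # project_value_estimate may itself be a simple miscalculation (first factor).
    return project_value_estimate - status_loss_in_group - consistency_cost


def perceived_value_of_defecting(status_gain_from_defending_group: float) -> float:
    """Value the agent perceives in staying out and defending the group line."""
    return status_gain_from_defending_group


# Status within the group (second factor) and consistency with past positions
# (third factor) enter as perceived costs of cooperating.
cooperate = perceived_value_of_cooperating(project_value_estimate=2.0,
                                           status_loss_in_group=1.5,
                                           consistency_cost=1.0)
defect = perceived_value_of_defecting(status_gain_from_defending_group=1.0)

print("cooperate" if cooperate > defect else "defect")  # prints "defect"
```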
Why do you ask?
I suspect that would be an easier question to answer with anything other than “it depends” if I had a specific example to consider. In general, I expect it depends on who is motivated to support the project now, and to what degree; on the specific enfranchisement policy under discussion; and on what value they perceive in that policy.
Sure, that’s probably true, at least for some values of “lean towards” (there’s a lot to be said here about actual support and signaled support, but I’m not sure it matters). And it will likely remain true for as long as the FAI project in question only cares about the cooperation of wealthy, intelligent, rational modern folks, which they are well advised to continue doing for as long as FAI isn’t a subject of particular interest to anyone else, and to stop doing as soon as possible thereafter.
(shrug) Sure, there’s some nonzero expected cost associated with the brief window between when they start proving their influence and when I concede and include them.
Can you clarify what the relevant difference is between including a too-young person in the target for a CEV-extractor, vs. pointing a growth-simulator at the too-young person and including the resulting simulated person in the target for a CEV-extractor?
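To make the structural question concrete, here is a purely illustrative sketch. extract_cev and simulate_growth are hypothetical placeholders for machinery nobody currently knows how to build, and the target age of 30 is only an example.

```python
# Purely illustrative; extract_cev and simulate_growth are hypothetical
# placeholders, not a proposed design for either component.

def extract_cev(minds):
    """Hypothetical CEV-extractor: maps a set of minds to an optimization target."""
    raise NotImplementedError

def simulate_growth(mind, target_age):
    """Hypothetical growth-simulator: returns the adult this mind would develop into."""
    raise NotImplementedError

# Option 1: include the too-young person directly in the extraction target.
#   target = extract_cev(adults + [three_year_old])
#
# Option 2: run the growth-simulator first and include the simulated adult.
#   target = extract_cev(adults + [simulate_growth(three_year_old, target_age=30)])
```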
I agree with this, certainly.
It was mainly rhetorical; I tend to think that what holds back today’s FAI efforts is a lack of rationality and an inability to take highly abstract arguments seriously.
Potentially bad things that could happen from implementing the CEV of a two-year-old.
I conclude that I do not understand what you think the CEV-extractor is doing.
Humans acquire morality as part of their development. Three-year-olds have a different, more selfish morality than older folks. There’s no reason in principle why a three-year-old who was “more the person he wished he was” would necessarily be a moral adult...
CEV does not mean considering the preferences of an agent who is “more moral”. There is no such thing; morality is not a scalar quantity. I certainly hope the implementation would favor the sort of morals I like strongly enough that calculating the CEV of a three-year-old would yield an output similar to that of an adult, but it seems like a bad idea to count on the implementation being that robust.
Consider the following three target-definitions for a superhuman optimizer:
a) one patterned on the current preferences of a typical three-year-old
b) one patterned on the current preferences of a typical thirty-year-old
c) one that is actually safe to implement (aka “Friendly”)
I understand you to be saying that the gulf between A and C is enormous, and I quite agree. I have not the foggiest beginnings of a clue how one might go about building a system that reliably gets from A to C and am not at all convinced it’s possible.
I would say that the gulf between B and C is similarly enormous, and I’m equally ignorant of how to build a system that spans it. But this whole discussion (and all discussions of CEV-based FAI) presumes that this gulf is spannable in practice. If we can span the B-C gulf, I take that as strong evidence indicating that we can span the A-C gulf.
Put differently: to talk seriously about implementing an FAI based on the CEV of thirty-year-olds, but at the same time dismiss the idea of doing so based on the CEV of three-year-olds, seems roughly analogous to seriously setting out to build a device that lets me teleport from Boston to Denver without occupying the intervening space, but dismissing the idea of building one that goes from Boston to San Francisco as a laughable fantasy because, as everyone knows, San Francisco is further away than Denver.
That’s why I said I don’t understand what you think the extractor is doing. I can see where, if I had a specific theory of how a teleporter operates, I might confidently say that it can span 2k miles but not 3k miles, arbitrary as that sounds in the absence of such a theory. Similarly, if I had a specific theory of how a CEV-extractor operates, I might confidently say it can work safely on a 30-year-old mind but not a 3-year-old. It’s only in the absence of such a theory that such a claim is arbitrary.
It seems likely to me that the CEV of the 30-year-old would be Friendly and the CEV of the three-year-old would not be, but as you say, at this point it’s hard to say much for sure.
(nods) That follows from what you’ve said earlier.
I suspect we have very different understandings of how similar the 30-year-old’s desires are to their volition.
Perhaps one way of getting at that difference is thus: how likely do you consider it that the CEV of a 30-year-old would be something that, if expressed in a form that 30-year-old can understand (say, for example, the opportunity to visit a simulated world for a year that is constrained by that CEV), would be relatively unsurprising to that 30-year-old… something that would elicit “Oh, cool, yeah, this is more or less what I had in mind” rather than “Holy Fucking Mother of God what kind of an insane world IS this?!?”?
For my own part, I consider the latter orders of magnitude more likely.
I’m pretty uncertain.