Serious question:
Is this addressed to the coherent extrapolated volition of humankind, as expressed by SIAI? I’m under the impression it is not.
As far as I can tell, it’s literally impossible for me to prefer an AI that would implement CEV<humanity> over one that would implement CEV<me>—if what I want is actually CEV<humanity>, then the AI will figure this out while extrapolating my volition and implement that. On the other hand, it’s clearly possible for me to prefer CEV<me> to CEV<humanity>.
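To make that asymmetry concrete, here is a minimal toy sketch. It assumes, purely for illustration, that extrapolation is a black box mapping a set of minds to the utility function those minds would endorse on reflection, and that the AI then optimizes that function exactly; the names `extrapolate`, `outcome`, `WORLD_STATES` and all the numbers are invented, not anything from the CEV document.

```python
# Toy model only: "extrapolation" is treated as a black box mapping a set of
# minds to the utility function those minds would endorse on reflection, and
# the AI is assumed to optimize that function exactly.
from typing import Callable, FrozenSet

Utility = Callable[[str], float]

WORLD_STATES = ["status_quo", "my_paradise", "broad_compromise"]

def extrapolate(minds: FrozenSet[str]) -> Utility:
    """Stand-in for CEV<minds>. If idealized-me would care about everyone
    else, that caring is already folded into the CEV<me> branch."""
    if minds == frozenset({"me"}):
        return lambda w: {"status_quo": 0, "my_paradise": 10, "broad_compromise": 7}[w]
    return lambda w: {"status_quo": 0, "my_paradise": 2, "broad_compromise": 9}[w]

def outcome(target: FrozenSet[str]) -> str:
    """The world the AI produces when pointed at `target`."""
    utility = extrapolate(target)
    return max(WORLD_STATES, key=utility)

# Judge both outcomes by MY extrapolated utility. The AI pointed at CEV<me>
# maximizes exactly that function, so no other target can score higher in
# my own extrapolated terms, while the reverse preference is easy to exhibit.
my_utility = extrapolate(frozenset({"me"}))
print(my_utility(outcome(frozenset({"me"}))))              # 10
print(my_utility(outcome(frozenset({"me", "everyone"}))))  # 7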
How likely do you consider it for CEV<you> to be the first superintelligent AI to be created, compared to CEV<humanity>?
Unless you’re a top AI researcher working solo to create your own AI, you may have to support CEV<humanity> as the best compromise possible under the circumstances. It’ll probably be far closer to CEV<you> than CEV<Putin> or CEV<Ahmadinejad> would be.
However, CEV<$randomAIresearcher> is probably even closer to mine than CEV<humanity> is… CEV<humanity> is likely to be very, very far from the preferences of most decent people...
A far more likely compromise would be CEV<the project’s contributors>.
The people who get to choose the utility function of the first AI have the option of ignoring the desires of the rest of humanity. I think they are likely to do so, because:
They know each other, and so can predict each other’s CEV better than that of the whole of humanity
They can explicitly trade utility with each other and encode compromises into the utility function (so that it won’t be a pure CEV)
The fact they were in this project together indicates a certain commonality of interests and ideas, and may serve to exclude memes that AI-builders would likely consider dangerous (e.g., fundamentalist religion)
They have had the opportunity of excluding people they don’t like from participating in the project to begin with
Also, Putin and Ahmadinejad are much more likely than the average human to influence the first AI’s utility function, simply because they have a lot of money and power.
I disagree with all four of these claims.
I believe the idea is that the AI will need to calculate the CEV, not the programmers (or it’s not CEV). And the AI will have a whole lot more statistical data to calculate the CEV of humanity than the CEV of individual contributors. Unless we’re talking uploaded personalities, which is a whole different discussion.
So you want hard-coded compromises that oppose and override what these people would collectively prefer to do if they were more intelligent, more competent and more self-aware?
I don’t think that’s a good idea at all.
Do you believe that fundamentalist religion would exist if fundamentalist religionists believed that their religion was false, and were also completely self-aware? Why do you think a CEV (which essentially means what people would want if they were as intelligent as the AI) would support a dangerous meme?
I don’t think that the 9999 first contributors get to vote on whether they’ll accept a donation from the 10,000th one. And unless you believe these 10,000 people can create and defend their own country BEFORE the AI gets created, I’d urge not being vocal about them excluding everyone else, when developments in AI become close enough that the whole world starts paying serious attention.
That’s why CEV<humanity> is far better than CEV<the contributors>.
The programmers want the AI to calculate CEV because they expect CEV to be something they will like. We can’t calculate CEV ourselves, but that doesn’t mean we don’t know any of CEV’s (expected) properties.
However, we might be wrong about what CEV<humanity> will turn out to be like, and we may come to regret pre-committing to it. That’s why I think we should prefer CEV<the contributors>, because we can predict it better.
What I meant was that they might oppose and override some of the input to the CEV from the rest of humanity.
However, it might also be a good idea to override some of your own CEV results, because we don’t know in advance what the CEV will be. We define the desired result as “the best possible extrapolation”, but our implementation may produce something different. It’s very dangerous to precommit the whole future universe to something you don’t yet know at the moment of precommitment (my point number 1). So, you’d want to include overrides about things you’re certain should not be in the CEV.
This is a misleading question.
If you are certain that the CEV will decide against fundamentalist religion, you should not oppose precommitting the AI to oppose fundamentalist religion, because you’re certain this won’t change the outcome. If you don’t want to include this modification to the AI, that means you 1) accept there is a possibility of religion being part of the CEV, and 2) want to precommit to living with that religion if it is part of the CEV.
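A tiny worked version of that argument, with invented numbers and a hypothetical `expected_value` helper; this is my simplification, under the assumption (made in the comment above) that the override itself is free and only matters in worlds where X actually shows up in the CEV.

```python
# Invented numbers; the hypothetical expected_value helper assumes the
# override itself costs nothing and simply replaces any X-containing outcome
# with the non-X outcome.
def expected_value(p_x_in_cev: float, value_if_x: float, value_if_not_x: float,
                   override_against_x: bool) -> float:
    if override_against_x:
        return value_if_not_x
    return p_x_in_cev * value_if_x + (1 - p_x_in_cev) * value_if_not_x

# If you are certain X cannot appear in the CEV, the override is a no-op:
print(expected_value(0.0, -100, 10, override_against_x=False))  # 10.0
print(expected_value(0.0, -100, 10, override_against_x=True))   # 10

# Objecting to the override only matters if you assign X some probability:
print(expected_value(0.05, -100, 10, override_against_x=False))  # 4.5
```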
Maybe intelligent people like dangerous memes. I don’t know, because I’m not yet that intelligent. I do know though that having high intelligence doesn’t imply anything about goals or morals.
Broadly, this question is similar to “why do you think this brilliant AI-genie might misinterpret our request to alleviate world hunger?”
Why not? If they’re controlling the project at that point, they can make that decision.
I’m not being vocal about any actual group I may know of that is working on AI :-)
I might still want to be vocal about my approach, and might want any competing groups to adopt it. I don’t have good probability estimates on this, but it might be the case that I would prefer CEV<a competing group’s contributors> to CEV<humanity>.
Why are you certain of this? At the very least it depends on who the person contributing money is.
“Humanity” includes a huge variety of different people. Depending on how the CEV is specified, it may also include an even wider variety of people who lived in the past and counterfactual people who might live in the future. And the CEV, as far as I know, is vastly underspecified right now—we don’t even have a good conceptual test that would tell us if a given scenario is a probable outcome of CEV, let alone a generative way to calculate that outcome.
Saying that the CEV “will best please everyone” is just handwaving this aside. Precommitting the whole future lightcone to the result of a process we don’t know in advance is very dangerous, and very scary. It might be the best possible compromise between all humans, but it is not the case that all humans have equal input into the behavior of the first AI. I have not seen any good arguments claiming that implementing CEV<humanity> is a better strategy than just trying to build the first AI before anyone else and then making it implement a narrower CEV.
Suppose that the first AI is fully general, and can do anything you ask of it. What reason is there for its builders, whoever they are, to ask it to implement CEV<humanity> rather than CEV<themselves>?
In an idealized form, I agree with you.
That is, if I really take the CEV idea seriously as proposed, there simply is no way I can prefer CEV(me + X) to CEV(me)… if it turns out that I would, if I knew enough and thought about it carefully enough and “grew” enough and etc., care about other people’s preferences (either in and of themselves, as in “I hadn’t thought of that but now that you point it out I want that too”, or by reference to their owners, as in “I don’t care about that but if you do then fine let’s have that too,” for which distinction I bet there’s a philosophical term of art that I don’t know), then the CEV-extraction process will go ahead and optimize for those preferences as well, even if I don’t actually know what they are, or currently care about them; even if I currently think they are a horrible evil bad no-good idea. (I might be horrified by that result, but presumably I should endorse it anyway.)
This works precisely because the CEV-extraction process as defined depends on an enormous amount of currently-unavailable data in the course of working out the target’s “volition” given its current desires, including entirely counterfactual data about what the target would want if exposed to various idealized and underspecified learning/”growing” environments.
That said, the minute we start talking instead about some actual realizable thing in the world, some approximation of CEV-me computable by a not-yet-godlike intelligence, it stops being quite so clear that all of the above is true.
An approximate-CEV extractor might find things in your brain that I would endorse if I knew about them (given sufficient time and opportunity to discuss it with you and “grow” and so forth) but that it wasn’t able to actually compute based on just my brain as a target, in which case pointing it at both of us might be better (in my own terms!) than pointing it at just me.
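A minimal sketch of that point, under assumptions that are mine rather than the commenter’s: suppose each preference component has a per-brain “legibility”, and a bounded extractor only recovers components that are legible in at least one brain it is pointed at. The names (`approximate_cev`, `MY_ENDORSED_WEIGHTS`, `LEGIBILITY`) and the weights below are invented for illustration.

```python
# Invented toy model: each preference component has a per-brain "legibility",
# and a bounded extractor only recovers components legible in at least one
# brain it is pointed at. Components idealized-me would endorse count toward
# my score even if my own brain doesn't make them legible.
MY_ENDORSED_WEIGHTS = {
    "no_suffering": 5.0,
    "exploration": 3.0,
    "your_pet_cause": 2.0,   # I'd endorse this once you explained it to me
}

LEGIBILITY = {
    "no_suffering": {"me"},
    "exploration": {"me"},
    "your_pet_cause": {"you"},   # only computable from your brain
}

def approximate_cev(targets: set) -> dict:
    """Recover only the components legible in at least one target brain."""
    return {k: w for k, w in MY_ENDORSED_WEIGHTS.items() if LEGIBILITY[k] & targets}

def score_for_me(extracted: dict) -> float:
    """How much of idealized-me's utility the extracted function captures."""
    return sum(MY_ENDORSED_WEIGHTS[k] for k in extracted)

print(score_for_me(approximate_cev({"me"})))         # 8.0
print(score_for_me(approximate_cev({"me", "you"})))  # 10.0, better in my own terms
```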
It comes down to a question of how much we trust the seed AI that’s doing the extraction to actually solve the problem.
It’s also perhaps worth asking what happens if I build the CEV-extracting seed AI and point it at my target community and it comes back with “I don’t have enough capability to compute CEV for that community. I will have to increase my capabilities in order to solve that problem.”
Yes. The CEV really could suck. There isn’t a good reason to assume that particular preference system is a good one.
How about CEV<intelligent people>?
Yes, that would be preferable. But only because I assert a correlation between the attributes that produce what we measure as g and personality traits and actual underlying preferences. A superintelligence extrapolating on <intelligent people>’s preferences would, in fact, produce a different outcome than one extrapolating on <humanity>.
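A toy numeric illustration of that correlation claim; the 0.4 coefficient, the g > 1 cutoff, and the use of a group mean as a stand-in for “what extrapolation of that group converges on” are all invented assumptions.

```python
# Invented toy illustration: g correlates with an underlying preference, and
# the group mean stands in for "what extrapolation of that group converges on".
import random
random.seed(0)

population = []
for _ in range(100_000):
    g = random.gauss(0, 1)
    pref = 0.4 * g + random.gauss(0, 1)   # partly correlated, partly independent
    population.append((g, pref))

def aggregate_extrapolated_preference(people):
    return sum(p for _, p in people) / len(people)

high_g = [person for person in population if person[0] > 1.0]  # roughly the top 16%

print(round(aggregate_extrapolated_preference(population), 3))  # close to 0.0
print(round(aggregate_extrapolated_preference(high_g), 3))      # around 0.6, shifted
```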
ArisKatsaris’s accusation that you don’t understand what CEV means misses the mark. You can understand CEV and still not conclude that CEV<humanity> is necessarily a good thing.
And, uh, how do you define that?
Something like g, perhaps?
What would that accomplish? It’s the intelligence of the AI that will be getting used, not the intelligence of the people in question.
I’m getting the impression that some people don’t understand what CEV even means. It’s not about the programmers predicting a course of action, it’s not about the AI using people’s current choice, it’s about the AI using the extrapolated volition—what people would choose if they were as smart and knowledgeable as the AI.
A good one according to which criteria? CEV<humanity> is perfect according to humankind’s criteria if humankind were more intelligent and more sane than it currently is.
Mine. (This is tautological.) Anything else that is kind of similar to mine would be acceptable.
Which is fine if ‘sane’ is defined as ‘more like what I would consider sane’. But that’s because ‘sane’ has all sorts of loaded connotations with respect to actual preferences—and “humanity’s” may very well not qualify as not-insane.