Ah. That’s not at all the understanding of “utility” I’ve seen used elsewhere on this site, so I appreciate the clarification, if not its tone.
So, OK. Given that understanding, “I’d prefer the average of all human utility function over my maximized utility function even if it means I have less utility.” means that xxd would prefer (on average, everyone’s needs are met) over (xxd’s needs are maximally met). And you’re asking whether I’d prefer that xxd’s preferences be implemented, or those of “an AI who is seeking resources to further it’s own goals at the expense of everyone else” which you’re calling “Transhuman AI PhilGoetz”… yes? (I will abbreviate that T-AI-PG hereafter)
The honest answer is I can’t make that determination until I have some idea what having everyone’s needs met actually looks like, and some idea of what T-AI-PG’s goals look like. If T-AI-PG’s goals happen to include making life awesomely wonderful for me and everyone I care about, and xxd’s understanding of “everyone’s needs” leaves me and everyone I care about worse off than that, then I’d prefer that T-AI-PG’s preferences be implemented.
That said, I suspect that you’re taking it for granted that T-AI-PG’s goals don’t include that, and also that xxd’s understanding of “everyone’s needs” really and truly makes everything best for everyone, and probably consider it churlish and sophistic of me to imply otherwise.
So, OK: sure, if I make those further assumptions, I’d much rather have xxd’s preferences implemented than T-AI-PG’s preferences. Of course.
Your version is exactly the same as Phil’s, except that you’ve enlarged it to include maximizing the utility of yourself and everyone you care about, rather than maximizing the utility of humanity as a whole.
When (if) we actually do get an FAI, it is going to be very interesting to see how this resolves, given that even those of us thinking about it ahead of time can’t agree on the goals defining what an FAI should actually shoot for.
I do not understand what your first sentence means.
As for your second sentence: stating what it is we value, even as individuals (let alone collectively), in a sufficiently clear and operationalizable form that it could actually be implemented, and in a sufficiently consistent form that we would want it implemented, is an extremely difficult problem. I have yet to see anyone come close to solving it; in my experience the world divides neatly into people who don’t think about it at all, people who think they’ve solved it and are wrong, and people who know they haven’t solved it.
If some entity (an FAI or whatever) somehow successfully implemented a collective solution, it would be far more than interesting; it would fundamentally and irrevocably change the world.
I infer from my reading of your tone that you disagree with me here; the impression I get is that you consider the fact that we haven’t agreed on a solution to demonstrate our inadequacies as problem solvers, even by human standards, but that you’re too polite to say so explicitly. Am I wrong?
We actually agree on the difficulty of the problem. I think it’s very difficult to state what it is that we want AND that if we did so we’d find that individual utility functions contradict each other.
Moreover, I’m saying that an AI maximizing Phil Goetz’s utility function, or yours and that of everybody you love (or even my own selfish desires and wants plus those of everyone I love), COULD in effect be an unfriendly AI, because MANY others would have theirs minimized.
So I’m saying that I think a friendly AI has to have its goals defined as:
Choice A. the maximum number of people have their utility functions improved (rather than maximized), even if some minimal number of people have their utility functions worsened
as opposed to
Choice B. a small number of people have their utility functions maximized while a large number of people have their utility functions decreased (or zeroed out).
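To make the contrast concrete, here’s a toy sketch with made-up numbers (the ten-person population, the starting utilities, and the fixed “benefit budget” are all hypothetical, purely for illustration):

```python
# Toy illustration only: ten people, each starting at a utility of 10,
# and a fixed benefit budget of 100 units to hand out.
baseline = [10.0] * 10

# Choice A: spread the budget evenly, so everyone improves a little.
choice_a = [u + 100 / 10 for u in baseline]                       # everyone ends at 20

# Choice B: maximize two people, partly at the other eight's expense.
taken = 5.0                                                       # taken from each of the 8 losers
choice_b = [u + 100 / 2 + taken * 8 / 2 for u in baseline[:2]]    # two winners end at 80
choice_b += [u - taken for u in baseline[2:]]                     # eight losers end at 5

print(sum(choice_a) / 10, min(choice_a))   # 20.0 20.0
print(sum(choice_b) / 10, min(choice_b))   # 20.0 5.0
```

In this toy case the average comes out identical either way, but under Choice A everybody is better off than they started, while under Choice B most people are worse off.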
As a side note: I find it amusing that it’s so difficult to even understand each other’s basic axioms, never mind agree on the details of what maximizing the utility function for all of us as a whole means.
To be clear: I don’t know what the details are of maximizing the utility function for all of humanity. I just think that a fair maximization of the utility function for everyone has an interesting corollary: in order to maximize the function for everyone, some will have their individual utility functions decreased, unless we accept a much narrower definition of friendly meaning “friendly to me”, in which case, as far as I’m concerned, it no longer means friendly.
The logical tautology here is, of course, that those who consider “friendly to me” to be the only possible definition of friendly would consider an AI that maximized the average utility function of humanity, while they themselves lost out, to be an UNfriendly AI.
Couple of things:
If you want to facilitate communication, I recommend that you stop using the word “friendly” in this context on this site. There’s a lot of talk on this site of “Friendly AI”, by which is meant something relatively specific. You are using “friendly” in the more general sense implied by the English word. This is likely to cause rather a lot of confusion.
You’re right that if strategy 1 optimizes for good stuff happening to everyone I care about, and strategy 2 optimizes for good stuff happening to everyone whether I care about them or not, then strategy 1 will (if done sufficiently powerfully) result in people I don’t care about having good stuff taken away from them, and strategy 2 will result in everyone I care about getting less good stuff than strategy 1 will.
You seem to be saying that I therefore ought to prefer that strategy 2 be implemented, rather than strategy 1. Is that right?
You seem to be saying that you yourself prefer that strategy 2 be implemented, rather than strategy 1. Is that right?
Fair enough. I will read the wiki.
Yes
Not saying anything about your preferences.
Nope, I’m saying strategy 2 is better for humanity. Of course, personally I’d prefer strategy 1, but I’m honest enough with myself to know that certain individuals would find their utility functions severely degraded if I had an all-powerful AI working for me, and if I don’t trust myself to be in charge, then I don’t trust any other human, unless it’s someone like Gandhi.