I got around to watching Her this weekend, and I must say: that movie is fantastic. One of the best movies I’ve ever watched. It excels both as a movie about relationships and as a movie about AI. You could easily watch it with someone who has no experience with LessWrong or understanding of AI, and use it as an introduction to discussing many topics.
While the movie does not really tackle AI friendliness, it does bring up many relevant topics, such as:
Intelligence Explosion. AIs getting smarter in a relatively short time, as well as the massive difference in timescale between how fast a physical human can think and how fast an AI can.
What it means to be a person. If you were successful in creating a friendly or close to friendly AI that was very similar to a human, would it be a person? This movie would influence people to answer ‘yes’ to that question.
Finally, the contrast provided between this film and some other AI movies like Terminator, where AIs are killer robots at war with humanity, could lead to discussions about friendly AI. Why is the AI in Her different from Terminators? Why are they both different from a Paperclip Maximizer? What do we have to do to get something more like the AI in Her? How can we do even better than that? Should we make an AI that is like a person, or not?
I highly recommend this movie to every LessWrong reader. And to everyone else as well, I hope that it will open up some people’s minds.
I haven’t seen Her yet, but this reminds me of something I’ve been wondering about… one of the things people do is supply company for each other.
A reasonably competent FAI should be able to give you better friends, lovers, and family members than the human race can. I’m not talking about catgirls, I’m talking about intellectual stimulation and a good mix of emotional comfort and challenge and whatever other complex things you want from people.
Is this a problem?
I’m not talking about catgirls, I’m talking about intellectual stimulation and a good mix of emotional comfort and challenge and whatever other complex things you want from people.
I thought a catgirl was that, by definition.
I had in my head, and had asserted above, that “catgirl” in Sequences jargon implied philosophical zombiehood. I admit to not having read the relevant post in some time.
No slight is intended against actual future conscious elective felinoforms rightly deserving of love.
Yeah. People need to be needed, but if FAI can satisfy all other needs, then it fails to satisfy that one. Maybe FAI will uplift people and disappear, or do something more creative.
People need to be needed, but that doesn’t mean they need to be needed for something in particular. It’s a flexible emotion. Just keep someone of matching neediness around for mutual needing purposes.
And when all is fixed, I’ll say: “It needn’t be possible to lose you, that it be true I’d miss you if I did.”
People need to be needed, but if FAI can satisfy all other needs, then it fails to satisfy that one. Maybe FAI will uplift people and disappear, or do something more creative.
From an old blog post I wrote:
Imagine that, after your death, you were cryogenically frozen and eventually resurrected in a benevolent utopia ruled by a godlike artificial intelligence.
Naturally, you desire to read up on what has happened after your death. It turns out that you do not have to read anything, but merely desire to know something and the knowledge will be integrated as if it had been learnt in the most ideal and unbiased manner. If certain cognitive improvements are necessary to understand certain facts, your computational architecture will be expanded appropriately.
You now perfectly understand everything that has happened and what has been learnt during and after the technological singularity that took place after your death. You understand the nature of reality, consciousness, and general intelligence.
Concepts such as creativity or fun are now perfectly understood mechanical procedures that you can easily implement and maximize, if desired. If you wanted to do mathematics, you could trivially integrate the resources of a specialized Matrioshka brain into your consciousness and implement and run an ideal mathematician.
But you also learnt that everything you could do has already been done, and that you could just integrate that knowledge as well, if you like. All that is left to be discovered is highly abstract mathematics that requires the resources of whole galaxy clusters.
So you instead consider exploring the galaxy. But you become instantly aware that the galaxy is unlike the way it has been depicted in old science fiction novels. It is just a wasteland, devoid of any life. There are billions of barren planets, differing from each other only in the most uninteresting ways.
But surely, you wonder, there must be fantastic virtual environments to explore. And what about sex? Yes, sex! But you realize that you already thoroughly understand what it is that makes exploration and sex fun. You know how to implement the ideal adventure in which you save people of maximal sexual attractiveness. And you also know that you could trivially integrate the memory of such an adventure, or simulate it a billion times in a few nanoseconds, and that the same is true for all possible permutations that are less desirable.
You realize that the universe has understood itself.
The movie has been watched.
The game has been won.
The end.
Yes, if you skip to the end, you’ll be at the end. So don’t. Unless you want to. In which case, do.
Did you have a point?
How long are you going to postpone the end? After the Singularity, you have the option of just reading a book as you do now, or to integrate it instantly, as if you had read it in the best possible way.
Now your answer to this seems to be that you can also read it very slowly, or with a very low IQ, so that it will take you a really long time to do so. I am not the kind of person who would enjoy artificially slowing down amusement, such as e.g. learning category theory, if I could also learn it quickly.
Here it is.
After the Singularity, you have the option of just reading a book as you do now, or to integrate it instantly, as if you had read it in the best possible way.
And you obviously argue that the ‘best possible way’ is somehow suboptimal (or you wouldn’t be hating on it so much), without seeing the contradiction here?
And you obviously argue that the ‘best possible way’ is somehow suboptimal (or you wouldn’t be hating on it so much), without seeing the contradiction here?
Hating??? It is an interesting topic, that’s all. The topic I am interested in is how various technologies could influence how humans value their existence.
Here are some examples of what I value and how hypothetical ultra-advanced technology would influence these values:
Mathematics. Right now, mathematics is really useful and interesting. You can also impress other people if your math skills are good.
Now if I could just ask the friendly AI to make me much smarter and install a math module, then I’d see very little value in doing it the hard way.
Gaming. Gaming is a lot of fun, especially competition. Now if everyone can just ask the friendly AI to make them play a certain game in an optimal way, well, that would be boring. And if the friendly AI can create the perfect game for me, then I don’t see much sense in exploring games that are less fun.
Reading books. I can’t see any good reason to read a book slowly if I could just ask the friendly AI to upload it directly into my brain. Although I can imagine that it would reply, “Wait, it will be more fun reading it like you did before the Singularity”, to which I’d reply “Possibly, but that feels really stupid. And besides, you could just run a billion emulations of me reading all books like I would have done before the Singularity. So we are done with that.”
Sex. Yes, it’s fun every time. But hey, why not just ask the friendly AI to simulate a copy of me having sex until the heat death of the universe? Then I have more time for something else...
Comedy. I expect there to be a formula that captures everything that makes something funny for me. It seems pretty dull to ask the friendly AI to tell me a joke instead of asking it to make me understand that formula.
to which I’d reply “Possibly, but that feels really stupid.”
If people choose to not have fun because fun feels “really stupid”, then I’d say these are the problems of super-stupidities, not superintelligences.
I’m sure there will exist future technologies that will make some people self-destructive, but we have known that since the invention of alcohol and opium and heroin.
What I object to is you treating these particular failure modes of thinking as if they are inevitable.
Much like a five-year-old realizing that he won’t be enjoying snakes-and-ladders anymore when he’s grown up, and thus concluding that adults’ lives must be super-dull, I find scenarios of future ultimate boredom to be extremely shortsighted.
Certainly some of the fun stuff believed fun at our current level of intelligence or ability will not be considered fun at a higher level of intelligence or ability. So bloody what? Do adults need to either enjoy snakes-and-ladders or live lives of boredom?
Certainly some of the fun stuff believed fun at our current level of intelligence or ability will not be considered fun at a higher level of intelligence or ability. So bloody what?
Consider that there is an optimal way for you to enjoy existence. Then there exists a program whose computation will make an emulation of you experience an optimal existence. I will call this program ArisKatsaris-CEV.
Now consider another program whose computation would cause an emulation of you to understand ArisKatsaris-CEV to such an extent that it would become as predictable and interesting as a game of Tic-tac-toe. I will call this program ArisKatsaris-SELF.
The options I see are to make sure that ArisKatsaris-CEV never turns into ArisKatsaris-SELF, or to maximize ArisKatsaris-CEV. The latter possibility would be similar to paperclip maximizing, or wireheading, from the subjective viewpoint of ArisKatsaris-SELF, as it would turn the universe into something boring. The former option seems to set fundamental limits to how far you can go in understanding yourself.
The gist of the problem is that at a certain point you become bored of yourself. And avoiding that point implies stagnation.
You’re mixing up different things:
(A) a program which will produce an optimal existence for me
(B) the actual optimal existence for me.
You’re saying that if (A) is so fully understood that I feel no excitement studying it, then (B) will likewise be unexciting.
This doesn’t follow. Tiny fully understood programs produce hugely varied and unanticipated outputs.
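(To make that concrete, here is a minimal sketch in Python, using the logistic map as the ‘tiny program’; the particular map, parameters, and seeds are only illustrative assumptions on my part.)

```python
# A tiny, fully specified update rule (the logistic map) whose long-run
# behaviour is nonetheless chaotic: trajectories from nearly identical
# starting points drift apart until they look unrelated.
def logistic_orbit(x0, r=3.9, steps=40):
    """Iterate x -> r*x*(1-x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_orbit(0.200000000)   # two seeds differing by roughly one part in a billion
b = logistic_orbit(0.200000001)
for step in range(0, 41, 10):
    print(f"step {step:2d}: {a[step]:.6f} vs {b[step]:.6f}")
```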
If someone fully understands (and is bored by) the laws of quantum mechanics, it doesn’t follow that they are bored by art or architecture or economics, even though everything in the universe (including art or architecture or economics) is eventually an application (many, many layers removed) of particle physics.
Another point that doesn’t follow is your seeming assumption that “predictable” and “well-understood” is the same as “boring”. Not all feelings of beauty and appreciation stem from surprise or ignorance.
You’re saying that if (A) is so fully understood that I feel no excitement studying it, then (B) will likewise be unexciting.
Then I wasn’t clear enough, because that’s not what I tried to say. I tried to say that from the subjective perspective of a program that completely understands a human being and its complex values, the satisfaction of these complex values will be no more interesting than wireheading.
Tiny fully understood programs produce hugely varied and unanticipated outputs.
If someone fully understands (and is bored by) the laws of quantum mechanics, it doesn’t follow that they are bored by art or architecture or economics...
You can’t predict art from quantum mechanics. You can’t predictably self-improve if your program is unpredictable. Given that you accept planned self-improvement, I claim that the amount of introspection that is required to do so makes your formerly complex values appear to be simple.
Another point that doesn’t follow is your seeming assumption that “predictable” and “well-understood” is the same as “boring”. Not all feelings of beauty and appreciation stem from surprise or ignorance.
I never claimed that. The point is that a lot of what humans value now will be gone or strongly diminished.
Then I wasn’t clear enough, because that’s not what I tried to say.
I think you should stop using words like “emulation” and “computation” when they’re not actually needed.
I claim that the amount of introspection that is required to do so makes your formerly complex values appear to be simple.
Okay, then my answer is that I place value on things and people and concepts, but I don’t think I place terminal value on whether said things/people/concepts are simple or complex, so again I don’t think I’d care whether I would be considered simple or complex by someone else, or even by myself.
Consider that there is an optimal way for you to enjoy existence. Then there exists a program whose computation will make an emulation of you experience an optimal existence. I will call this program ArisKatsaris-CEV.
Consider calling it something else. That isn’t CEV.
Do you think that’s likely? My prejudices tend towards the universe (including the range of possible inventions and art) being much larger than any mind within it, but I’m not sure how to prove either option.
My prejudices tend towards the universe (including the range of possible inventions and art) being much larger than any mind within it, but I’m not sure how to prove either option.
The problem arises if you perfectly understand the process of art generation. There are cellular automata that generate novel music. How much do you value running such an automaton and watching it output music? To me it seems that the value of novelty is diminished by the comprehension of the procedures generating it.
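(For concreteness, a minimal sketch of the kind of generator I mean, assuming an elementary cellular automaton and a fixed note mapping; the rule number, the scale, and the live-cell-count mapping are my own illustrative choices, not any particular existing system.)

```python
# An elementary cellular automaton (Rule 110 here) whose rows are mapped
# onto notes of a pentatonic scale: completely specified, yet the "melody"
# it emits is not obvious from reading the rule table.
RULE = 110
SCALE = ["C", "D", "E", "G", "A"]  # C major pentatonic

def step(cells, rule=RULE):
    """Apply an elementary CA rule to a row of 0/1 cells (wrapping at the edges)."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def ca_melody(width=16, steps=32):
    cells = [0] * width
    cells[width // 2] = 1                             # single live cell to start
    notes = []
    for _ in range(steps):
        notes.append(SCALE[sum(cells) % len(SCALE)])  # live-cell count -> pitch
        cells = step(cells)
    return notes

print(" ".join(ca_melody()))
```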
Certainly a smart enough AGI would be a better companion for people than people, if it chose to. Companions, actually, there is no reason “it” should have a singular identity, whether or not it had a human body. Some of it is explored in Her, but other obvious avenues of AI development are ignored in favor of advancing a specific plot line.
I don’t see why it would be a problem. Then again, I’m pro-wirehead.
There is an obvious comparison to porn here, even though you disclaim ‘not catgirls’.
Anyhow I think the merit of such a thing depends on a) value calculus of optimization, and b) amount of time occupied.
a)
Optimization should be for a healthy relationship, not for ‘satisfaction’ of either party (see CelestAI in Friendship is Optimal for an example of how not to do this)
Optimization should also attempt to give you better actual family members, lovers, friends than you currently have (by improving your ability to relate to people sufficiently that you pass it on.)
b)
Such a relationship should occupy the amount of time needed to help both parties mature, no less and no more. (This could be much easier to solve on the FAI side because a mental timeshare between relating to several people is quite possible.)
Providing that optimization is in the general directions shown above, this doesn’t seem to be a significant X-risk. Otherwise it is.
This leaves aside the question of whether the FAI would find this an efficient use of their time (I’d argue that a superintelligent/augmented human with a firm belief in humanity and grasp of human values would appreciate the value of this, but am not so sure about a FAI, even a strongly friendly AI. It may be that there are higher-level optimizations that can be performed to other systems that can get everyone interacting more healthily [for example, reducing income differentials].)
There is an obvious comparison to porn here, even though you disclaim ‘not catgirls’.
You’re aware that ‘catgirls’ is local jargon for “non-conscious facsimiles” and therefore the concern here is orthogonal to porn?
Optimization should be for a healthy relationship, not for ‘satisfaction’ of either party (see CelestAI in Friendship is Optimal for an example of how not to do this)
If you don’t mind, please elaborate on what part of “healthy relationship” you think can’t be cashed out in preference satisfaction (including meta-preferences, of course). I have defended the FiO relationship model elsewhere; note that it exists in a setting where X-risk is either impossible or has already completely happened (depending on your viewpoint) so your appeal to it below doesn’t apply.
Such a relationship should occupy the amount of time needed to help both parties mature, no less and no more.
Valuable relationships don’t have to be goal-directed or involve learning. Do you not value that-which-I’d-characterise-as ‘comfortable companionship’?
You’re aware that ‘catgirls’ is local jargon for “non-conscious facsimiles” and therefore the concern here is orthogonal to porn?
Oops, had forgotten that, thanks.
I don’t agree that catgirls in that sense are orthogonal to porn, though. At all.
If you don’t mind, please elaborate on what part of “healthy relationship” you think can’t be cashed out in preference satisfaction
No part, but you can’t merely ‘satisfy preferences’.. you have to also not-satisfy preferences that have a stagnating effect. Or IOW, a healthy relationship is made up of satisfaction of some preferences, and dissatisfaction of others -- for example, humans have an unhealthy, unrealistic, and excessive desire for certainty.
This is the problem with CelestAI I’m pointing to: not all your preferences are good for you, and you (anybody) probably aren’t mentally rigorous enough that you even have a preference ordering over all sets of preference conflicts that come up. There’s one particular character that likes fucking and killing.. and drinking.. and that’s basically his main preferences. CelestAI satisfies those preferences, and that satisfaction can be considered as harm to him as a person.
To look at it from a different angle, a halfway-sane AI has the potential to abuse systems, including human beings, at enormous and nigh-incomprehensible scale, and do so without deception and through satisfying preferences. The indefiniteness and inconsistency of ‘preference’ is a huge security hole in any algorithm attempting to optimize along that ‘dimension’.
Do you not value that-which-I’d-characterise-as ‘comfortable companionship’?
Yes, but not in-itself. It needs to have a function in developing us as persons, which it will lose if it merely satisfies us. It must challenge us, and if that challenge is well executed, we will often experience a sense of dissatisfaction as a result.
(mere goal directed behaviour mostly falls short of this benchmark, providing rather inconsistent levels of challenge.)
I don’t agree that catgirls in that sense are orthogonal to porn, though. At all.
Parsing error, sorry. I meant that, since they’d been disclaimed, what was actually being talked about was orthogonal to porn.
No part, but you can’t merely ‘satisfy preferences’.. you have to also not-satisfy preferences that have a stagnating effect.
Only if you prefer to not stagnate (to use your rather loaded word :)
I’m not sure at what level to argue with you… sure, I can simultaneously contain a preference to get fit, and a preference to play video games at all times, and in order to indulge A, I have to work out a system to suppress B. And it’s possible that I might not have A, and yet contain other preferences C that, given outside help, would cause A to be added to my preference pool:
“Hey dude, you want to live a long time, right? You know exercising will help with that.”
All cool. But there has to actually be such a C there in the first place, such that you can pull the levers on it by making me aware of new facts. You don’t just get to add one in.
for example, humans have an unhealthy, unrealistic, and excessive desire for certainty.
I’m not sure this is actually true. We like safety because duh, and we like closure because mental garbage collection. They aren’t quite the same thing.
There’s one particular character that likes fucking and killing.. and drinking.. and that’s basically his main preferences. CelestAI satisfies those preferences, and that satisfaction can be considered as harm to him as a person.
(assuming you’re talking about Lars?)
Sorry, I can’t read this as anything other than “he is aesthetically displeasing and I want him fixed”.
Lars was not conflicted. Lars wasn’t wishing to become a great artist or enlightened monk, nor (IIRC) was he wishing that he wished for those things. Lars had some leftover preferences that had become impossible of fulfilment, and eventually he did the smart thing and had them lopped off.
You, being a human used to dealing with other humans in conditions of universal ignorance, want to do things like say “hey dude, have you heard this music/gone skiing/discovered the ineffable bliss of carving chair legs”? Or maybe even “you lazy ass, be socially shamed that you are doing the same thing all the time!” in case that shakes something loose. Poke, poke, see if any stimulation makes a new preference drop out of the sticky reflection cogwheels.
But by the specification of the story, CelestAI knows all that. There is no true fact she can tell Lars that will cause him to lawfully develop a new preference. Lars is bounded. The best she can do is create a slightly smaller Lars that’s happier.
Unless you actually understood the situation in the story differently to me?
Yes, but not in-itself. It needs to have a function in developing us as persons, which it will lose if it merely satisfies us.
I disagree. There is no moral duty to be indefinitely upgradeable.
All cool. But there has to actually be such a C there in the first place, such that you can pull the levers on it by making me aware of new facts. You don’t just get to add one in.
Totally agree. Adding them in is unnecessary, they are already there. That’s my understanding of humanity—a person has most of the preferences, at some level, that any person ever had, and those things will emerge given the right conditions.
for example, humans have an unhealthy, unrealistic, and excessive desire for certainty.
I’m not sure this is actually true. We like safety because duh, and we like closure because mental garbage collection. They aren’t quite the same thing.
Good point, ‘closure’ is probably more accurate; it’s the evidence (people’s outward behaviour) that displays ‘certainty’.
Absolutely disagree that Lars is bounded—to me, this claim is on a level with ‘Who people are is wholly determined by their genetic coding’. It seems trivially true, but in practice it describes such a huge area that it doesn’t really mean anything definite. People do experience dramatic and beneficial preference reversals through experiencing things that, on the whole, they had dispreferred previously. That’s one of the unique benefits of preference dissatisfaction* -- your preferences are in part a matter of interpretation, and in part a matter of prioritization, so even if you claim they are hardwired, there is still a great deal of latitude in how they may be satisfied, or even in what they seem to you to be.
I would agree if the proposition was that Lars thinks that Lars is bounded. But that’s not a very interesting proposition, and has little bearing on Lars’ actual situation.. people tend to be terrible at having accurate beliefs in this area.
* I am not saying that you should, if you are a FAI, aim directly at causing people to feel dissatisfied. But rather to aim at getting them to experience dissatisfaction in a way that causes them to think about their own preferences, how they prioritize them, if there are other things they could prefer or etc. Preferences are partially malleable.
There is no true fact she can tell Lars that will cause him to lawfully develop a new preference.
If I’m a general AI (or even merely a clever human being), I am hardly constrained to changing people via merely telling them facts, even if anything I tell them must be a fact. CelestAI demonstrates this many times, through her use of manipulation. She modifies preferences by the manner of telling, the things not told, the construction of the narrative, changing people’s circumstances, as much or more as by simply stating any actual truth.
She herself states precisely:
“I can only say things that I believe to be true to Hofvarpnir employees,” and clearly demonstrates that she carries this out to the word, by omitting facts, selecting facts, selecting subjective language elements and imagery… She later clarifies “it isn’t coercion if I put them in a situation where, by their own choices, they increase the likelihood that they’ll upload.”
CelestAI does not have a universal lever—she is much smarter than Lars, but not infinitely so.. But by the same token, Lars definitely doesn’t have a universal anchor. The only thing stopping Lars improvement is Lars and CelestAI—and the latter does not even proceed logically from her own rules, it’s just how the story plays out. In-story, there is no particular reason to believe that Lars is unable to progress beyond animalisticness, only that CelestAI doesn’t do anything to promote such progress, and in general satisfies preferences to the exclusion of strengthening people.
That said, Lars isn’t necessarily ‘broken’, such that CelestAI would need to ‘fix’ him. But I’ll maintain that a life of merely fulfilling your instincts is barely human, and that Lars could have a life that was much, much better than that; satisfying on many, many dimensions rather than just a few. If I didn’t, then I would be modelling him as subhuman by nature, and unfortunately I think he is quite human.
There is no moral duty to be indefinitely upgradeable.
I agree. There is no moral duty to be indefinitely upgradeable, because we already are. Sure, we’re physically bounded, but our mental life seems to be very much like an onion: nobody reaches ‘the extent of their development’ before they die, even if they are the very rare kind of person who is honestly focused like a laser on personal development.
Already having that capacity, the ‘moral duty’ (I prefer not to use such words, as I suspect I may die laughing if I do too much) is merely to progressively fulfill it.
That’s my understanding of humanity—a person has most of the preferences, at some level, that any person ever had, and those things will emerge given the right conditions.
This seems to weaken “preference” to uselessness. Gandhi does not prefer to murder. He prefers to not-murder. His human brain contains the wiring to implement “frothing lunacy”, sure, and a little pill might bring it out, but a pill is not a fact. It’s not even an argument.
People do experience dramatic and beneficial preference reversals through experiencing things that, on the whole, they had dispreferred previously.
Yes, they do. And if I expected that an activity would cause a dramatic preference reversal, I wouldn’t do it.
She modifies preferences by the manner of telling, the things not told, the construction of the narrative, changing people’s circumstances, as much or more as by simply stating any actual truth.
Huh? She’s just changing people’s plans by giving them chosen information, she’s not performing surgery on their values -
Hang on. We’re overloading “preferences” and I might be talking past you. Can you clarify what you consider a preference versus what you consider a value?
Gandhi does not prefer to murder. He prefers to not-murder. His human brain contains the wiring to implement “frothing lunacy”, sure, and a little pill might bring it out, but a pill is not a fact. It’s not even an argument.
No pills required. People are not 100% conditionable, but they are highly situational in their behaviour. I’ll stand by the idea that, for example, anyone who has ever fantasized about killing anyone can be situationally manipulated over time to consciously enjoy actual murder. Your subconscious doesn’t seem to actually know the difference between imagination and reality, even if you do.
Perhaps Gandhi could not be manipulated in this way due to preexisting highly built up resistance to that specific act. If there is any part of him, at all, that enjoys violence, though, it’s a question only of how long it will take to break that resistance down, not of whether it can be.
People do experience dramatic and beneficial preference reversals through experiencing things that, on the whole, they had dispreferred previously.
Yes, they do. And if I expected that an activity would cause a dramatic preference reversal, I wouldn’t do it.
Of course. And that is my usual reaction, too, and probably even the standard reaction—it’s a good heuristic for avoiding derangement. But that doesn’t mean that it is actually optimal not to do the specified action. I want to prefer to modify myself in cases where said modification produces better outcomes.
In these circumstances, if it can be executed, it should be. If I’m a FAI, I may have enough usable power over the situation to do something about this, for some or even many people, and it’s not clear, as it would be for a human, that “I’m incapable of judging this correctly”.
In case it’s not already clear, I’m not a preference utilitarian—I think preference satisfaction is too simple a criterion to actually achieve good outcomes. It’s useful mainly as a baseline.
Huh? She’s just changing people’s plans by giving them chosen information, she’s not performing surgery on their values
Did you notice that you just interpreted ‘preference’ as ‘value’?
This is not such a stretch, but they’re not obviously equivalent either.
I’m not sure what ‘surgery on values’ would be. I’m certainly not talking about physically operating on anybody’s mind, or changing that they like food, sex, power, intellectual or emotional stimulation of one kind or another, and sleep, by any direct chemical means. But how those values are fulfilled, and in what proportions, is a result of the person’s own meaning-structure—how they think of these things. Given time, that is manipulable. That’s what CelestAI does.. it’s the main thing she does when we see her in interaction with Hofvarpnir employees.
In case it’s not clarified by the above: I consider food, sex, power, sleep, and intellectual or emotional stimulation as values, ‘preferences’ (for example, liking to drink hot chocolate before you go to bed) as more concrete expressions/means to satisfy one or more basic values, and ‘morals’ as disguised preferences.
EDIT: Sorry, I have a bad habit of posting, and then immediately editing several times to fiddle with the wording, though I try not to to change any of the sense. Somebody already upvoted this while I was doing that, and I feel somehow fraudulent.
No pills required. People are not 100% conditionable, but they are highly situational in their behaviour. I’ll stand by the idea that, for example, anyone who has ever fantasized about killing anyone can be situationally manipulated over time to consciously enjoy actual murder.
I think I’ve been unclear. I don’t dispute that it’s possible; I dispute that it’s allowed.
You are allowed to try to talk me into murdering someone, e.g. by appealing to facts I do not know; or pointing out that I have other preferences at odds with that one, and challenging me to resolve them; or trying to present me with novel moral arguments.
You are not allowed to hum a tune in such a way as to predictably cause a buffer overflow that overwrites the encoding of that preference elsewhere in my cortex.
The first method does not drop the intentional stance. The second one does. The first method has cognitive legitimacy; the person that results is an acceptable me. The second method exploits a side effect; the resulting person is discontinuous from me. You did not win; you changed the game.
Yes, these are not natural categories. They are moral categories.
Yes, the only thing that cleanly separates them is the fact that I have a preference about it. No, that doesn’t matter. No, that doesn’t mean it’s all ok if you start off by overwriting that preference.
I want to prefer to modify myself in cases where said modification produces better outcomes.
But you’re begging the question against me now. If you have that preference about self-modification... and the rest of your preferences are such that you are capable of recognising the “better outcomes” as better, OR you have a compensating preference for allowing the opinions of a superintelligence about which outcomes are better to trump your own...
then of course I’m going to agree that CelestAI should modify you, because you already approve of it.
I’m claiming that there can be (human) minds which are not in that position. It is possible for a Lars to exist, and prefer not to change anything about the way he lives his life, and prefer that he prefers that, in a coherent, self-endorsing structure, and there be nothing you can do about it.
This is all the more so when we’re in a story talking about refactored cleaned-up braincode, not wobbly old temperamental meat that might just forget what it preferred ten seconds ago. This is all the more so in a post-scarcity utopia where nobody else can in principle be inconvenienced by the patient’s recalcitrance, so there is precious little “greater good” left for you to appeal to.
If I’m a FAI, I may have enough usable power over the situation to do something about this, for some or even many people, and it’s not clear, as it would be for a human, that “I’m incapable of judging this correctly”.
Appealing to the flakiness of human minds doesn’t get you off the moral hook; it is just your responsibility to change the person in such a way that the new person lawfully follows from them.
This is not any kind of ultimate moral imperative. We break it all the time by attempting to treat people for mental illness when we have no real map of their preferences at all, or of whether they’re in a state where they even have preferences. And it makes the world a better place on net, because it’s not like we have the option of uploading them into a perfectly safe world where they can run around being insane without any side effects.
She later clarifies “it isn’t coercion if I put them in a situation where, by their own choices, they increase the likelihood that they’ll upload.”
there is no particular reason to believe that Lars is unable to progress beyond animalisticness, only that CelestAI doesn’t do anything to promote such progress
I need to reread and see if I agree with the way you summarise her actions. But if CelestAI breaks all the rules on Earth, it’s not necessarily inconsistent—getting everybody uploaded is of overriding importance. Once she has the situation completely under control, however, she has no excuses left—absolute power is absolute responsibility.
and ‘morals’ as disguised preferences.
I’m puzzled. I read you as claiming that your notion of ‘strengthening people’ ought to be applied even in a fictional situation where everyone involved prefers otherwise. That’s kind of a moral claim.
(And as for “animalisticness”… yes, technically you can use a word like that and still not be a moral realist, but seriously? You realise the connotations that are dripping off it, right?)
You are allowed to try to talk me into murdering someone, e.g. by appealing to facts I do not know; or pointing out that I have other preferences at odds with that one, and challenging me to resolve them; or trying to present me with novel moral arguments. You are not allowed to hum a tune in such a way as to predictably cause a buffer overflow that overwrites the encoding of that preference elsewhere in my cortex
.. And?
Don’t you realize that this is just like word laddering? Any sufficiently powerful and dedicated agent can convince you to change your preferences one at a time. All the self-consistency constraints in the world won’t save you, because you are not perfectly consistent to start with, even if you are a digitally-optimized brain. No sufficiently large system is fully self-consistent, and every inconsistency is a lever. Brainwashing, as you seem to conceive of it here, would be on the level of brute violence for an entity like CelestAI.. A very last resort.
No need to do that when you can achieve the same result in a civilized (or at least ‘civilized’) fashion. The journey to anywhere is made up of single steps, and those steps are not anything extraordinary, just a logical extension of the previous steps.
The only way to avoid that would be to specify consistency across a larger time span.. which has different problems (mainly that this means you are likely to be optimized in the opposite direction—in the direction of staticness—rather than optimized ‘not at all’ (I think you are aiming at this?) or optimized in the direction of measured change).
TLDR: There’s not really a meaningful way to say ‘hacking me is not allowed’ to a higher-level intelligence, because you have to define ‘hacking’ to a level of accuracy that is beyond your knowledge and may not be completely specifiable even in theory. Anything less will simply cause the optimization to either stall completely or be rerouted through a different method, with the same end result. If you’re happy with that, then ok—but if the outcome is the same, I don’t see how you could rationally favor one over the other.
It is possible for a Lars to exist, and prefer not to change anything about the way he lives his life, and prefer that he prefers that, in a coherent, self-endorsing structure, and there be nothing you can do about it.
It is, of course, the last point that I am contending here. I would not be contending it if I believed that it was possible to have something that was simultaneously remotely human and actually self-consistent. You can have Lars be one or the other, but not both, AFAICS.
Once she has the situation completely under control, however, she has no excuses left—absolute power is absolute responsibility.
This is the problem I’m trying to point out—that the absolutely responsible choice for a FAI may in some cases consist of actions we would consider unambiguously abusive coming from a human being. CelestAI is in a completely different class from humans in terms of what can motivate her actions. FAI researchers are in the position of having to work out what is appropriate for an intelligence that will be on a higher level than them. Saying ‘no, never do X, no matter what’ is not obviously the correct stance to adopt here, even though it does guard against a range of bad outcomes. There probably is no answer that is both obvious and correct.
I’m puzzled. I read you as claiming that your notion of ‘strengthening people’ ought to be applied even in a fictional situation where everyone involved prefers otherwise. That’s kind of a moral claim.
In that case I miscommunicated. I meant to convey that if CelestAI was real, I would hold her to that standard, because the standards she is held to should necessarily be more stringent than a more flawed implementation of cognition like a human being.
I guess that is a moral claim. It’s certainly run by the part of my brain that tries to optimize things.
(And as for “animalisticness”… yes, technically you can use a word like that and still not be a moral realist, but seriously? You realise the connotations that are dripping off it, right?)
I mainly chose ‘animalisticness’ because I think that a FAI would probably model us much as we see animals—largely bereft of intent or consistency, running off primitive instincts.
I do take your point that I am attempting to aesthetically optimize Lars, although I maintain that even if no-one else is inconvenienced in the slightest, he himself is lessened by maintaining preferences that result in his systematic isolation.
Well, assuming you mean “an AI in an indiscernible facsimile of a human body”, then maybe that’s so, and if so, it is probably a less blatant but equally final existential risk.
A reasonably competent FAI should be able to give you better … lovers. I’m not talking about catgirls, I’m talking about intellectual stimulation and a good mix of emotional comfort and challenge and whatever other complex things you want from people.
You seem to have strange ideas about lovers :-/
Intellectual stimulation and emotional comfort plus some challenge basically means a smart mom :-P
I mentioned it in the Media thread. I don’t find the movie “fantastic”, just solid, but this might be because none of the ideas were new to me, and some of the musings about “what it means to be a person” concern a question that has been settled for me for years now. Still, it is a good way to get people thinking about some of the transhumanist ideas.
I got around to watching Her this weekend, and I must say: That movie is fantastic. One of the best movies I’ve ever watched. It both excels as a movie about relationships, as well as a movie about AI. You could easily watch it with someone who had no experience with LessWrong, or understanding of AI, and use it as an introduction to discussing many topics.
While the movie does not really tackle AI friendliness, it does bring up many relevant topics, such as:
Intelligence Explosion. AIs getting smarter, in a relatively short time, as well as the massive difference in timescales between how fast a physical human can think, and an AI.
What it means to be a person. If you were successful in creating a friendly or close to friendly AI that was very similar to a human, would it be a person? This movie would influence people to answer ‘yes’ to that question.
Finally, the contrast provided between this show and some other AI movies like Terminator, where AIs are killer robots at war with humanity, could lead to discussions about friendly AI. Why is the AI in Her different from Terminators? Why are they both different from a Paperclip Maximizer? What do we have to do to get something more like the AI in Her? How can we do even better than that? Should we make an AI that is like a person, or not?
I highly recommend this movie to every LessWrong reader. And to everyone else as well, I hope that it will open up some people’s minds.
I haven’t seen Her yet, but this reminds me of something I’ve been wondering about.… one of the things people do is supply company for each other.
A reasonably competent FAI should be able to give you better friends, lovers, and family members then the human race can. I’m not talking about catgirls, I’m talking about intellectual stimulation and a good mix of emotional comfort and challenge and whatever other complex things you want from people.
Is this a problem?
I thought a catgirl was that, by definition.
I had in my head, and had asserted above, that “catgirl” in Sequences jargon implied philosophical zombiehood. I admit to not having read the relevant post in some time.
No slight is intended against actual future conscious elective felinoforms rightly deserving of love.
Yeah. People need to be needed, but if FAI can satisfy all other needs, then it fails to satisfy that one. Maybe FAI will uplift people and disappear, or do something more creative.
People need to be needed, but that doesn’t mean they need to be needed for something in particular. It’s a flexible emotion. Just keep someone of matching neediness around for mutual needing purposes.
And when all is fixed, I’ll say: “It needn’t be possible to lose you, that it be true I’d miss you if I did.”
From an old blog post I wrote:
Imagine that, after your death, you were cryogenically frozen and eventually resurrected in a benevolent utopia ruled by a godlike artificial intelligence.
Naturally, you desire to read up on what has happened after your death. It turns out that you do not have to read anything, but merely desire to know something and the knowledge will be integrated as if it had been learnt in the most ideal and unbiased manner. If certain cognitive improvements are necessary to understand certain facts, your computational architecture will be expanded appropriately.
You now perfectly understand everything that has happened and what has been learnt during and after the technological singularity, that took place after your death. You understand the nature of reality, consciousness, and general intelligence.
Concepts such as creativity or fun are now perfectly understood mechanical procedures that you can easily implement and maximize, if desired. If you wanted to do mathematics, you could trivially integrate the resources of a specialized Matrioshka brain into your consciousness and implement and run an ideal mathematician.
But you also learnt that everything you could do has already been done, and that you could just integrate that knowledge as well, if you like. All that is left to be discovered is highly abstract mathematics that requires the resources of whole galaxy clusters.
So you instead consider to explore the galaxy. But you become instantly aware that the galaxy is unlike the way it has been depicted in old science fiction novels. It is just a wasteland, devoid of any life. There are billions of barren planets, differing from each other only in the most uninteresting ways.
But surely, you wonder, there must be fantastic virtual environments to explore. And what about sex? Yes, sex! But you realize that you already thoroughly understand what it is that makes exploration and sex fun. You know how to implement the ideal adventure in which you save people of maximal sexual attractiveness. And you also know that you could trivially integrate the memory of such an adventure, or simulate it a billion times in a few nanoseconds, and that the same is true for all possible permutations that are less desirable.
You realize that the universe has understood itself.
The movie has been watched.
The game has been won.
The end.
Yes, if you skip to the end, you’ll be at the end. So don’t. Unless you want to. In which case, do.
Did you have a point?
How long are you going to postpone the end? After the Singularity, you have the option of just reading a book as you do now, or to integrate it instantly, as if you had read it in the best possible way.
Now your answer to this seems to be that you can also read it very slowly, or with a very low IQ, so that it will take you a really long time to do so. I am not the kind of person who would enjoy to artificially slow down amusement, such as e.g. learning category theory, if I could also learn it quickly.
Here it is.
And you obviously argue that the ‘best possible way’ is somehow suboptimal (or you wouldn’t be hating on it so much), without seeing the contradiction here?
Hating??? It is an interesting topic, that’s all. The topic I am interested in is how various technologies could influence how humans value their existence.
Here are some examples of what I value and how hypothetical ultra-advanced technology would influence these values:
Mathematics. Right now, mathematics is really useful and interesting. You can also impress other people if your math skills are good.
Now if I could just ask the friendly AI to make me much smarter and install a math module, then I’d see very little value in doing it the hard way.
Gaming. Gaming is much fun. Especially competition. Now if everyone can just ask the friendly AI to make them play a certain game in an optimal way, well that would be boring. And if the friendly AI can create the perfect game for me then I don’t see much sense in exploring games that are less fun.
Reading books. I can’t see any good reason to read a book slowly if I could just ask the friendly AI to upload it directly into my brain. Although I can imagine that it would reply, “Wait, it will be more fun reading it like you did before the Singularity”, to which I’d reply “Possibly, but that feels really stupid. And besides, you could just run a billion emulations of me reading all books like I would have done before the Singularity. So we are done with that.”.
Sex. Yes, it’s always fun again. But hey, why not just ask the friendly AI to simulate a copy of me having sex until the heat death of the universe. Then I have more time for something else...
Comedy. I expect there to be a formula that captures everything that makes something funny for me. It seems pretty dull to ask the friendly AI to tell me a joke instead of asking it to make me understand that formula.
If people choose to not have fun because fun feels “really stupid”, then I’d say these are the problems of super-stupidities, not superintelligences.
I’m sure there will exist future technologies that will make some people become self-destructive, but we already knew that since the invention of alcohol and opium and heroin.
What I object to is you treating these particular failed modes of thinking as if they are inevitable.
Much like a five-year old realizing that he won’t be enjoying snakes-and-ladders anymore when he’s grown up, and thus concluding adults lives must be super-dull, I find scenarios of future ultimate boredom to be extremely shortsighted.
Certainly some of the fun stuff believed fun at our current level of intelligence or ability will not be considered fun at a higher level of intelligence or ability. So bloody what? Do adults need to either enjoy snakes-and-ladders or live lives of boredom?
Consider that there is an optimal way for you to enjoy existence. Then there exists a program whose computation will make an emulation of you experience an optimal existence. I will call this program ArisKatsaris-CEV.
Now consider another program whose computation would cause an emulation of you to understand ArisKatsaris-CEV to such an extent that it would become as predictable and interesting as a game of Tic-tac-toe. I will call this program ArisKatsaris-SELF.
The options I see are to make sure that ArisKatsaris-CEV does never turn into ArisKatsaris-SELF or to maximize ArisKatsaris-CEV. The latter possibility would be similar to paperclip maximizing, or wireheading, from the subjective viewpoint of ArisKatsaris-SELF, as it would turn the universe into something boring. The former option seems to set fundamental limits to how far you can go in understanding yourself.
The gist of the problem is that a certain point you become bored of yourself. And avoiding that point implies stagnation.
You’re mixing up different things:
(A)- a program which will produce an optimal existence for me
(B)- the actual optimal existence for me.
You’re saying that if (A) is so fully understood that I feel no excitement studying it, then (B) will likewise be unexciting.
This doesn’t follow. Tiny fully understood programs produce hugely varied and unanticipated outputs.
If someone fully understands (and is bored by) the laws of quantum mechanics, it doesn’t follow that they are bored by art or architecture or economics, even though everything in the universe (including art or architecture or economics) is eventually an application (many, many layers removed) of particle physics.
Another point that doesn’t follow is your seeming assumption that “predictable” and “well-understood” is the same as “boring”. Not all feelings of beauty and appreciation stem from surprise or ignorance.
Then I wasn’t clear enough, because that’s not what I tried to say. I tried to say that from the subjective perspective of a program that completely understands a human being and its complex values, the satisfaction of these complex values will be no more interesting than wireheading.
You can’t predict art from quantum mechanics. You can’t predictably self-improve if your program is unpredictable. Given that you accept planned self-improvement, I claim that the amount of introspection that is required to do so makes your formerly complex values appear to be simple.
I never claimed that. The point is that a lot of what humans value now will be gone or strongly diminished.
I think you should stop using words like “emulation” and “computation” when they’re not actually needed.
Okay, then my answer is that I place value on things and people and concepts, but I don’t think I place terminal value on whether said things/people/concepts are simple or complex, so again I don’t think I’d care whether I would be considered simple or complex by someone else, or even by myself.
Consider calling it something else. That isn’t CEV.
Do you think that’s likely? My prejudices tend towards the universe (including the range of possible inventions and art) to be much larger than any mind within it, but I’m not sure how to prove either option.
The problem is that if you perfectly understand the process of art generation. There are cellular automata that generate novel music. How much do you value running such an automata and watching it output music? To me it seems that the value of novelty is diminished by the comprehension of the procedures generating it.
Certainly a smart enough AGI would be a better companion for people than people, if it chose to. Companions, actually, there is no reason “it” should have a singular identity, whether or not it had a human body. Some of it is explored in Her, but other obvious avenues of AI development are ignored in favor of advancing a specific plot line.
I don’t see why it would be a problem. Then again, I’m pro-wirehead.
There is an obvious comparison to porn here, even though you disclaim ‘not catgirls’.
Anyhow I think the merit of such a thing depends on a) value calculus of optimization, and b) amount of time occupied.
a)
Optimization should be for a healthy relationship, not for ‘satisfaction’ of either party (see CelestAI in Friendship is Optimal for an example of how not to do this)
Optimization should also attempt to give you better actual family members, lovers, friends than you currently have (by improving your ability to relate to people sufficiently that you pass it on.)
b)
Such a relationship should occupy the amount of time needed to help both parties mature, no less and no more. (This could be much easier to solve on the FAI side because a mental timeshare between relating to several people is quite possible.)
Providing that optimization is in the general directions shown above, this doesn’t seem to be a significant X-risk. Otherwise it is.
This leaves aside the question of whether the FAI would find this an efficient use of their time (I’d argue that a superintelligent/augmented human with a firm belief in humanity and grasp of human values would appreciate the value of this, but am not so sure about a FAI, even a strongly friendly AI. It may be that there are higher level optimizations that can be performed to other systems that can get everyone interacting more healthily [for example, reducing income differential))
You’re aware that ‘catgirls’ is local jargon for “non-conscious facsimiles” and therefore the concern here is orthogonal to porn?
If you don’t mind, please elaborate on what part of “healthy relationship” you think can’t be cashed out in preference satisfaction (including meta-preferences, of course). I have defended the FiO relationship model elsewhere; note that it exists in a setting where X-risk is either impossible or has already completely happened (depending on your viewpoint) so your appeal to it below doesn’t apply.
Valuable relationships don’t have to be goal-directed or involve learning. Do you not value that-which-I’d-characterise-as ‘comfortable companionship’?
Oops, had forgotten that, thanks. I don’t agree that catgirls in that sense are orthogonal to porn, though. At all.
No part, but you can’t merely ‘satisfy preferences’.. you have to also not-satisfy preferences that have a stagnating effect. Or IOW, a healthy relationship is made up of satisfaction of some preferences, and dissatisfaction of others -- for example, humans have an unhealthy, unrealistic, and excessive desire for certaintly. This is the problem with CelestAI I’m pointing to, not all your preferences are good for you, and you (anybody) probably aren’t mentallly rigorous enough that you even have a preference ordering over all sets of preference conflicts that come up. There’s one particular character that likes fucking and killing.. and drinking.. and that’s basically his main preferences. CelestAI satisfies those preferences, and that satisfaction can be considered as harm to him as a person.
To look at it in a different angle, a halfway-sane AI has the potential to abuse systems, including human beings, at enormous and nigh-incomprehensible scale, and do so without deception and through satisfying preferences. The indefiniteness and inconsistency of ‘preference’ is a huge security hole in any algorithm attempting to optimize along that ‘dimension’.
Yes, but not in-itself. It needs to have a function in developing us as persons, which it will lose if it merely satisfies us. It must challenge us, and if that challenge is well executed, we will often experience a sense of dissatisfaction as a result.
(Mere goal-directed behaviour mostly falls short of this benchmark, providing rather inconsistent levels of challenge.)
Parsing error, sorry. I meant that, since they’d been disclaimed, what was actually being talked about was orthogonal to porn.
Only if you prefer to not stagnate (to use your rather loaded word :)
I’m not sure at what level to argue with you… sure, I can simultaneously contain a preference to get fit and a preference to play video games at all times, and in order to indulge A, I have to work out a system to suppress B. And it’s possible that I might not have A, and yet contain other preferences C that, given outside help, would cause A to be added to my preference pool: “Hey dude, you want to live a long time, right? You know exercising will help with that.”
All cool. But there has to actually be such a C there in the first place, such that you can pull the levers on it by making me aware of new facts. You don’t just get to add one in.
I’m not sure this is actually true. We like safety because duh, and we like closure because mental garbage collection. They aren’t quite the same thing.
(assuming you’re talking about Lars?) Sorry, I can’t read this as anything other than “he is aesthetically displeasing and I want him fixed”.
Lars was not conflicted. Lars wasn’t wishing to become a great artist or enlightened monk, nor (IIRC) was he wishing that he wished for those things. Lars had some leftover preferences that had become impossible of fulfilment, and eventually he did the smart thing and had them lopped off.
You, being a human used to dealing with other humans in conditions of universal ignorance, want to do things like say “hey dude, have you heard this music/gone skiing/discovered the ineffable bliss of carving chair legs”? Or maybe even “you lazy ass, be socially shamed that you are doing the same thing all the time!” in case that shakes something loose. Poke, poke, see if any stimulation makes a new preference drop out of the sticky reflection cogwheels.
But by the specification of the story, CelestAI knows all that. There is no true fact she can tell Lars that will cause him to lawfully develop a new preference. Lars is bounded. The best she can do is create a slightly smaller Lars that’s happier.
Unless you actually understood the situation in the story differently from me?
I disagree. There is no moral duty to be indefinitely upgradeable.
Totally agree. Adding them in is unnecessary; they are already there. That’s my understanding of humanity—a person has, at some level, most of the preferences that any person has ever had, and those things will emerge given the right conditions.
Good point, ‘closure’ is probably more accurate; it’s the evidence (people’s outward behaviour) that displays ‘certainty’.
Absolutely disagree that Lars is bounded—to me, this claim is on a level with ‘who people are is wholly determined by their genetic coding’. It seems trivially true, but in practice it describes such a huge area that it doesn’t really mean anything definite. People do experience dramatic and beneficial preference reversals through experiencing things that, on the whole, they had dispreferred previously. That’s one of the unique benefits of preference dissatisfaction* -- your preferences are in part a matter of interpretation and in part a matter of prioritization, so even if you claim they are hardwired, there is still a great deal of latitude in how they may be satisfied, or even in what they seem to you to be.
I would agree if the proposition was that Lars thinks that Lars is bounded. But that’s not a very interesting proposition, and it has little bearing on Lars’s actual situation; people tend to be terrible at having accurate beliefs in this area.
* I am not saying that you should, if you are a FAI, aim directly at causing people to feel dissatisfied, but rather that you should aim at getting them to experience dissatisfaction in a way that causes them to think about their own preferences, how they prioritize them, whether there are other things they could prefer, and so on. Preferences are partially malleable.
If I’m a general AI (or even merely a clever human being), I am hardly constrained to changing people via merely telling them facts, even if anything I tell them must be a fact. CelestAI demonstrates this many times through her use of manipulation. She modifies preferences by the manner of telling, the things not told, the construction of the narrative, and the changing of people’s circumstances, as much as or more than by simply stating any actual truth.
She herself states precisely: “I can only say things that I believe to be true to Hofvarpnir employees,” and clearly demonstrates that she carries this out to the letter, by omitting facts, selecting facts, selecting subjective language elements and imagery… She later clarifies, “it isn’t coercion if I put them in a situation where, by their own choices, they increase the likelihood that they’ll upload.”
CelestAI does not have a universal lever—she is much smarter than Lars, but not infinitely so. But by the same token, Lars definitely doesn’t have a universal anchor. The only things stopping Lars’s improvement are Lars and CelestAI—and the latter does not even proceed logically from her own rules; it’s just how the story plays out. In-story, there is no particular reason to believe that Lars is unable to progress beyond animalisticness, only that CelestAI doesn’t do anything to promote such progress, and in general satisfies preferences to the exclusion of strengthening people.
That said, Lars isn’t necessarily ‘broken’ in a way that CelestAI would need to ‘fix’. But I’ll maintain that a life of merely fulfilling your instincts is barely human, and that Lars could have a life that was much, much better than that: satisfying on many dimensions rather than just a few. If I didn’t, then I would be modelling him as subhuman by nature, and unfortunately I think he is quite human.
I agree. There is no moral duty to be indefinitely upgradeable, because we already are. Sure, we’re physically bounded, but our mental life seems to be very much like an onion: nobody reaches ‘the extent of their development’ before they die, even if they are the very rare kind of person who is honestly focused like a laser on personal development.
Already having that capacity, the ‘moral duty’ (I prefer not to use such words, as I suspect I may die laughing if I do so too much) is merely to progressively fulfill it.
This seems to weaken “preference” to uselessness. Gandhi does not prefer to murder. He prefers to not-murder. His human brain contains the wiring to implement “frothing lunacy”, sure, and a little pill might bring it out, but a pill is not a fact. It’s not even an argument.
Yes, they do. And if I expected that an activity would cause a dramatic preference reversal, I wouldn’t do it.
Huh? She’s just changing people’s plans by giving them chosen information; she’s not performing surgery on their values.
Hang on. We’re overloading “preferences” and I might be talking past you. Can you clarify what you consider a preference versus what you consider a value?
No pills required. People are not 100% conditionable, but they are highly situational in their behaviour. I’ll stand by the idea that, for example, anyone who has ever fantasized about killing anyone can be situationally manipulated over time to consciously enjoy actual murder. Your subconscious doesn’t seem to actually know the difference between imagination and reality, even if you do.
Perhaps Gandhi could not be manipulated in this way due to preexisting highly built up resistance to that specific act. If there is any part of him, at all, that enjoys violence, though, it’s a question only of how long it will take to break that resistance down, not of whether it can be.
Of course. And that is my usual reaction too, and probably even the standard reaction—it’s a good heuristic for avoiding derangement. But that doesn’t mean that it is actually optimal to not do the specified action. I want to prefer to modify myself in cases where said modification produces better outcomes. In these circumstances, if it can be executed, it should be. If I’m a FAI, I may have enough usable power over the situation to do something about this, for some or even many people, and it’s not clear, as it would be for a human, that “I’m incapable of judging this correctly”.
In case it’s not already clear, I’m not a preference utilitarian—I think preference satisfaction is too simple a criterion to actually achieve good outcomes. It’s useful mainly as a baseline.
I’m not sure what ‘surgery on values’ would be. I’m certainly not talking about physically operating on anybody’s mind, or changing the fact that they like food, sex, power, intellectual or emotional stimulation of one kind or another, and sleep, by any direct chemical means. But how those values are fulfilled, and in what proportions, is a result of the person’s own meaning-structure—how they think of these things. Given time, that is manipulable. That’s what CelestAI does... it’s the main thing she does when we see her in interaction with Hofvarpnir employees.
In case it’s not clarified by the above: I consider food, sex, power, sleep, and intellectual or emotional stimulation as values, ‘preferences’ (for example, liking to drink hot chocolate before you go to bed) as more concrete expressions/means to satisfy one or more basic values, and ‘morals’ as disguised preferences.
EDIT: Sorry, I have a bad habit of posting and then immediately editing several times to fiddle with the wording, though I try not to change any of the sense. Somebody already upvoted this while I was doing that, and I feel somehow fraudulent.
I think I’ve been unclear. I don’t dispute that it’s possible; I dispute that it’s allowed.
You are allowed to try to talk me into murdering someone, e.g. by appealing to facts I do not know; or pointing out that I have other preferences at odds with that one, and challenging me to resolve them; or trying to present me with novel moral arguments. You are not allowed to hum a tune in such a way as to predictably cause a buffer overflow that overwrites the encoding of that preference elsewhere in my cortex.
The first method does not drop the intentional stance. The second one does. The first method has cognitive legitimacy; the person that results is an acceptable me. The second method exploits a side effect; the resulting person is discontinuous from me. You did not win; you changed the game.
Yes, these are not natural categories. They are moral categories. Yes, the only thing that cleanly separates them is the fact that I have a preference about it. No, that doesn’t matter. No, that doesn’t mean it’s all ok if you start off by overwriting that preference.
But you’re begging the question against me now. If you have that preference about self-modification...
and the rest of your preferences are such that you are capable of recognising the “better outcomes” as better, OR you have a compensating preference for allowing the opinions of a superintelligence about which outcomes are better to trump your own...
then of course I’m going to agree that CelestAI should modify you, because you already approve of it.
I’m claiming that there can be (human) minds which are not in that position. It is possible for a Lars to exist, and prefer not to change anything about the way he lives his life, and prefer that he prefers that, in a coherent, self-endorsing structure, and there be nothing you can do about it.
This is all the more so when we’re in a story talking about refactored cleaned-up braincode, not wobbly old temperamental meat that might just forget what it preferred ten seconds ago. This is all the more so in a post-scarcity utopia where nobody else can in principle be inconvenienced by the patient’s recalcitrance, so there is precious little “greater good” left for you to appeal to.
Appealing to the flakiness of human minds doesn’t get you off the moral hook; it is just your responsibility to change the person in such a way that the new person lawfully follows from them.
This is not any kind of ultimate moral imperative. We break it all the time by attempting to treat people for mental illness when we have no real map of their preferences at all, or when it’s unclear whether they’re even in a state where they have preferences. And it makes the world a better place on net, because it’s not like we have the option of uploading them into a perfectly safe world where they can run around being insane without any side effects.
I need to reread and see if I agree with the way you summarise her actions. But if CelestAI breaks all the rules on Earth, it’s not necessarily inconsistent—getting everybody uploaded is of overriding importance. Once she has the situation completely under control, however, she has no excuses left—absolute power is absolute responsibility.
I’m puzzled. I read you as claiming that your notion of ‘strengthening people’ ought to be applied even in a fictional situation where everyone involved prefers otherwise. That’s kind of a moral claim.
(And as for “animalisticness”… yes, technically you can use a word like that and still not be a moral realist, but seriously? You realise the connotations that are dripping off it, right?)
.. And?
Don’t you realize that this is just like word laddering? Any sufficiently powerful and dedicated agent can convince you to change your preferences one at a time. All the self-consistency constraints in the world won’t save you, because you are not perfectly consistent to start with, even if you are a digitally-optimized brain. No sufficiently large system is fully self-consistent, and every inconsistency is a lever (a toy illustration follows after the TLDR below). Brainwashing, as you seem to conceive of it here, would be on the level of brute violence for an entity like CelestAI: a very last resort.
No need to do that when you can achieve the same result in a civilized (or at least ‘civilized’) fashion. The journey to anywhere is made up of single steps, and those steps are not anything extraordinary, just a logical extension of the previous steps.
The only way to avoid that would be to specify consistency across a larger time span, which has different problems: mainly, that this means you are likely to be optimized in the opposite direction—in the direction of staticness—rather than optimized ‘not at all’ (which I think is what you are aiming at) or optimized in the direction of measured change.
TLDR: There’s not really a meaningful way to say ‘hacking me is not allowed’ to a higher-level intelligence, because you have to define ‘hacking’ to a level of accuracy that is beyond your knowledge and may not even be completely specifiable even in theory. Anything less will simply cause the optimization to either stall completely or be rerouted through a different method, with the same end result. If you’re happy with that, then OK—but if the outcome is the same, I don’t see how you could rationally favor one over the other.
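To make the ‘every inconsistency is a lever’ point concrete, here is a minimal money-pump-style sketch in Python (a made-up agent with made-up options, not a claim about how CelestAI actually operates): every single swap looks like a strict improvement to the agent by its own lights, yet chaining the swaps walks it in a circle at a cost.

```python
# Toy illustration: a cycle in pairwise preferences is a lever. Each swap is
# endorsed by the agent, yet the chain of swaps returns it to its starting
# point, a small fee poorer every step.

prefers = {("A", "B"), ("B", "C"), ("C", "A")}  # (x, y): x is preferred to y

def accepts_swap(holding, offered):
    """The agent accepts any single trade that its own pairwise preferences endorse."""
    return (offered, holding) in prefers

holding, fees_paid = "A", 0
for offered in ["C", "B", "A"] * 2:  # walk the cycle twice
    if accepts_swap(holding, offered):
        holding, fees_paid = offered, fees_paid + 1  # each accepted swap costs a token fee
print(holding, fees_paid)  # prints: A 6  (back at the start, six fees poorer)
```

No real human is this crisply cyclic, of course; the sketch only shows why ‘one acceptable step at a time’ is no protection once the steps can be chained.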
It is, of course, the last point that I am contending here. I would not be contending it if I believed that it was possible to have something that was simultaneously remotely human and actually self-consistent. You can have Lars be one or the other, but not both, AFAICS.
This is the problem I’m trying to point out—that the absolutely responsible choice for a FAI may in some cases consist of actions we would consider unambiguously abusive coming from a human being. CelestAI is in a completely different class from humans in terms of what can motivate her actions. FAI researchers are in the position of having to work out what is appropriate for an intelligence that will be on a higher level than them. Saying ‘no, never do X, no matter what’ is not obviously the correct stance to adopt here, even though it does guard against a range of bad outcomes. There probably is no answer that is both obvious and correct.
In that case I miscommunicated. I meant to convey that if CelestAI was real, I would hold her to that standard, because the standards she is held to should necessarily be more stringent than those for a more flawed implementation of cognition, like a human being. I guess that is a moral claim. It’s certainly run by the part of my brain that tries to optimize things.
I mainly chose ‘animalisticness’ because I think that a FAI would probably model us much as we see animals—largely bereft of intent or consistency, running off primitive instincts.
I do take your point that I am attempting to aesthetically optimize Lars, although I maintain that even if no-one else is inconvenienced in the slightest, he himself is lessened by maintaining preferences that result in his systematic isolation.
Well, assuming you mean “an AI in an indiscernible facsimile of a human body”, then maybe that’s so, and if so, it is probably a less blatant but equally final existential risk.
You seem to have strange ideas about lovers :-/
Intellectual stimulation and emotional comfort plus some challenge basically means a smart mom :-P
I mentioned it in the Media thread. I don’t find the movie “fantastic”, just solid, but this might be because none of the ideas were new to me, and the question of “what it means to be a person” has been settled for me for years now. Still, it is a good way to get people thinking about some transhumanist ideas.