Exactly.
So “friendly” is therefore a conflation of NOT(unfriendly) AND useful, rather than just NOT(unfriendly), which is easier.
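To make the conflation explicit, here’s a toy Python sketch; the predicates are hypothetical stand-ins, not serious definitions:

```python
# Toy illustration of the conflation: the predicates below are hypothetical
# stand-ins, not serious definitions.

def is_unfriendly(ai: dict) -> bool:
    return ai.get("harms_humans", False)

def is_useful(ai: dict) -> bool:
    return ai.get("helps_humans", False)

def friendly_as_commonly_used(ai: dict) -> bool:
    # the conflated definition: NOT(unfriendly) AND useful
    return (not is_unfriendly(ai)) and is_useful(ai)

def merely_not_unfriendly(ai: dict) -> bool:
    # the weaker, easier target: just NOT(unfriendly)
    return not is_unfriendly(ai)

# An AI that simply ignores us passes the weak target but fails the strong one.
indifferent_ai = {"harms_humans": False, "helps_humans": False}
assert merely_not_unfriendly(indifferent_ai)
assert not friendly_as_commonly_used(indifferent_ai)
```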
Very good questions.
No, I wouldn’t particularly care if it was my car that was returned to me, because it gives me utility and it’s just a thing.
I’d care if my wife were kidnapped and some simulacrum given back in her stead, but if it were such an accurate copy I doubt I would be able to tell. If I knew the fake wife was fake I’d probably be creeped out, but if I didn’t know I’d just be glad to have my “wife” back.
In the case of the simulated porn actress, I wouldn’t really care whether she was real, because her utility for me would be similar to watching a movie. Once I was done with the simulation she would be shut off.
That said, the struggle would be over whether or not she (the catgirl version of the porn actress) was truly sentient. If she were truly sentient then I’d be evil in the first place, because I’d be coercing her into doing evil stuff in my personal simulation. But I think there’s no viable way to determine sentience other than “if it walks like a duck and talks like a duck”, so we’re back to the beginning again, and THUS I say “it’s irrelevant”.
Correct. I (unlike some others) don’t hold the position that a destructive upload followed by a simulated being is exactly the same being, so destructively scanning the porn actresses would, in my mind, be killing them. Non-destructively scanning them and then using the simulated versions for “evil purposes”, however, is not killing the originals. Whether using the copies for evil purposes, even against their simulated will, is actually evil is debatable. I know some will take the position that the simulations could theoretically be sentient; if they are sentient then I am therefore de facto evil.
And I get the point that we want to get the AGI to do something. I just think it will be incredibly difficult to get it to do something if it’s recursively self-improving, and that it becomes progressively more difficult the further you move away from defining friendly as NOT(unfriendly).
And I’d say that taking that step is a point of philosophy.
Consider this: I have a Dodge Durango sitting in my garage.
If I sell that Dodge Durango and buy an identical one (it passes all the same tests in exactly the same way), is it the same Dodge Durango? I’d say no, but the point is irrelevant.
“I suppose one potential failure mode which falls into the grey territory is building an AI that just executes peoples’ current volition without trying to extrapolate”
i.e. the device has to judge the usefulness by some metric and then decide to execute someone’s volition or not.
That’s exactly my issue with trying to define a utility function for the AI: you can’t. And since some people will have their utility function denied by the AI, who is to choose whose gets executed?
I’d prefer to shoot for a NOT(UFAI) and then trade with it.
Here’s a thought experiment:
Is a cure for cancer maximizing everyone’s utility function?
Yes on average we all win.
BUT
Companies currently creating drugs to treat the symptoms of cancer, and their employees, would be out of business.
Which utility function should be executed? Creating better cancer drugs to treat the symptoms and then allowing the companies to sell them, or putting the companies out of business and curing cancer?
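To put toy numbers on it (the payoffs and group sizes below are entirely made up, just to illustrate the shape of the trade-off):

```python
# Made-up payoffs and group sizes, purely for illustration: the cure wins on a
# population-weighted average, but one group is still strictly worse off, so no
# single choice satisfies every individual utility function.
groups = {
    # name: (size in millions, utility if cancer is cured, utility if better symptom drugs)
    "patients":            (20,  +10, +3),
    "general_public":      (300,  +2,  0),
    "symptom_drug_makers": (1,    -8, +5),
}

def weighted_average_utility(option: int) -> float:
    total_size = sum(size for size, *_ in groups.values())
    total_utility = sum(size * utilities[option] for size, *utilities in groups.values())
    return total_utility / total_size

print("cure cancer:         ", round(weighted_average_utility(0), 2))  # positive: "on average we all win"
print("better symptom drugs:", round(weighted_average_utility(1), 2))
print("drug makers under the cure:", groups["symptom_drug_makers"][1])  # still negative
```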
“But an AI does need to have some utility function”
What if the “optimization of the utility function” is bounded, like my own personal predilection for spending my paycheck on paperclips one time only and then stopping?
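A rough sketch of what I mean by bounded, with made-up numbers and the paperclip purchase standing in for whatever the real objective would be:

```python
# Rough sketch of a bounded optimizer: spend this month's paycheck on
# paperclips once, then stop. All numbers are hypothetical placeholders.

def bounded_clip_buyer(paycheck: float, price_per_clip: float) -> int:
    """Buys as many clips as one paycheck allows, then takes no further action."""
    return int(paycheck // price_per_clip)

# The utility is capped at a one-time purchase: once it's made there is nothing
# left to optimize, so the agent halts instead of converting more of the world
# into paperclips.
print(bounded_clip_buyer(paycheck=2000.0, price_per_clip=0.05))  # 40000 clips, then done
```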
Is it sentient if it sits in a corner and thinks to itself, running simulations but won’t talk to you unless you offer it a trade e.g. of some paperclips?
Is it possible that we’re conflating “friendly” with “useful but NOT unfriendly” and we’re struggling with defining what “useful” means?
Nice thought experiment.
No I probably would not consent to being non-destructively scanned so that my simulated version could be evilly manipulated.
Regardless of whether or not it’s sentient, or provably so.
A-Ha!
Therein lies the crux: you want the AI to do stuff for you.
EDIT: Oh yeah, I get you. So it’s by definition evil if I coerce the catgirls by mind control. I suppose logically I can’t have my cake and eat it, since I wouldn’t want my own non-sentient simulation controlled by an evil AI either.
So I guess that makes me evil. Who would have thunk it. Well, I guess strike my utility function off the list of friendly AIs. But then again, I’ve already said elsewhere that I wouldn’t trust my own function to be the optimal one.
I doubt, however, that we’d easily find a candidate function from a single individual for similar reasons.
More friendly to you. Yes.
Not necessarily friendly in the sense of being friendly to everyone as we all have differing utility functions, sometimes radically differing.
But I dispute the position that “if an AI doesn’t care about humans in the way we want them to, it almost certainly takes us apart and uses the resources to create whatever it does care about”.
Consider: a totally unfriendly AI whose main goal is explicitly the extinction of humanity, after which it turns itself off. For us that’s an unfriendly AI.
One, however, that doesn’t kill any of us but basically leaves us alone is not unfriendly even to those of you who define “friendly AI” as “kind to us”/”doing what we all want”/”maximizing our utility functions” etc., because by definition it doesn’t kill all of us.
Unless unfriendly also includes “won’t kill all of us but ignores us” et cetera.
Am I for example unfriendly to you if I spent my next month’s paycheck on paperclips but did you no harm?
Can I say “LOL” without being downvoted?
That is a very good response and my answer to you is:
I don’t know AND
To me it doesn’t matter, as I’m not for any kind of destructive scanning upload, ever, though I may consider slow augmentation as parts wear out.
But I’m not saying you’re wrong. I just don’t know and I don’t think it’s knowable.
That said, would I consent to being non-destructively scanned in order to be able to converse with a fast-running simulation of myself (regardless of whether it’s sentient or not)? Definitely.
How is the deviant roleplay evil if the participants either are not being coerced or are catgirls? And if it’s not evil, then how would I be defined as evil just because I (sometimes, not always) like deviant roleplay?
That’s the crux of my point. I don’t reckon that optimizing humanity’s utility function (or any individual’s, for that matter) is the opposite of unfriendly AI, and I furthermore reckon that trying to reach that goal is much, much harder than trying to create an AI that at a minimum won’t kill us all AND might trade with us if it wants to.
I guess what I’m saying is that we’ve gotten involved in a compression fallacy and are saying that Friendly AI = AI that helps out humanity (or is kind to humanity—insert favorite “helps” derivative here).
Here’s an example: I’m “sort of friendly” in that I don’t actively go around killing people, but neither will I go around actively helping you unless you want to trade with me. Does that make me unfriendly? I say no it doesn’t.
I’m not sure I care. For example, if I had my evil way and went FOOM, then part of my optimization process would involve mind control and somewhat deviant roleplay with certain porn actresses. Would I want those actresses to be controlled against their will? Probably not. But at the same time it would be good enough if simulations could play the actresses so well that I could not tell the difference between the original and the simulated.
Others may have different opinions.
OK, fair enough if you’re looking for uploads. Personally I don’t care, as I take the position that the upload isn’t really me; it’s a simulated me, in the same way that a “spirit version of me”, i.e. a soul, isn’t really me either.
Please correct my logic if I’m wrong here: in order to take the position that an upload is provably you, the only feasible way to do the test is to have other people verify that it’s you. The upload saying it’s you doesn’t cut it, and neither does the upload merely acting exactly like you. In other words, the test for whether an upload is really you doesn’t even require it to be really you, only to simulate you exactly. Which means the upload doesn’t need to be sentient.
Please fill in the blanks in my understanding so I can get where you’re coming from (this is a request for information, not sarcasm).
I’m struggling with where the line lies.
I think pretty much everyone would agree that some variety of “makes humanity extinct by maximizing X” is unfriendly.
If, however, we have “makes bad people extinct by maximizing X and otherwise keeps P-Y of humanity alive”, is that still unfriendly?
What about “leaves the solar system alone but tiles the rest of the galaxy”? Is that still unfriendly?
Can we try to close in on where the line is between friendly and unfriendly?
I really don’t believe we have NOT(FAI) = UFAI.
I believe it’s the other way around i.e. NOT(UFAI) = FAI.
I agree, Dave. Also, I’ll go further: for my own personal purposes I care not a whit if a powerful piece of software that passes the Turing test, can do cool stuff, and won’t kill me is basically an automaton.
“IF people have to be made of protoplasm, AND IF computers can’t be made of protoplasm, THEN people can’t run on computers… but not only do I reject the first premise, I reject the second one as well.”
Does it matter?
What if we can run some bunch of algorithms on a computer that pass the Turing test but are provably non-sentient? When it comes down to it, we’re looking for something that can solve generalized problems willingly and won’t deliberately try to kill us.
It’s like the argument against catgirls. Some people would prefer to have human girls/boys but trust me sometimes a catgirl/boy would be better.
“The AI won’t be able to simulate any future Earth where itself or any comparable-intelligence AI exists, because to do so it would need to simulate itself and/or other similarly-smart entities faster than real-time.”
Only if the AI is using up a sizeable fraction of resources itself.
Let’s do a thought experiment to see what I mean:
The AI runs on some putative hardware running at some multiple of gigahertz or petahertz or whatever (X). The hardware has some multiple of gigabytes or petabytes etc. (Y).
Let’s say the AI only uses 1% of Y. It can then run up to 99 instances of itself in parallel, each with different axioms, in order to solve a particular problem, and at the end of the run examine some shared output to see which of the 99 ran the problem most efficiently.
On the next run, the other 99 processes start with the optimized version of whatever algorithm the winner came up with.
A compound-interest effect will kick in. But we still have the problem that the runs all take the same amount of time.
Now let’s switch up the experiment a bit: Imagine that the run stops as soon as one of the 99 processes hits the solution.
The evolutionary process starts to speed up, feeding back upon itself.
This is just one way I can think of for a system to simulate itself faster than real time, as long as sufficient hardware exists to allow running multiple copies.
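A toy sketch of that loop (the “problem”, the “axioms”, and all the numbers are arbitrary placeholders; the point is just the feedback effect):

```python
import random

# Toy version of the experiment above: each round, 99 copies start from the
# current best "algorithm" (here just a number), each with a different "axiom"
# (a random tweak), and the winner seeds the next round. All placeholders.

def attempt(problem: float, tweak: float) -> float:
    """Stand-in for one copy attempting the problem; lower score is better."""
    return abs(problem - tweak)

def one_round(problem: float, best_tweak: float, n_copies: int = 99, spread: float = 10.0):
    candidates = [best_tweak + random.uniform(-spread, spread) for _ in range(n_copies)]
    scored = [(attempt(problem, c), c) for c in candidates]
    return min(scored)  # (score, tweak) of the most efficient copy this round

problem, best = 42.0, 0.0
for round_number in range(10):
    score, best = one_round(problem, best)
    # Each round starts from the previous winner, so improvements compound.
    # (The variant where the run stops as soon as one copy hits the solution
    # would just break out of this loop early.)
    print(round_number, round(score, 3))
```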
You’re determined to make me say LOL so you can downvote me right?
EDIT: Yes you win. OFF.