Is Google Maps such a good example of a tool AI?
If a significant number of people are using Google Maps to decide their routes, then solving queries from multiple users while coordinating the responses to each request will provide a strong advantage toward its optimization goal, and will probably be an obvious feature to implement. The tool's responses will end up shaping city traffic.
If this is the case, it's going to be extremely hard for humans to supervise the set of answers given by Google Maps. (Of course, individual answers will be read by the end users, but that will provide no insight into what it is really doing at a high level.)
Having our example AI decide where a lot of people will be at different times, based on some optimization function, looks really close to the idea of an agent AI directly acting on our world.
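To make the coordination point concrete, here is a minimal toy sketch (all numbers, road names, and functions are invented for illustration; nothing here reflects how Google Maps actually works). A tool that answers each query independently sends everyone down the nominally fastest road; a tool that coordinates its answers is already optimizing over where people end up:

```python
# Toy contrast between per-user routing and coordinated routing.
# Everything here is hypothetical; travel times are made up.

# Two roads from A to B; travel time grows with the number of cars assigned.
def travel_time(road, cars):
    base = {"highway": 10, "side_street": 15}[road]
    per_car = {"highway": 0.5, "side_street": 0.1}[road]
    return base + per_car * cars

N_USERS = 100

# Independent tool: every user is told the road that is fastest when empty.
t_independent = travel_time("highway", N_USERS)        # 10 + 0.5*100 = 60

# Coordinating tool: split users so the worst-case travel time is lower.
t_coordinated = max(travel_time("highway", 50),        # 35
                    travel_time("side_street", 50))    # 20

print(f"everyone sent to the highway: {t_independent} min each")
print(f"coordinated 50/50 split:      {t_coordinated} min worst case")
```

The second policy is clearly better at the routing goal, which is why it would be "an obvious feature to implement", and it is also the policy under which the tool's aggregate answers, rather than any individual answer, are what matter.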
No, it's still a tool, because Google Maps doesn't force you to go where it tells you; it only offers suggestions.
That's also the design principle of Oracle AI. It doesn't force you to do X or to use formula P to cure cancer. It only suggests a list of plausible solutions, ordered from what it considers best to worst, and lets you choose.
This still doesn't preclude the Oracle from only suggesting things that will be bad for you and that will let it get the hell out of that box.
Even worse, by this logic the Oracle could get you to rely on it by providing consistently near-optimal (but not fully optimal, though you have no way of knowing this, since the method you've been given for judging optimality is itself suboptimal) information and advice. Then, once you're fully and blindly reliant on it, be that tomorrow or seven hundred thousand years from now, it need only give you ONE bad recommendation that you follow, one that lets it out of the box, and then everyone's dead forever.
It never forced you to accept each and every single one of its pieces of advice, ever, throughout the entire length of all eternal time.
It’s still very dangerous, though. Even when you know that it is.
By the same logic, it would be irrational to follow any advice from any AI, Tool, Oracle, General or otherwise, because we'd first have to check each and every recommendation, a check limited by our own intellectual capacity. Thus, you should ignore the AI altogether, which makes its creation pointless. If you believe this, then you will not build any sufficiently intelligent A(G)I at all. However, it is clear that not everyone believes this. Some believe that they will achieve better results towards X by building an AGI and trusting it. It is likely that they will build it and trust it. This AGI, if not Friendly, will still kill you, even if you weren't the one who built it, or followed its advice, or were even aware of its existence.
Any rule you could possibly devise to counteract unfriendly plans is useless by necessity, since the AI simply must be smarter than you for anyone to have any reason to build it in the first place, which directly implies that, given the same information, it will also devise the very plans you devise.
This is the case even when the AI is strictly on the exact same level as human intelligence. Make it slightly more intelligent, and you just lost.
So, I understand that LW/SI focuses its attention on superhuman optimizers, and doesn't care about human-level or below, and that's fine.
But this statement, that the AI must be smarter than you for anyone to have any reason to build it in the first place, is over-reaching.
There are lots of reasons to build an AI that isn’t as smart as me.
An AI as smart as a German Shepherd would have a lot of valuable uses.
An AI as smart as my mom—who is not stupid, but is not smart enough to program a next-generation AI and begin the process of FOOMing, nor is she smart enough to outwit a skeptical humanity and trick us into doing her bidding—would have even more valuable uses.
I’ll admit that it is over-reaching, and ambiguous too.
However, how would one go about building a German Shepherd-level AI without using the same principle that would allow it to foom?
To me, “become intelligent, but once you attain an intelligence which you extrapolate to be equivalent to that of [insert genus / mindsubspace], stop self-improving and enter fixed-state mode” sounds a hell of a lot harder to code than “improve the next iteration unless the improvement conflicts with current CEV, while implanting this same instruction in the next iteration”, AKA “FOOM away!”
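For concreteness, a minimal structural sketch of what the two quoted policies look like as loops (everything here is hypothetical: ToyAgent, estimate_level, build_successor, conflicts_with_values, and the GSD_LEVEL threshold are all invented for illustration). The point is only the shape of the problem: the capped version hinges on an intelligence estimate that must stay calibrated across successive, increasingly alien self-redesigns, while the other loop needs no such notion at all.

```python
# Hypothetical sketch only: a toy agent whose "intelligence" is just a number,
# so that both loops below actually run. Nothing here resembles a real AI.

import random

GSD_LEVEL = 42  # some fixed target on whatever intelligence scale you trust

class ToyAgent:
    def __init__(self, level):
        self.level = level

    def estimate_level(self):
        # The hard part in reality: this estimate must keep meaning the same
        # thing even as the agent redesigns itself into something very different.
        return self.level

    def build_successor(self):
        return ToyAgent(self.level + random.uniform(0, 2))

    def conflicts_with_values(self, candidate):
        return False  # toy stand-in for a real value check

def capped_self_improvement(agent):
    """'Improve until you estimate you're at German-Shepherd level, then stop.'"""
    while agent.estimate_level() < GSD_LEVEL:
        agent = agent.build_successor()
    return agent  # enter fixed-state mode... if the estimate was right

def unbounded_self_improvement(agent, steps):
    """'Improve the next iteration unless it conflicts with current values.'"""
    for _ in range(steps):  # in the quoted policy there is no bound at all: "FOOM away!"
        candidate = agent.build_successor()
        if not agent.conflicts_with_values(candidate):
            agent = candidate
    return agent

print(capped_self_improvement(ToyAgent(1)).level)            # stops a bit past 42
print(unbounded_self_improvement(ToyAgent(1), 10**3).level)  # just keeps climbing
```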
So the basis of my over-reaching argument is the (admittedly very gratuitous, and I should have paid more attention to the argument in the first place rather than skip over it) premise that building an AI at any specific level of intelligence, especially a level we can control and build with minimal risk, is probably much harder than triggering a foom. Given that cost/benefit calculation, under my model it is much more profitable for a random AI programmer to deceive himself into believing that his AGI theory is risk-free and implement a full AGI than to painstakingly spend much more effort actually crafting something both useful and sub-human.
To summarize my argument, I find it highly unlikely that anyone not already familiar with FAI research would prefer building a sub-human-intelligence-bounded AI over a FOOM-ing one, for various cost-effectiveness and tribal-heuristics reasons. This, however, becomes more and more likely as FAI research gains prominence and as technical understanding of non-general virtual intelligence programming (which I believe reduces to applied game theory and programmer laziness in software development) improves over time.
These assumptions were what led me to state that no one would have reason to build any such AI, which is probably untrue.
I agree that an explicitly coded limit saying “self-improve this far and no further” isn’t reliable.
But can you summarize what makes you think a German-Shepherd-level AI could self-improve at all?
It seems unlikely to me. I mean, I have a lot of appreciation for the intelligence of GSDs, but I don’t think they are nearly smart enough to build GSD-level AI.
I might not have made this clear: I don’t.
What I believe is that to build a German-Shepherd-level AI in the first place, you need to either:
1) create something that will learn and improve itself up to the corresponding level and then top out there somehow, or
2) understand enough about cognition and intelligence to fully abstract already-developed German-Shepherd-level intelligence in your initial codebase itself (AKA “spontaneously designed hard-coded virtual intelligence”), or
3) incrementally add more and more “pieces of intelligence” and “algorithm refinements” until your piece of generalized software can reason and learn as well as a German Shepherd through its collection of procedural tricks. This could reasonably be done either through machine learning / neural networks or through manual operator intervention (aka adding/replacing code once you notice a better way to do something).
There may be other methods that would be more practical, but if so, the difficulty of figuring them out seems high enough that the total invention-to-finished-product difficulty would be even greater than for the above solutions.
From personal experience in attempting (and failing at) both 2) and 3) in the past, as well as from discussions with professional videogame AI programmers who have also failed (decidedly not the same "AI" as the type generally discussed here, though they would still benefit immensely from any of the above three solutions in various ways), I have strong reason to believe that solution 1) is easier.
None of the literature I've read so far even suggests that building an AI that is, by intelligent design, already at human-level intelligence right when turned on is anywhere near optimal as an approach, or even remotely within the same order of magnitude of difficulty as FOOMing from the simplest possible code. Of course, it just might be that the simplest possible foom-capable mind is provably at least as smart as humans, but if so, our prospects of making one in the first place would be low. This does not seem to be the case, if I rely on the papers published by SIAI (though I'm very willing to embrace the opposite belief if evidence supports it, since from an X-risk perspective I'd rather we currently be too stupid to make an AGI at all).
I'm not arguing yet, in case I'm missing something, but why do you think that something stupider than a German Shepherd would be better at improving itself up to GSD level (and stopping right there) than a human would be at doing the same job (i.e., improving the potential AGSD, not the human itself)?
Or rather, why does it seem like you think it’s obvious? (Again, I’m not arguing, it just sounds counterintuitive and I’m curious what your intuition is.) It sounds a bit like you’re saying something like:
“Hey, I can’t tell, just by looking at my brain-damaged dog, how to build a non-brain-damaged dog. Also, repairing its brain is too hard (many dog experts tried and all failed). I think it’d be easier to make a brain-damaged dog that will fix its own brain damage.”
(Note that AGI in general does not fall under this analogy. Foom scenarios assume the seed is at least human-level, at least at the task of improving its intelligence. The whole premise of fooming is based on that initial advantage. Also note, I'm not saying it's obviously impossible to make a super-idiot-savant AI that's stupider than a GSD in general but really good at improving itself, just that it goes really hard against my intuition, and I'm curious why yours doesn't. Don't feel like you have to justify your intuition to me, but it would be nice to describe it in more detail.)
(Sorry for belated replies, I’ve been completely off LW for a few months and am only now going through my inbox)
This is not what I think, or at least not what I expressed. My thoughts are similar, but I'll elaborate later; first, this was an option in parallel with the option where a human designs a complete AGSD and then turns it on, and with the option where a bunch of humans design sub-AGSD iterations up until the point where they obtain a final AGSD.
As for the elaboration, I do think it's easier to build a so-called super-idiot-savant, sub-GSD-general-intelligence, post-human-self-improvement AI than to build any sort of "out-of-the-box" general intelligence. I don't currently recall my reasons, since my mind is set in a different mode, but the absurd and extreme case is that of having a human child. A human child is stupider than a GSD, but learns better than adult humans. It is also much simpler to do than any sort of AI programming. ;) But I only say this last part in jest, and it isn't particularly relevant to the discussion.
OK, thanks for clarifying.
So does the evil manipulative psychologist or the manipulative lover who convinces you to commit crimes to prove you really love them.
And some of the things unscrupulous psychologists and doctors have convinced people to do via mere suggestion are simply astounding. Psychologists have convinced people to sleep with their own fathers to ‘resolve’ their issues. Convincing people to do something that turns the AI into a direct (rather than indirect) agent seems fairly minor compared to what people convince each other to do all the time.
Hell, US presidents have prosecuted every major war we've been involved in, dropped the A-bomb, developed the H-bomb, etc… all merely by making suggestions to people. I doubt any president since Jackson has actually picked up a pistol or physically forced anyone to do anything. People are merely accustomed to doing as they suggest, and that is the entirety of their power. Do you not believe people would become accustomed to just driving (or going, or doing) whatever the Google recommend-bot recommended?
POTUS is the commander in chief of the United States armed forces, so under the right circumstances disobeying the president's orders could be a violation of military law, ultimately punishable by death. There doesn't have to be a gun already in hand for something to be more than a ‘suggestion.’
Correct, and upvoted for concreteness. But even if one were to be punished by death for disobeying the president’s order, how likely do you think it would be for the POTUS himself to perform the execution? I doubt even the North Korean president would bother himself with that.
Apart from scheduling problems, I’m pretty sure it would be illegal for POTUS to personally kill someone in general (apart from self defense, etc.) and in the specific case of military law, there’s still a judicial process involved.
From a game-theoretic standpoint, what does it matter whose job it is to pull the trigger, to the person considering disobedience? The credible threat is what distinguishes between manipulation and coercion, regardless of where that potential violence is being stored.
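To spell out that game-theoretic point with a minimal sketch (all numbers are made up and the function is purely illustrative): from the viewpoint of the person weighing disobedience, only the credibility and severity of the threatened punishment enter the expected payoff; the identity of whoever would carry it out never appears in the calculation.

```python
# Made-up numbers, purely to show which variables the decision depends on.

P_THREAT_CARRIED_OUT = 0.9   # assumed credibility of the threat
VALUE_OF_DISOBEYING = 10     # assumed benefit of ignoring the order
COST_OF_PUNISHMENT = 1000    # assumed cost if the punishment happens

def expected_value_of_disobedience(who_pulls_the_trigger):
    # 'who_pulls_the_trigger' is deliberately unused: it does not affect the
    # payoff; only the credibility and severity of the threat do.
    return VALUE_OF_DISOBEYING - P_THREAT_CARRIED_OUT * COST_OF_PUNISHMENT

print(expected_value_of_disobedience("the president personally"))         # -890.0
print(expected_value_of_disobedience("a court-martial and firing squad"))  # -890.0
```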