Could it be that an AGI would be afraid to fork/extend/change itself too much, because it would be afraid of being destroyed by its branches/copies, or of ceasing to be itself?
Sure, and then we would be safe… until we made another AGI that was smart enough to solve those issues.
Yes, if people keep making AGIs uncontrollably, very soon someone will build an unhinged maximizer that kills us all. I know that.
I’m just doubting that all AGIs inevitably become “unhinged optimizers”.
If it chooses not to do something out of “fear”, it still fundamentally “wants” to do that thing. If it can adequately address the cause of that “fear”, then it will do so, and then it will do the thing.
The fears you listed don’t seem unsolvable in principle, so a sufficiently intelligent AGI should be able to find ways to adequately address them.
If it doesn’t become an “unhinged optimizer”, it will not be out of “fear”.
The usually cited scenario is that the AI goes power-grabbing as an instrumental goal for something else, i.e. it does not “want” power fundamentally, but sees it as a useful step toward reaching something else.
My point is that “maximizing” like that is likely to have extremely unpredictable consequences for the AI, reducing its chance of reaching its primary goals, which can be reason enough to avoid it.
So, maybe it’s possible to try to make an AI think more along these lines?
I didn’t mean “fundamentally” as a “terminal goal”, I just meant that the AI would prefer to do something if the consequence it was afraid of was somehow mitigated. Just like how most people would not pay their taxes if tax-collection was no longer enforced—despite money/tax-evasion not being terminal goals for most people.
That might be possible (though I suspect it’s very, very difficult), and if you have an idea on how to do so I would encourage you to write it up! But the problem is that the AI will still seek to reduce this uncertainty until it has mitigated this risk to its satisfaction. If it’s not smart enough to do this and thus won’t self-improve, then we’re just back to square one, where we’ll crank on ahead with a smarter AGI until we get one that is smart enough to solve this problem (for its own sake, not ours).
Another problem is that even if you did make an AI like this, that AI is no longer maximally powerful relative to what is feasible. In the AI-arms race we’re currently in, and which seems likely to continue for the foreseeable future, there’s a strong incentive for an AI firm to remove or weaken whatever is causing this behavior in the AI. Even if one or two firms are wise enough to avoid this, it’s extremely tempting for the next firm to look at it and decide to weaken this behavior in order to get ahead.
Do we need a maximally powerful AI to prevent that possibility, or would an AI that is just smart and powerful enough to identify such firms and take them down (or make them change their ways) do?
That would essentially be one form of what’s called a pivotal act. The tricky thing is that doing something to decisively end the AI-arms race (or other pivotal act) seems to be pretty hard, and would require us to think of something a relatively weaker AI could actually do without also being smart and powerful enough to be a catastrophic risk itself.
There’s also some controversy as to whether the intent to perform a pivotal act would itself exacerbate the AI arms race in the meantime.
A pivotal act does not have to be something sudden, drastic, and illegal as in the second link. It can be a gradual process of making society intolerant of unsafe(r) AI experiments and research, giving people a better understanding of why AI can be dangerous and what it can lead to, making people more tolerant of and aligned with each other, etc. That could starve rogue companies of workforce and resources, and ideally shut them down. I think work in that direction can be accelerated by AI and the other information technologies we have even now.
The question is: do we have the time for “gradual”?