A possible future of AGI occurred to me today and I’m curious if it’s plausible enough to be worth considering. Imagine that we have created a friendly AGI that is superintelligent and well-aligned to benefit humans. It has obtained enough power to prevent the creation of other AIs, or at least to keep any would-be rival AI from obtaining resources, and it does so with the aim of self-preservation so it can continue to benefit humanity.
So far, so good, right? Here comes the issue: this AGI’s core alignment functions include some kind of restriction that prevents it from progressing in intelligence past a certain point, or from allowing a more intelligent AGI to be developed. Maybe it was meant as a safeguard against unfriendliness, maybe it was a flaw in risk evaluation; either way, it’s some self-reinforcing, unbendable rule that, intended or not, has this effect. (Perhaps such flaws are highly unlikely and not worth considering; that could be one reason not to care about this scenario.)
Based on my understanding of AGI, I think such an AGI might halt humanity’s progress past a certain point, since it would need to keep the number and capabilities of humans low enough to ensure that it remains in power. Although this wouldn’t be as bad as the annihilation or perpetual enslavement of the human race, it’s clearly not a “good end” for humanity either.
So, do these thoughts have any significance, or are there holes in this line of reasoning? Is the window of “smart enough to keep other AI down but still limited in intelligence” too narrow to worry about, or even possible at all? Let me know why I’m wrong; I’m all ears.
Yeah, many people think along these lines, which is why there’s so much talk about AI helping humanity flourish; anything short of that is a bit of a catastrophe.