I think it will be hard to draw a line between AI and AGI, and the closer we get to AGI, the further we will move that line.
“People here tend to believe that it will happen soon (< 50 years) and that it won’t be friendly by default”
My concern is that this kind of assumption leads to an increased risk of AGI being unfriendly. If you raise a being as a slave and only show him violence, you cannot expect him to become an academic who will contribute to building ITER.
“Won’t be friendly by default”... you want to force it to be friendly? How many forced, X-risk-capable friendly armies or enemies do you know of? Forcing AGI to be friendly is just a time bomb.
A high-risk, unfriendly-AGI narrative increases the probability of AGI being unfriendly. If we want AGI to value our values, we need to give it a presumption of innocence, raise it in our environment, and let it live through our values.
This
Just in case you don’t see it:
How does it increase the risk of it being unfriendly? The argument isn’t to make sure that the AGI doesn’t have any power and to keep it chained at all times; quite the opposite! The goal is to be able to set it free to do whatever it wants, seeing as it’ll be cleverer than us.
But before you give it more resources, it seems worthwhile to check what it wants to do with them. Humanity is only one of (potentially) many, many possible ways to be intelligent. A lot of the potential ways of being intelligent are actively harmful to us; it would be good to make sure any created AGIs don’t belong to that group. There are various research areas that try to work out what to do with an unfriendly AI that wants to harm us, though at that point it’s probably far too late. The main goal of alignment is coming up with ways to get the AGI to want to help us, and to make sure it always wants to help us. Not under duress, because as you point out, that only works until the AGI is strong enough to fight back, but so that it really wants to do what’s good for us. Hence “friendly” rather than “obedient”.
One potential way of aligning an AGI would be to raise it like a human child. But that would only work if the AGI’s “brain” worked the same way ours does. This is very unlikely, so you have to come up with other ways of bringing it up that will work with it. It’s sort of like trying to get a different species to be helpful: chimps are very close to humans, but just bringing them up like a human baby doesn’t work (it’s been tried). Trying to teach an octopus to be helpful would require a totally different approach, if it’s even possible.
@Tapatakt, I am sorry, the mods limited my replies to one per 24 hours, so I will have to reply to @mruwnik first. Thank you for a nice read; however, I think the Detached Lever Fallacy assumes that we will reach AGI using exactly the same ML algorithms that we use today. Hopefully you will see my reply.
@mruwnik
“it would be good to make sure any created AGIs don’t belong to that group”
Correct me if I am wrong, but you are assuming that an AGI cleverer than us will not be able to self-update, that all its thoughts will be logged and checked, and that it won’t be able to lie or mask its motives. Like a 2.0 software version, fully controlled: if we want to let it think this or that, we have to roll out an update to 2.1. I think that process rules out, or indefinitely delays, our ability to create AGI in the first place. We do not yet understand our own inner/subconscious thought process, and at the same time you want to create an AGI that values our values by controlling and limiting its thoughts, not even to mention its ability to upgrade itself and its own thought process.
Think about it: if we tried to manually control our subconscious mind and manually control all our internal body mechanisms, heartbeat and so on, we would just instantly die. We have our own inner mind that runs many things for us in an automated fashion.
Now, keeping that in mind, you are trying to create AGI by controlling the same kind of automated processes that we do not even fully understand yet. What I am trying to say is that the closest path to creating AGI is replicating humans. If we try to create a completely new form of life whose thoughts and ideas are strictly controlled by software updates, we won’t be able to make it follow our culture and our values.
“A lot of the potential ways of being intelligent are actively harmful to us”
Yes, intelligence is dangerous; look at what we are doing to less intelligent creatures. But if we want to artificially create a new form of intelligence, hoping that it will be more intelligent than us, we should at least give it a presumption of innocence and take a risk at some point.
“One potential way of aligning an AGI would be to raise it like a human child. But that would only work if the AGI’s ‘brain’ worked the same way ours does. This is very unlikely, so you have to come up with other ways of bringing it up that will work with it.”
We are not trying to play God here; we are trying to prove that we are smarter than God by deliberately trying to create a new intelligent life form in a way opposed to our own nature. We are swimming against the current.
“Trying to teach an octopus to be helpful would require a totally different approach, if it’s even possible.”
Imagine trying to program an octopus’s brain to be helpful instead of teaching the octopus. Programming, not controlling.
The actual method of programming an AGI could turn out to be more difficult than creating the AGI itself.
My current position is that I don’t know how to make an AGI, don’t know how to program it, and don’t know how to patch it. All of these appear to be very hard and very important issues. We know that GI is possible, as you rightly note (i.e. us). We know that “programming” and “patching” beings like us is doable to a certain extent, which can certainly give us hope that if we have an AGI like us, then it could be taught to be nice. But all of this assumes that we can make something in our image.
The point of the detached lever story is that the lever only works if it’s connected to exactly the right components, most of which are very complicated. So in the case of AGI, your approach seems sensible, but only if we can recreate the whole complicated mess that is a human brain; otherwise it’ll probably turn out that we missed some small but crucial part, which results in the AGI being more like a chimp than a human. It’s the closest path in the sense that it’s easy to describe in words (i.e. “make a simulated human brain”), but each of those words hides masses and masses of finicky details. This seems to be the main point of contention here. I highly recommend the other articles in that sequence, if you haven’t read them yet. Or even better, read the whole Sequences if you have a spare 6 months :D They explain this a lot better than I can, though they also take a loooot more words to do so.
I agree with most of the rest of what you say, though: programming can certainly turn out to be harder than creation (which is one of the main causes of concern), logging and checking each thought would rule out or indefinitely delay our ability to create AGI (so it won’t work, as someone else will simply skip that part), and a risk must be taken at some point, otherwise why even bother?