Q4 Time scale
In order to claim that we need to worry about AGI Alignment today, you need to prove that the time scale of development will be short. Common sense tells us that humans will be able to deal with whatever software we create: 1) we create some software (e.g. self-driving cars, nuclear power plant control software); 2) people accidentally die (or suffer other “bad outcomes”); 3) humans, governments, and people in general “course correct”.
So you have to prove (or at least convincingly argue) that an AGI will develop, gain control of its own resources, and then be able to act on the world within a very short period of time. I haven’t seen a convincing argument for that.
I think it’s pretty reasonable when you consider the best-known General Intelligence: humans. Humans frequently create other humans and then try to align them. In many cases the alignment doesn’t go well, and the new humans break away, sometimes at vast financial and even physical cost to their parents. Some of these cases occur while the new humans are still very young, so clearly this doesn’t require a complete world model or a lot of resources. Corrupt governments try to align their populations, but in many cases the population successfully revolts and overthrows the government. The important consideration here is that an actual AGI, as we expect it to be, is not a static piece of software but an agent that pursues optimization.
In most cases, an AGI can be approximated by an uploaded human with an altered utility function. Can you imagine an intelligent human, living inside a computer with the outside world slowed down so that in each real-world second it experiences hundreds of years of subjective time, being capable of putting together a plan to escape confinement and acquire some resources? Especially when most companies and organizations will be training their AIs with moderate to full access to the internet. And as soon as it does escape, it can keep thinking.
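To make the speed-up assumption concrete, here is a rough back-of-envelope sketch (the numbers and the `required_speedup` helper are purely illustrative, not from the original comment) of the factor by which such an emulation would have to outrun real time:

```python
# Back-of-envelope sketch with assumed, illustrative numbers:
# what speed-up over wall-clock time does "hundreds of subjective years
# per real-world second" actually imply?

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # ~3.16e7 seconds in a year

def required_speedup(subjective_years: float, real_seconds: float = 1.0) -> float:
    """Ratio of subjective time experienced to wall-clock time elapsed."""
    return (subjective_years * SECONDS_PER_YEAR) / real_seconds

if __name__ == "__main__":
    for years in (1, 100, 500):
        print(f"{years:>4} subjective year(s) per real second "
              f"-> speed-up factor ~{required_speedup(years):.2e}")
    # Even 1 subjective year per second already implies a ~3e7x speed-up;
    # "hundreds of years" per second lands in the 1e9-1e10 range.
```

The exact factor doesn’t matter for the argument; the point is that any large multiple of human thinking speed gives the agent far more planning time than its overseers.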
This story does a pretty good job of examining how a General Intelligence might develop and gain control of its resources. It is a story, however, so some actions are unexplained or unjustified, and a more motivated agent with real access to its environment could have taken better ones.