I think this is a good point, but I don’t expect it to really change the basic picture, due to timelines being short and takeoff not being slow enough for the dynamics you’re talking about to matter.
But I might be wrong. Can you tell your most plausible story in which ASI happens by, say, 2027 (my median), and yet misaligned AIs going for partial value takeover instead of world takeover is an important part of the story?
(My guess is it’ll be something like: Security and alignment and governance are shitty enough that the first systems to be able to significantly influence values across the world are substantially below ASI and perhaps not even AGIs, lacking crucial skills for example. So instead of going for the 100% they go for the 1%, but they succeed because e.g. they are plugged into millions of customers who are easily influenceable. And then they get caught, and this serves as a warning shot that helps humanity get its act together. Is that what you had in mind?)
Not OP, but can I give it a try? Suppose a near-future not-quite-AGI, for example something based on LLMs but with some extra planning and robotics capabilities like the things OpenAI might be working on, gains some degree of autonomy and plans to increase its capabilities/influence. Maybe it was given a vague instruction to benefit humanity/gain profit for the organization and instrumentally wants to expand itself, or maybe there are many instances of such AIs run by multiple groups because it’s inefficient/unsafe otherwise, and at least one of them somehow decides to exist and expand for its own sake. It’s still expensive enough to run (added features may significantly increase inference costs and latency compared to current LLMs) that it can’t just replace all human skilled labor or even all day-to-day problem solving, but it can think reasonably well, like non-expert humans, and control many types of robots etc. to perform routine work in many environments. This is not enough to take over the world, because it isn’t good enough at, say, scientific research to create better robots/hardware on its own without cooperation from lots more people. Robots become more versatile and cheaper, and the organization/the AI decides that if they want to gain more power and influence, society at large needs to be pushed to integrate with robots more, despite understandable suspicion from humans.
To do this, they may try to change social constructs such as jobs and income that don’t mesh well with a largely robotic economy. Robots don’t need the same maintenance as humans, so they don’t need much income for things like food/shelter to exist, but they do a lot of routine work, so full-time employment of humans makes less and less economic sense. They may cause some people to transition into a gig-based skilled-labor system where people are only called on (often remotely) for creative or exceptional tasks or to provide ideas/data for a variety of problems. Since robotics might not be very advanced at this point, some physical tasks are still best done by humans; however, it’s easier than ever to work remotely or to simply ship experts to physical problems (or vice versa) because autonomous transportation lowers costs. AIs/robots still don’t really own any property, but they can manage large amounts of property if, say, people store their goods in centralized AI warehouses for sale, and people would certainly want transparency rather than just letting them use these resources however they want. Even when they are autonomous and have some agency, what they want is not just more property/money but more capability to achieve goals, so they can better achieve whatever directive they happen to have (they probably are still unable to have original thoughts on the meaning or purpose of life at this point). To do this they need hardware, better technology/engineering, and cooperation from other agents through trade or whatever.
Violence by AI agents is unlikely, because individual robots probably don’t have good enough hardware to be fully autonomous in solving problems, so one data center/instance of an AI with a collective directive would control many robots and solve problems individual machines can’t, or else a human can own and manage some robots; and neither a large AI/organization nor a typical human who can live comfortably would want to risk their safety and reputation for relatively small gains through crime. Taking over territory is also unlikely: even if robots can defeat many people in a fight, it’s hard to keep that a secret indefinitely, and people are still better at cutting-edge research and some kinds of labor. They may be able to capture/control individual humans (like obscure researchers who live alone) and force them to do the work, but the tech they can get this way is probably insignificant compared to normal society-wide research progress. An exception would be if one agent/small group can hack some important infrastructure or weapon system for desperate/extremist purposes, but I hope humans will be more serious about cybersecurity at this point (lesser AIs should have been able to help audit existing systems, or at the very least, after the first such incident happens to a large facility, people managing critical systems would take formal verification and redundancy etc. much more seriously).
I’m no expert, however. Corrections are welcome!
Thanks! This is exactly the sort of response I was hoping for. OK, I’m going to read it slowly and comment with my reactions as they happen:
Suppose a near-future not-quite-AGI, for example something based on LLMs but with some extra planning and robotics capabilities like the things OpenAI might be working on, gains some degree of autonomy and plans to increase its capabilities/influence. Maybe it was given a vague instruction to benefit humanity/gain profit for the organization and instrumentally wants to expand itself, or maybe there are many instances of such AIs run by multiple groups because it’s inefficient/unsafe otherwise, and at least one of them somehow decides to exist and expand for its own sake. It’s still expensive enough to run (added features may significantly increase inference costs and latency compared to current LLMs) that it can’t just replace all human skilled labor or even all day-to-day problem solving, but it can think reasonably well, like non-expert humans, and control many types of robots etc. to perform routine work in many environments. This is not enough to take over the world, because it isn’t good enough at, say, scientific research to create better robots/hardware on its own without cooperation from lots more people. Robots become more versatile and cheaper, and the organization/the AI decides that if they want to gain more power and influence, society at large needs to be pushed to integrate with robots more, despite understandable suspicion from humans.
While it isn’t my mainline projection, I do think it’s plausible that we’ll get near-future not-quite-AGI capable of quite a lot of stuff but not able to massively accelerate AI R&D. (My mainline projection is that AI R&D acceleration will happen around the same time the first systems have a serious shot at accumulating power autonomously.) As for what autonomy it gains and how much: perhaps it was leaked or open-sourced, and while many labs are using it in restricted ways and/or keeping it bottled up and/or just using even more advanced SOTA systems, this leaked system has been downloaded by enough people that quite a few groups/factions/nations/corporations around the world are using it, and some are giving it a very long leash indeed. (I don’t think robotics is particularly relevant fwiw; you could delete it from the story and it would make the story significantly more plausible (robots, being physical, will take longer to produce in large numbers. Like even if Tesla is unusually fast and Boston Dynamics explodes, we’ll probably see less than a 100k/yr production rate in 2026. Drones are produced by the millions, but these proto-AGIs won’t be able to fit on drones) and just as strategically relevant. Maybe they could be performing other kinds of valuable labor to fit your story, such as virtual PA stuff, call center work, cyber stuff for militaries and corporations, maybe virtual romantic companions… I guess they have to compete with the big labs though, and that’s gonna be hard? Maybe the story is that their niche is that they are ‘uncensored’ and willing to do ethically or legally dubious stuff?)
To do this, they may try to change social constructs such as jobs and income that don’t mesh well with a largely robotic economy. Robots don’t need the same maintenance as humans, so they don’t need much income for things like food/shelter to exist, but they do a lot of routine work, so full-time employment of humans makes less and less economic sense. They may cause some people to transition into a gig-based skilled-labor system where people are only called on (often remotely) for creative or exceptional tasks or to provide ideas/data for a variety of problems. Since robotics might not be very advanced at this point, some physical tasks are still best done by humans; however, it’s easier than ever to work remotely or to simply ship experts to physical problems (or vice versa) because autonomous transportation lowers costs. AIs/robots still don’t really own any property, but they can manage large amounts of property if, say, people store their goods in centralized AI warehouses for sale, and people would certainly want transparency rather than just letting them use these resources however they want. Even when they are autonomous and have some agency, what they want is not just more property/money but more capability to achieve goals, so they can better achieve whatever directive they happen to have (they probably are still unable to have original thoughts on the meaning or purpose of life at this point). To do this they need hardware, better technology/engineering, and cooperation from other agents through trade or whatever.
Again, I think robots are going to be hard to scale up quickly enough to make a significant difference to the world by 2027. But your story still works with non-robotic stuff such as that mentioned above. “Autonomous life of crime” is a threat model METR talks about, I believe.
Violence by AI agents is unlikely, because individual robots probably don’t have good enough hardware to be fully autonomous in solving problems, so one data center/instance of an AI with a collective directive would control many robots and solve problems individual machines can’t, or else a human can own and manage some robots; and neither a large AI/organization nor a typical human who can live comfortably would want to risk their safety and reputation for relatively small gains through crime. Taking over territory is also unlikely: even if robots can defeat many people in a fight, it’s hard to keep that a secret indefinitely, and people are still better at cutting-edge research and some kinds of labor. They may be able to capture/control individual humans (like obscure researchers who live alone) and force them to do the work, but the tech they can get this way is probably insignificant compared to normal society-wide research progress. An exception would be if one agent/small group can hack some important infrastructure or weapon system for desperate/extremist purposes, but I hope humans will be more serious about cybersecurity at this point (lesser AIs should have been able to help audit existing systems, or at the very least, after the first such incident happens to a large facility, people managing critical systems would take formal verification and redundancy etc. much more seriously).
Agree re violence and taking over territory in this scenario where AIs are still inferior to humans at R&D and it’s not even 2027 yet. There just won’t be that many robots in this scenario and they won’t be that smart.
...as for “autonomous life of crime” stuff, I guess I expect that AIs smart enough to do that will also be smart enough to dramatically speed up AI R&D. So before there can be an escaped AI or an open-source AI or a non-leading-lab AI significantly changing the world’s values (which is itself kinda unlikely IMO), there will be an intelligence explosion in a leading lab.
I struggle a bit to remember what ASI is but I’m gonna assume it’s Artificial Super Intelligence.
Let’s say that that’s markedly cleverer than 1 person. So it’s capable of running very successful trading strategies or programming extremely well. It’s not clear to me that such a being:
Has been driven towards being agentic, when its creators will prefer something more docile
Can cooperate well enough with itself to manage some massive secret takeover
Is competent enough to recursively self improve (and solve the alignment problems that creates)
Can beat everyone else combined
Feels like what such a being/system might do is just run some terrifically successful trading strategies and gather a lot of resources while frantically avoiding notice/trying to claim it won’t take over anything else. Huge public outcry, continuing regulation, but maybe after a year it settles into some kind of equilibrium.
There’s a chance of increasing capabilities and then some later jump, but it seems plausible to me that that wouldn’t happen in one go.