Hot take, though increasingly moving towards lukewarm: if you want to get a pause/international coordination on powerful AI (which would probably be net good, though likely it would strongly depend on implementation details), arguments about risks from destabilization/power dynamics and potential conflicts between various actors are probably both more legible and ‘truer’ than arguments about technical intent misalignment and loss of control (especially for not-wildly-superhuman AI).
Say more?
I think the general impression of people on LW is that multipolar scenarios and concerns over “which monkey finds the radioactive banana and drags it home” are in large part a driver of AI racing rather than a potential impediment/solution to it. Individuals, companies, and nation-states justifiably believe that whichever one of them accesses potentially superhuman AGI first will have the capacity to flip the gameboard at will, obtain power over the entire rest of the Earth, and destabilize the currently-existing system. Standard game theory explains the final inferential step for how this leads to full-on racing (see the recent U.S.-China Commission’s report for a representative example of how this plays out in practice).
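The “standard game theory” step here is essentially a dominance argument of the prisoner’s-dilemma kind. A minimal sketch, with purely hypothetical payoff numbers (the actions and values are illustrative assumptions, not anything from the discussion):

```python
# Toy symmetric two-player game illustrating the racing dynamic: if each
# actor believes being first to AGI confers a decisive advantage, Race
# strictly dominates Pause, so (Race, Race) is the unique Nash equilibrium
# even though mutual Pause is better for both. Payoff numbers are made up.
from itertools import product

ACTIONS = ["pause", "race"]

# payoffs[(my_action, their_action)] -> (my_payoff, their_payoff)
payoffs = {
    ("pause", "pause"): (3, 3),  # coordinated caution: safe, shared benefits
    ("pause", "race"):  (0, 4),  # the other side gets a decisive advantage
    ("race",  "pause"): (4, 0),
    ("race",  "race"):  (1, 1),  # both race: heightened risk for everyone
}

def is_nash(a, b):
    """(a, b) is a Nash equilibrium iff neither player gains by deviating alone."""
    my, their = payoffs[(a, b)]
    a_cannot_improve = all(payoffs[(a2, b)][0] <= my for a2 in ACTIONS)
    b_cannot_improve = all(payoffs[(a, b2)][1] <= their for b2 in ACTIONS)
    return a_cannot_improve and b_cannot_improve

equilibria = [(a, b) for a, b in product(ACTIONS, ACTIONS) if is_nash(a, b)]
print(equilibria)  # [('race', 'race')]
```

Under these assumed payoffs, mutual racing is the only stable outcome despite mutual pausing being Pareto-better, which is why unilateral “steps away from Nash equilibria” are unstable without some enforcement or verification mechanism.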
I get that we’d all like to recognize this problem and coordinate globally on finding solutions, by “mak[ing] coordinated steps away from Nash equilibria in lockstep”. But I would first need to see an example, a prototype, of how this can play out in practice on an important and highly salient issue. Stuff like the Montreal Protocol banning CFCs doesn’t count, because that ban only happened once comparably profitable/efficient alternatives had already been designed; that’s totally disanalogous to the spot we are in right now, where AGI will likely be incredibly economically profitable, perhaps orders of magnitude more so than the second-best alternative.
This is in large part why Eliezer often used to challenge readers and community members to ban gain-of-function research, as a trial run of sorts for how global coordination on pausing/slowing AI might go.
At the risk of being overly spicy/unnuanced/uncharitable: I think quite a few MIRI [agent foundations] memes (“which monkey finds the radioactive banana and drags it home”, “automating safety is like having the AI do your homework”, etc.) seem very lazy/un-truth-tracking and probably net-negative at this point, and I kind of wish they’d just stop propagating them (Eliezer being probably the main culprit here).
Perhaps even more spicily, I similarly think that the old MIRI threat model of Consequentialism is looking increasingly ‘tired’/un-truth-tracking, and there should be more updating away from it (and more so with every single increase in capabilities without ‘proportional’ increases in ‘Consequentialism’/egregious misalignment).
Especially in a world where the first AGIs are not egregiously misaligned, it very likely matters enormously who builds the first AGIs and what they decide to do with them. While this probably creates incentives towards racing in some actors (probably especially the ones with the best chances to lead the race), I suspect better informing more actors (especially the non-leading ones, who might see themselves as more likely to end up on the losing side in the case of AGI and potential destabilization) should also create incentives for (attempts at) more caution and coordination, which the leading actors might at least somewhat take into consideration, e.g. for reasons along the lines of https://aiprospects.substack.com/p/paretotopian-goal-alignment.
I’m not particularly optimistic about coordination, especially the more ambitious kinds of plans (e.g. ‘shut it all down’, long pauses like in ‘A narrow path...’, etc.), and that’s to a large degree (combined with short timelines and personal fit) why I’m focused on automated safety research. I’m just saying: ‘if you feel like coordination is the best plan you can come up with/the one you’re most optimistic about, there are probably more legible and likely also more truth-tracking arguments than superintelligence misalignment and loss of control’.
This seems quite reasonable; might be too late as a ‘trial run’ at this point though, if taken literally.
(Also, what Thane Ruthenis commented below.)
I’d say the big factor that makes AI controllable right now is that the compute necessary to build AI that can do very good AI research (to automate R&D, and then the economy) is locked behind TSMC, Nvidia, and ASML, whose processes are both nearly irreplaceable and very expensive to recreate, so it’s way easier to intervene on the chokepoints of AI development than on gain-of-function research.
I agree, but I think this is slightly beside the original points I wanted to make.
Agreed. I think a type of “stop AGI research” argument that’s under-deployed is that there’s no process or actor in the world that society would trust with unilateral godlike power. At large, people don’t trust their own governments, don’t trust foreign governments, don’t trust international organizations, and don’t trust corporations or their CEOs. Therefore, preventing anyone from building ASI anywhere is the only thing we can all agree on.
I expect this would be much more effective messaging with some demographics, compared to even very down-to-earth arguments about loss of control. For one, it doesn’t need to dismiss the very legitimate fear that the AGI would be aligned to values that a given person would consider monstrous. (Unlike “stop thinking about it, we can’t align it to any values!”.)
And it is, of course, true.
What kinds of conflicts are you envisioning?
If the argument is something along the lines of “maybe at some point other countries will demand that the US stop AI progress”, then from the perspective of the USG, I think it’s sensible to operate under the mindset of “OK, so we need to advance AI progress as much as possible and try to hide some of it, and if at some future time other countries threaten us, we’ll need to figure out how to respond.” But I don’t think it justifies anything like “we should pause or start initiating international agreements.”
(Separately, whether or not it’s “truer” depends a lot on one’s models of AGI development. Most notably: (a) how likely is misalignment, (b) how slow will takeoff be/will it be very obvious to other nations that super advanced AI is about to be developed, and (c) how will governments and bureaucracies react, and will they be able to react quickly enough.)
(Also separately: I do think more people should be thinking about how these international dynamics might play out & if there’s anything we can be doing to prepare for them. I just don’t think they naturally lead to an “oh, so we should be internationally coordinating” mentality, and instead lead to much more of a “we can do whatever we want unless/until other countries get mad at us & we should probably do things more secretly” mentality.)
I’m envisioning something like: scary powerful capabilities/demos/accidents lead a coalition of other countries to ask the US (and/or China) not to build any additional/larger data centers (and/or run any larger training runs) and, if they’re scared enough, potentially even to threaten various escalatory measures, including economic sanctions, blockading the supply of compute/prerequisites to compute, sabotage, direct military strikes on the data centers, etc.
I’m far from an expert on the topic, but I suspect it might not be trivial to hide at least building a lot more new data centers/supplying a lot more compute, if a significant chunk of the rest of the world was watching very intently.
I’m envisioning a very near-casted scenario, on very short (e.g. Daniel Kokotajlo-cluster) timelines, egregious misalignment quite unlikely but not impossible, slow-ish (couple of years) takeoff (by default, if no deliberate pause), pretty multipolar, but with more-obviously-close-to-scary capabilities, like ML R&D automation evals starting to fall.
Thanks for spelling it out. I agree that more people should think about these scenarios. I could see something like this triggering central international coordination (or conflict).
(I still don’t think this would trigger the USG to take different actions in the near-term, except perhaps “try to be more secret about AGI development” and maybe “commission someone to do some sort of study or analysis on how we would handle these kinds of dynamics & what sorts of international proposals would advance US interests while preventing major conflict.” The second thing is a bit optimistic but maybe plausible.)