A thing I am confused about: what is the medium-to-long-term actual policy outcome you’re aiming for? And what is the hopeful outcome which that policy unlocks?
You say “implement international AI compute governance frameworks and controls sufficient for halting the development of any dangerous AI development activity, and streamlined functional processes for doing so”. The picture that brings to my mind is something like:
Track all compute centers large enough for very high-flop training runs
Put access controls in place for such high-flop runs
A prototypical “AI pause” policy in this vein would be something like “no new training runs larger than the previous largest run”.
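As a minimal sketch of what such a rule could look like mechanically: the cap, the parameter/token counts, and the 6*N*D FLOP estimate below are purely illustrative assumptions, not figures from the post.

```python
# Minimal sketch of the kind of rule a "no run larger than the previous largest
# run" policy might codify. The cap, the example model sizes, and the 6*N*D
# FLOP estimate are illustrative assumptions, not figures from the post.

PREVIOUS_LARGEST_RUN_FLOP = 1e26  # hypothetical: training FLOPs of the prior record run

def estimated_training_flop(params: float, tokens: float) -> float:
    """Rough dense-transformer training cost: ~6 FLOPs per parameter per token."""
    return 6.0 * params * tokens

def run_is_permitted(params: float, tokens: float) -> bool:
    """A proposed run clears the pause rule only if it stays at or under the prior record."""
    return estimated_training_flop(params, tokens) <= PREVIOUS_LARGEST_RUN_FLOP

# Example: a 1e12-parameter model trained on 3e13 tokens is ~1.8e26 FLOPs -> rejected.
print(run_is_permitted(params=1e12, tokens=3e13))  # False
```

The hard part of compute governance is of course who runs this check and how it is verified; the sketch only shows how simple the rule itself would be.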
Now, the obvious-to-me shortcoming of that approach is that algorithmic improvement is moving at least as fast as scaling, a fact which I doubt Eliezer or Nate have overlooked. Insofar as that algorithmic improvement is itself compute-dependent, it’s mostly dependent on small test runs rather than big training runs, so a pause-style policy would slow down the algorithmic component of AI progress basically not at all. So whatever your timelines look like, even a full pause on training runs larger than the current record should less than double our time.
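For concreteness, here is the arithmetic behind “less than double”, under a toy model assumed here for illustration (effective compute as the product of physical compute and algorithmic efficiency, each growing exponentially; the comment does not spell out this model).

```latex
% Toy model (assumed for illustration): effective compute is the product of
% physical compute, growing at rate g_c, and algorithmic efficiency, growing
% at rate g_a. Freezing physical compute at most doubles the time to reach
% any fixed threshold C^* as long as g_a >= g_c.
\[
  C_{\mathrm{eff}}(t) = C_{\mathrm{phys}}(t)\,A(t), \qquad
  t^{*} = \frac{\ln\bigl(C^{*}/C_{\mathrm{eff}}(0)\bigr)}{g_c + g_a}, \qquad
  t^{*}_{\mathrm{pause}} = \frac{\ln\bigl(C^{*}/C_{\mathrm{eff}}(0)\bigr)}{g_a},
\]
\[
  \frac{t^{*}_{\mathrm{pause}}}{t^{*}} \;=\; \frac{g_c + g_a}{g_a} \;\le\; 2
  \quad \text{whenever } g_a \ge g_c .
\]
```

On this toy model, a hard cap at the current record stretches timelines by at most a factor of two, which is exactly the “less than double” point.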
… and that still makes implementation of a pause-style policy a very worthwhile thing for a lot of people to work on, but I’m somewhat confused that Eliezer and Nate specifically currently see that as their best option? Where is the hope here? What are they hoping happens with twice as much time, which would not happen with one times as much time? Or is there some other policy target (including e.g. “someone else figures out a better policy”) which would somehow buy a lot more time?
I don’t speak for Nate or Eliezer in this reply; where I speak about Eliezer I am of course describing my model of him, which may be flawed.
Three somewhat disjoint answers:
From my perspective, your point about algorithmic improvement only underlines the importance of having powerful people actually understand what the problem is and hold accurate working models. If that happens, the specific policy measures have some chance of adapting to conditions as they change, or of being written in an adaptive manner in the first place.
Eliezer said a few years ago, “I consider the present gameboard to look incredibly grim,” and while he has more hope now than he had then about potential political solutions, it is not the case (as I understand it) that he now feels hopeful that these solutions will work. Our policy proposals are an incredible longshot.
One thing we can hope for, if we get a little more time rather than a lot more time, is that we might get various forms of human cognitive enhancement working, and these smarter humans can make more rapid progress on AI alignment.
It seems like including this in the strategy statement is crucial to communicating that strategy clearly (at least to people who understand enough of the background). A long-shot strategy looks very different from one where you expect to achieve at least useful parts of your goals.
A reasonable point, thank you. We said it pretty clearly in the MIRI strategy post in January, and I linked to that post here, but perhaps I should have reiterated it.
For clarity: we mostly just expect to die. But while we can see viable paths forward at all, we’ll keep trying not to.
Has MIRI considered supporting work on human cognitive enhancement, e.g. Foresight’s work on WBE (whole brain emulation)?
The following changes, implemented in the US, Europe, and East Asia, would probably buy us many decades:
Close all the AI labs and return their assets to their shareholders;
Require all “experts” (e.g., researchers, instructors) in AI to leave their jobs; give them money to compensate them for their temporary loss of earnings power;
Make it illegal to communicate technical knowledge about machine learning or AI; this includes publishing papers, engaging in informal conversations, tutoring, and talking about it in a classroom; even distributing already-published titles on the subject would be banned.
Of course it is impractical to completely stop these activities (especially the distribution of already-published titles), but we do not have to completely stop them; we need only sufficiently reduce the rate at which the AI community worldwide produces algorithmic improvements. Here we are helped by the fact that figuring out how to create an AI capable of killing us all is probably still a very hard research problem.
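A back-of-envelope version of that claim, using the same toy growth model as above and entirely assumed numbers (the doubling rates, remaining capability gap, and slowdown factors are illustrative, not estimates from the comment):

```python
# Back-of-envelope: if compute scale-up is halted AND the worldwide rate of
# algorithmic improvement is also cut by some factor, the time to a fixed
# effective-compute threshold stretches accordingly. All numbers are assumptions.

g_scale = 1.0   # assumed doublings/year from hardware and spending scale-up
g_algo = 1.0    # assumed doublings/year from algorithmic improvement
gap = 5.0       # assumed doublings of effective compute still needed

baseline_years = gap / (g_scale + g_algo)  # both trends running: ~2.5 years here
for slowdown in (2, 5, 10):
    years_remaining = gap / (g_algo / slowdown)  # scale-up halted, research slowed
    print(f"{slowdown}x research slowdown: {baseline_years:.1f} -> {years_remaining:.0f} years")
# A 10x slowdown turns ~2.5 years into ~50 on these numbers -- "many decades"
# only if the suppression is that effective and the remaining gap is that large.
```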
What is most dangerous about the current situation is the tens of thousands of researchers worldwide, with tens of billions in funding, who feel perfectly free to communicate and collaborate with each other and who expect that they will be praised and rewarded for increasing our society’s ability to create powerful AIs. If instead they come to expect more criticism than praise and more punishment than reward, most of them will stop, and, more importantly, almost no young person is going to put in the years of hard work needed to become an AI researcher.
I know how awful this sounds to many of the people reading this, including the person I am replying to, but you did ask, “Is there some other policy target which would somehow buy a lot more time?”
>I know how awful this sounds to many of the people reading this, including the person I am replying to...
I actually find this kind of thinking quite useful. I mean, the particular policies proposed are probably Pareto-suboptimal, but there’s a sound method in which we first ask “what policies would buy a lot more time?”, allowing for pretty bad policies as a first pass, and then think through how to achieve the same subgoals in more palatable ways.
>I actually find this kind of thinking quite useful
I’m glad.
If there’s a legal ceiling on AI capabilities, that reduces the short-term economic incentive to improve algorithms. If improving algorithms gets you categorised as uncool at parties, that might also reduce the short-term incentive to improve algorithms.
It is thus somewhat plausible to me that an enforced legal limit on AI capabilities, backed by high-status, cool-party-attending public opinion, would slow down algorithmic progress significantly.