That post says “We plan to say more in the future about the criteria for strategically adequate projects in 7a” and also “A number of the points above require further explanation and motivation, and we’ll be providing more details on our view of the strategic landscape in the near future”. As far as I can tell, MIRI hasn’t published any further explanation of this strategic plan (I expected there to be something in the 2018 update but that post talks about other things). Is MIRI still planning to say more about its strategic plan in the near future, and if so, is there a concrete timeframe (e.g. “in a few months”, “in a year”, “in two years”) for publishing such an explanation?
Oops, I saw your question when you first posted it but forgot to get back to you, Issa. (Issa re-asked here.) My apologies.
I think there are two main kinds of strategic thought we had in mind when we said “details forthcoming”:
1. Thoughts on MIRI’s organizational plans, deconfusion research, and the role we think MIRI can play in improving the future; this is covered by our November 2018 update post, https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/.
2. High-level thoughts on things like “what we think AGI developers probably need to do” and “what we think the world probably needs to do” to successfully navigate the acute risk period.
Most of the stuff discussed in “strategic background” is about 2: not MIRI’s organizational plan, but our model of some of the things humanity likely needs to do in order for the long-run future to go well. Some of these topics are reasonably sensitive, and we’ve gone back and forth about how best to talk about them.
Within the macrostrategy / “high-level thoughts” part of the post, the densest part was maybe 7a. The criteria we listed for a strategically adequate AGI project were “strong opsec, research closure, trustworthy command, a commitment to the common good, security mindset, requisite resource levels, and heavy prioritization of alignment work”.
With most of these it’s reasonably clear what’s meant in broad strokes, though there’s a lot more I’d like to say about the specifics. “Trustworthy command” and “a commitment to the common good” are maybe the most opaque. By “trustworthy command” we meant things like:
- The organization’s entire command structure is fully aware of the difficulty and danger of alignment.
- Non-technical leadership can’t interfere and won’t object if technical leadership needs to delete a code base or abort the project.
By “a commitment to the common good” we meant a commitment to both short-term goodness (the immediate welfare of present-day Earth) and long-term goodness (the achievement of transhumanist astronomical goods), paired with a real commitment to moral humility: not rushing ahead to implement every idea that sounds good to the people running the project.
We still plan to produce more long-form macrostrategy exposition, but given how many times we’ve failed to put our thoughts into words we felt comfortable publishing, and given how much else we’re juggling, I don’t currently expect us to have any big macrostrategy posts in the next 6 months. (That said, I don’t plan to give up on trying to get more of our thoughts out sooner than that, if possible. We’ll see.)
Thanks! I have some remaining questions:
The post says “On our current view of the technological landscape, there are a number of plausible future technologies that could be leveraged to end the acute risk period.” I’m wondering what these other plausible future technologies are. (I’m guessing things like whole brain emulation and intelligence enhancement count, but are there any others?)
One of the footnotes says “There are other paths to good outcomes that we view as lower-probability, but still sufficiently high-probability that the global community should allocate marginal resources to their pursuit.” What do some of these other paths look like?
I’m confused about the differences between “minimal aligned AGI” and “task AGI”. (As far as I know, this post is the only place MIRI has used the term “minimal aligned AGI”, so I have very little to go on.) Is “minimal aligned AGI” the larger class, and “task AGI” the specific kind of minimal aligned AGI that MIRI has decided is most promising? Or is the plan to first build a minimal aligned AGI, which then builds a task AGI, which then performs a pivotal task/helps build a Sovereign?
If the latter, then it seems like MIRI has gone from a one-step view (“build a Sovereign”), to a two-step view (“build a task-directed AGI first, then go for Sovereign”), to a three-step view (“build a minimal aligned AGI, then task AGI, then Sovereign”). I’m not sure why “three” is the right number of stages (why not two or four?), and I don’t think MIRI has explained this. In fact, I don’t think MIRI has even explained why it switched to the two-step view in the first place. (Wei Dai made this point here.)