(Update: I’m less optimistic about this than I was when I wrote this comment, but I still think it seems promising.)
Multiplier effects: Delaying timelines by 1 year gives the entire alignment community an extra year to solve the problem.
This is the largest and fastest update I've made from a single sentence for as long as I can remember. I am deeply grateful for learning this, and it's definitely worth Taking Seriously. Hoping to look into it in January unless stuff gets in the way.
Have other people written about this anywhere?
I have one objection to claim 3a, however: Buying-time interventions are plausibly more heavy-tailed than alignment research in some cases because 1) the bottleneck for buying time is social influence and 2) social influence follows a power law due to preferential attachment. Luckily, the traits that make for top alignment researchers have limited (but not insignificant) overlap with the traits that make for top social influencers. So I think top alignment researchers should still not switch in most cases on the margin.
Please check out my writeup from April! https://forum.effectivealtruism.org/posts/juhMehg89FrLX9pTj/a-grand-strategy-to-recruit-ai-capabilities-researchers-into
I would not have made this update by reading your post, and I think the two posts are saying very different things. The thing I updated on from this post wasn’t “let’s try to persuade AI people to do safety instead,” it was the following:
If I am capable of doing an average amount of alignment work x̄ per unit time, and I have n units of time available before the development of transformative AI, I will have contributed x̄·n work. But if I expect to delay transformative AI by m units of time if I focus on it, everyone will have that additional time to do alignment work, which means my impact is x̄·m·p, where p is the number of people doing work. Naively then, if m·p > n, I should be focusing on buying time.[1]
This assumes time-buying and direct alignment work are independent, whereas I expect doing either will help with the other to some extent.
That’s totally fair!
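The back-of-the-envelope comparison above can be sketched numerically. All numbers below are purely illustrative placeholders, not estimates from the thread:

```python
# Illustrative version of the comment's naive model (hypothetical numbers):
#   direct work contributed  = x_bar * n
#   buying-time impact       = x_bar * m * p
x_bar = 1.0   # average alignment work per unit time (arbitrary units)
n = 20        # units of time available before transformative AI
m = 0.5       # units of time I expect to delay TAI by focusing on buying time
p = 300       # number of people doing alignment work

direct_impact = x_bar * n           # my own cumulative output
buying_time_impact = x_bar * m * p  # extra output the whole field gains

# The naive decision rule from the comment: buy time iff m * p > n
decision = "focus on buying time" if m * p > n else "do direct work"
print(decision)
```

With these placeholder numbers, m·p = 150 far exceeds n = 20, so the rule favors buying time; note the rule ignores the interaction effects raised in the reply above.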
The part of my post I meant to highlight was the last sentence: “To put it bluntly, we should—on all fronts—scale up efforts to recruit talented AI capabilities researchers into AI safety research, in order to slow down the former in comparison to the latter.”
Perhaps I should have made this point front-and-center.
No, this isn’t the same. If you wish, you could try to restate what I think the main point of this post is, and I could say if I think that’s accurate. At the moment, it seems to me like you’re misunderstanding what this post is saying.
I think the point of Thomas, Akash, and Olivia’s post is that more people should focus on buying time, because solving the AI safety/alignment problem before capabilities increase to the point of AGI is important, and right now the latter is progressing much faster than the former.
See the first two paragraphs of my post, although I could have made its point and the implicit modeling assumptions clearer:
“AI capabilities research seems to be substantially outpacing AI safety research. It is most likely true that successfully solving the AI alignment problem before the successful development of AGI is critical for the continued survival and thriving of humanity.
Assuming that AI capabilities research continues to outpace AI safety research, the former will eventually result in the most negative externality in history: a significant risk of human extinction. Despite this, a free-rider problem causes AI capabilities research to myopically push forward, both because of market competition and great power competition (e.g., U.S. and China). AI capabilities research is thus analogous to the societal production and usage of fossil fuels, and AI safety research is analogous to green-energy research. We want to scale up and accelerate green-energy research as soon as possible, so that we can halt the negative externalities of fossil fuel use.”
If the “multiplier effects” framing helped you update, then that’s really great! (I also found this framing helpful when I wrote it this summer at SERI MATS, in the Alignment Game Tree group exercise for John Wentworth’s stream.)
I do think that in order for the “multiplier effects” explanation to hold, the intervention needs to slow down capabilities research relative to safety research. That relative slowdown, achieved as efficiently as possible, is the core phenomenon that makes the proposed action optimal.
That’s fair, but sorry[1] I misstated my intended question. I meant that I was under the impression that you didn’t understand the argument, not that you didn’t understand the action they advocated for.
I understand that your post and this post argue for actions that are similar in effect. And your post is definitely relevant to the question I asked in my first comment, so I appreciate you linking it.
Actually sorry. Asking someone a question that you don’t expect yourself or the person to benefit from is not nice, even if it was just due to careless phrasing. I just wasted your time.