If you agree that there will be problems worth working on at some point, then when to start working on them becomes a judgment call about how hard the problems are, which warning sign will leave enough time to solve them, how much better tools and understanding will get in the future (without us working specifically to improve such tools/understanding), and how current resources trade against future resources. If you agree with this, I suggest that another reason for urgency besides foom is a judgment that we’ve already passed the warning signs at which it becomes worthwhile to work on the problems. (There are people such as Paul Christiano who don’t think foom is highly likely and who almost certainly have a good understanding of the tradeoffs you bring up here, but who nevertheless think it’s urgent to work on alignment.) You might disagree with this judgment, but it seems wrong to say “foom scenario is still the main reason for society to be concerned about AI risk now”. (Unless you’re saying something like: according to your own inside view, foom is the best argument for urgency on AI risk. But I’m assuming you’re talking about other people’s motivations?)
By “main reason for concern” I mean best arguments; I’m not trying to categorize people’s motivations.
AGI isn’t remotely close, and I just don’t believe people who think they see signs of that. Yes, for any problem that we’ll eventually want to work on, a few people should work on it now, just so someone is tracking the problem, ready to tell the rest of us if they see signs of it coming soon. But I see people calling for much more than that minimal tracking effort.
Most people who work in research areas call for more relative funding for their areas. So the rest of us just can’t be in the habit of believing such calls. We must hold a higher standard than “people who get $ to work on this say more $ should go to this now.”
AGI isn’t remotely close, and I just don’t believe people who think they see signs of that.
You don’t seem to believe in foom either, but you’re at least willing to mention it as a reason some people give for urgency and even engage in extensive debates about it. I don’t understand how “no foom, but AGI may be close enough that it’s worthwhile to do substantial alignment work now” could be so much less likely in your mind than foom that it’s not even worth mentioning as a reason that some other (seemingly smart and sane) people give for urgency.
Most people who work in research areas call for more relative funding for their areas. So the rest of us just can’t be in the habit of believing such calls.
What do you propose that “the rest of us” do? I guess some of us can try to evaluate the object-level arguments ourselves, but what about those who lack the domain knowledge or even the raw intelligence to do that? (This is not a rhetorical question; I actually don’t know.)
We must hold a higher standard than “people who get $ to work on this say more $ should go to this now.”
I’m pretty sure Paul could make more money by going into some other line of work than AI safety, plus he’s actually spending his own money to fund AI alignment research by others. I personally do not get $ to work on this (except by winning some informal prizes funded by Paul, which come nowhere near covering the value of the time I’ve spent on the topic), and I plan to keep it that way for the foreseeable future. (Of course we’re still fairly likely to be biased for other reasons.)
ETA: it looks like you added this part to your comment after I typed the above:
By “main reason for concern” I mean best arguments; I’m not trying to categorize people’s motivations.
Ok, that was not clear, since you did present a Twitter poll in the same post asking about “motives for AI risk concern”.
Can you point to a good/best argument for the claim that AGI is coming soon enough to justify lots of effort today?
I’m not actually aware of a really good argument for AGI coming soon (i.e., within the next few decades). As far as I can tell, most people use their own intuitions and/or surveys of AI researchers (both of which are of course likely to be biased). My sense is that it’s hard to reason explicitly about AGI timelines (in a way that’s good enough to be more trustworthy than intuitions/surveys), and there seem to be enough people concerned about foom and/or short timelines that funding isn’t a big constraint, so there isn’t much incentive for AI risk people to spend time on making such explicit arguments. (ETA: Although I could well be wrong about this, and there’s a good argument somewhere that I’m not aware of.) To give a sense of how people are thinking about this, I’ll quote a Paul Christiano interview:
I normally think about this question in terms of what’s the probability of some particular development by 10 or 20 years rather than thinking about a median because those seem like the most decision relevant numbers, basically. Maybe one could also, if you had very short timelines give probabilities on less than 10 years. I think that my probability for human labor being obsolete within 10 years is probably something in the ballpark of 15%, and within 20 years is something within the ballpark of 35%. AI would then have, prior to human labor being obsolete, you have some window of maybe a few years during which stuff is already getting quite extremely crazy. Probably AI [risk 01:09:04] becomes a big deal. We can have permanently have sunk the ship like somewhat before, one to two years before, we actually have human labor being obsolete.
Those are my current best guesses. I feel super uncertain about … I have numbers off hand because I’ve been asked before, but I still feel very uncertain about those numbers. I think it’s quite likely they’ll change over the coming year. Not just because new evidence comes in, but also because I continue to reflect on my views. I feel like a lot of people, whose views I think are quite reasonable, who push for numbers both higher and lower, or there are a lot of people making reasonable arguments for numbers both much, like shorter timelines than that and longer timelines than that.
Overall, I come away pretty confused with why people currently are as confident as they are in their views. I think compared to the world at large, the view I’ve described is incredibly aggressive, incredibly soon. I think compared to the community of people who think about this a lot, I’m more somewhere in, I’m still on the middle of the distribution. But amongst people whose thinking I most respect, maybe I’m somewhere in the middle of the distribution. I don’t quite understand why people come away with much higher or much lower numbers than that. I don’t have a good … It seems to me like the arguments people are making on both sides are really quite shaky. I can totally imagine that after doing … After being more thoughtful, I would come away with higher or lower numbers, but I don’t feel convinced that people who are much more confident one way or the other have actually done the kind of analysis that I should defer to them on. That’s said, I also I don’t think I’ve done the kind of analysis that other people should really be deferring to me on.
My own thinking here is that even if AGI comes a century or more from now, the safest alignment approaches seem to require solving a number of hard philosophical problems, which may well take that long to solve even if we start now. It would certainly be pretty hopeless if we only started when we saw a clear 10-year warning. This possibility also justifies looking more deeply into other approaches now, to see if they could potentially be just as safe without solving the hard philosophical problems.
Another thought prompted by your question: given that funding does not seem to be the main constraint on current alignment work (people more often cite “talent”), it’s not likely to be the limiting constraint in the future either, when the warning signs are even clearer. But “resources today can be traded for a lot more resources later” doesn’t seem to apply if we interpret “resources” as “talent”.
We have to imagine that we have some influence over the allocation of something, or there’s nothing to debate here. Call it “resources” or “talent” or whatever, if there’s nothing to move, there’s nothing to discuss.
I’m skeptical solving hard philosophical problems will be of much use here. Once we see the actual form of relevant systems then we can do lots of useful work on concrete variations.
I’d call “human labor being obsolete within 10 years … 15%, and within 20 years … 35%” crazy extreme predictions, and happily bet against them.
If we look at direct economic impact, we’ve had a pretty steady trend for at least a century of jobs displaced by automation, and a continuation of that past trend puts full AGI a long way off. So you need a huge, unprecedented, foom-like lump of innovation to get change that big that soon.
We have to imagine that we have some influence over the allocation of something, or there’s nothing to debate here. Call it “resources” or “talent” or whatever, if there’s nothing to move, there’s nothing to discuss.
Let me rephrase my argument to be clearer. You suggested earlier, “and if resources today can be traded for a lot more resources later, the temptation to wait should be strong.” This advice could be directed at either funders or researchers (or both). It doesn’t seem to make sense for researchers, since they can’t, by not working on AI alignment today, cause more AI alignment researchers to appear in the future. And I think a funder should think, “There will be plenty of funding for AI alignment research in the future when there are clearer warning signs. I could save and invest this money, and spend it in the future on alignment, but it will just be adding to the future pool of funding, and the marginal utility will be pretty low because at the margins, it will be hard to turn money into qualified alignment researchers in the future just as it is hard to do that today.”
So I’m saying this particular reallocation of resources that you suggested does not make sense, but the money/talent could still be reallocated some other way (for example to some other altruistic cause today). Do you have either a counterargument or another suggestion that you think is better than spending on AI alignment today?
I’m skeptical solving hard philosophical problems will be of much use here.
Have you seen my recent posts that argued for or supported this? If not I can link them: Three AI Safety Related Ideas, Two Neglected Problems in Human-AI Safety, Beyond Astronomical Waste, The Argument from Philosophical Difficulty.
Once we see the actual form of relevant systems then we can do lots of useful work on concrete variations.
Sure, but why can’t philosophical work be a complement to that?
I’d call “human labor being obsolete within 10 years … 15%, and within 20 years … 35%” crazy extreme predictions, and happily bet against them.
I won’t defend these numbers because I haven’t put much thought into this topic personally (since my own reasons don’t depend on these numbers, and I doubt that I can do much better than deferring to others). But at what probabilities would you say that substantial work on alignment today would start to be worthwhile (assuming the philosophical difficulty argument doesn’t apply)? What do you think a world where such probabilities are reasonable would look like?