We have to imagine that we have some influence over the allocation of something, or there’s nothing to debate here. Call it “resources” or “talent” or whatever; if there’s nothing to move, there’s nothing to discuss.
Let me rephrase my argument to be clearer. You suggested earlier, “and if resources today can be traded for a lot more resources later, the temptation to wait should be strong.” This advice could be directed at either funders or researchers (or both). It doesn’t seem to make sense for researchers, since they can’t, by not working on AI alignment today, cause more AI alignment researchers to appear in the future. And I think a funder should think, “There will be plenty of funding for AI alignment research in the future when there are clearer warning signs. I could save and invest this money, and spend it in the future on alignment, but it will just be adding to the future pool of funding, and the marginal utility will be pretty low, because at the margin it will be just as hard to turn money into qualified alignment researchers in the future as it is today.”
So I’m saying this particular reallocation of resources that you suggested does not make sense, but the money/talent could still be reallocated some other way (for example to some other altruistic cause today). Do you have either a counterargument or another suggestion that you think is better than spending on AI alignment today?
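A minimal sketch of this spend-now versus invest-and-spend-later comparison (every number below is a hypothetical placeholder, and the log-utility assumption is only a stand-in for diminishing returns; none of it is an estimate from either side of this exchange):

```python
import math

# Hypothetical placeholder figures; none of these come from the discussion above.
current_pool = 10.0   # alignment funding available today ($M)
future_pool = 200.0   # funding expected later, once there are clear warning signs ($M)
grant = 5.0           # the marginal grant a funder is allocating ($M)
growth = 4.0          # factor the grant grows by if invested instead of spent now

def utility(funding):
    # Log utility stands in for diminishing returns: each extra dollar
    # buys fewer additional qualified researchers.
    return math.log(funding)

# Marginal value of spending the grant today, when the field's funding pool is small.
spend_now = utility(current_pool + grant) - utility(current_pool)

# Marginal value of investing it and adding the grown grant to a much larger future pool.
spend_later = utility(future_pool + grant * growth) - utility(future_pool)

print(f"marginal value if spent now:   {spend_now:.2f}")   # ~0.41
print(f"marginal value if spent later: {spend_later:.2f}")  # ~0.10
```

Under these assumptions the grant does more at today’s margin even after quadrupling, which is the “low marginal utility of adding to the future pool” point; with a flatter utility curve or a larger growth factor the comparison can flip, and that is exactly what the disagreement turns on.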
I’m skeptical solving hard philosophical problems will be of much use here.
Once we see the actual form of the relevant systems, then we can do lots of useful work on concrete variations.
Sure, but why can’t philosophical work be a complement to that?
I’d call “human labor being obsolete within 10 years … 15%, and within 20 years … 35%” crazy extreme predictions, and happily bet against them.
I won’t defend these numbers because I haven’t put much thought into this topic personally (since my own reasons don’t depend on these numbers, and I doubt that I can do much better than deferring to others). But at what probabilities would you say that substantial work on alignment today would start to be worthwhile (assuming the philosophical difficulty argument doesn’t apply)? What do you think a world where such probabilities are reasonable would look like?
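One way to make that question concrete is a break-even calculation. The sketch below uses made-up placeholder values (the conditional value of early work, the stakes, and the opportunity cost are not positions held by either side here) just to show how a timeline probability translates into a threshold:

```python
# Hypothetical placeholder values, for illustrating the structure of the question only.
p_early_work_helps = 0.10   # chance alignment work done today ends up mattering,
                            # conditional on human labor being obsolete within 20 years
stakes = 1000.0             # value at stake if that early work does matter (arbitrary units)
cost_now = 20.0             # opportunity cost of redirecting effort/funding today (same units)

# Break-even probability of obsolescence within 20 years under these assumptions.
p_breakeven = cost_now / (p_early_work_helps * stakes)
print(f"work today clears the bar once P(obsolete within 20y) > {p_breakeven:.2f}")  # 0.20
```

The point is not the particular numbers but that “at what probabilities would work today be worthwhile” becomes answerable once one states the conditional value of early work and its opportunity cost.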
Have you seen my recent posts that argued for or supported this? If not, I can link them: Three AI Safety Related Ideas, Two Neglected Problems in Human-AI Safety, Beyond Astronomical Waste, The Argument from Philosophical Difficulty.