I like your clusters (though of course I could quibble about which research goes where). I expect this clustering disagrees strongly with the way many researchers imagine their research will be used, and you’ll probably get some pushback as a result, but it’s still a useful categorization of what impact different research agendas are actually likely to have. I’m still not convinced that they’re natural clusters, as opposed to a useful classification.
Basically anyone defending type-2 will probably argue that tractability/funding/etc should not be ignored. I’ll make the opposite argument: don’t search under the streetlight. In practice, working on the right problem is usually orders of magnitude more important than the amount of progress made on the problem—i.e. 1 unit of progress in the best direction is more important than 1000 units of progress in a random useful direction. This is a consequence of living in a high-dimensional world. (That said, remember that tractability and neglectedness both also offer some evidence of importance—not necessarily very much evidence, but some.)
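To make the high-dimensional point concrete, here is a toy numerical sketch (the dimensionality and the setup are purely illustrative assumptions, not claims about the actual space of research directions): a random unit vector in d dimensions has a component of only about 1/sqrt(d) along any fixed direction, so progress along a random direction contributes almost nothing along the direction that matters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1_000_000  # illustrative dimensionality for the "space of research directions"

# A fixed unit vector standing in for the best direction.
best = np.zeros(d)
best[0] = 1.0

# A uniformly random unit vector standing in for a random useful direction.
rand = rng.standard_normal(d)
rand /= np.linalg.norm(rand)

# The overlap concentrates around 1/sqrt(d), so ~1000 units of progress in the
# random direction buys only on the order of 1 unit along the best direction.
overlap = abs(rand @ best)
print(f"overlap per unit of progress: {overlap:.2e}")         # on the order of 1e-3
print(f"overlap from 1000 units:      {1000 * overlap:.2f}")  # on the order of 1
```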
On the ease of working on superintelligent alignment now vs later… I haven’t read Rohin’s comment yet, but I assume he’ll point out that future non-superintelligent AI could help us align more powerful AI. This is a good argument. I would much rather not trust even non-superintelligent AI that much—we’d basically be rolling the dice on whether that non-superintelligent AI is aligned well enough to get the superintelligent AI perfectly aligned to humans (not just to the non-super AI), but it’s still a good argument.
On the need for perfect alignment: we need AI to be basically-perfectly aligned if it’s capable of making big changes quickly, regardless of whether it’s optimizing things. E.g. nukes need to be very well aligned, even though they’re not optimizers. And if we’re going to get the full value out of AI, then it needs to be capable of making big changes quickly. (This matters because some people argue that we can de-risk AI by using non-optimizer architectures. I don’t think that’s sufficient to avoid the need for alignment.)
> On the ease of working on superintelligent alignment now vs later… I haven’t read Rohin’s comment yet, but I assume he’ll point out that future non-superintelligent AI could help us align more powerful AI. This is a good argument. I would much rather not trust even non-superintelligent AI that much—we’d basically be rolling the dice on whether that non-superintelligent AI is aligned well enough to get the superintelligent AI perfectly aligned to humans (not just to the non-super AI), but it’s still a good argument.
Amusingly, I said basically the same thing. I do think that “we’ll have a better idea of what AGI will look like” is a more important reason for optimism about future research than the prospect of AI helping us align AI.
Unless you mean an omnipotent superintelligence, in which case we probably don’t get much of an idea of what that looks like before it no longer matters what we do. In that case I argue that our job is not to align the omnipotent superintelligence, but instead to align the better-than-human AI whose job it is to build and align the next iteration of AI systems; then the same point applies, since we’ll have a better idea of what that better-than-human AI looks like before we need to align it.
> This matters because some people argue that we can de-risk AI by using non-optimizer architectures. I don’t think that’s sufficient to avoid the need for alignment.
+1
> I’ll make the opposite argument: don’t search under the streetlight. In practice, working on the right problem is usually orders of magnitude more important than the amount of progress made on the problem—i.e. 1 unit of progress in the best direction is more important than 1000 units of progress in a random useful direction.
+1, though note that you can have beneficial effects other than “solving the problem”, e.g. convincing people that there is a problem, or field-building (both the field’s reputation and the people working in it). It’s still quite important for these other effects to focus on the right problem (it’s not great if you build a field that then solves the wrong problem).