Considering that the default alternative would be no alignment research, I would say, yes, it is a net positive. But I also agree that alignment research can be dual use, which covers your second and third points. I don't think the first one is a big problem, since comparatively few AI researchers seem to care about the safety of AGI to start with. Even if you believe that some approaches to alignment won't help and can only provide a false sense of security, pursuing them grows the field and can IMO only help attract more attention from the larger ML community. What do you imagine a solution to the last point would look like? Doesn't preventing malicious actors from seeking power mean solving morality, or establishing some kind of utopia? Without having looked into it, I am pessimistic that we can find a way to utopia through political science.