Thanks for writing this Aaron! (And for engaging with some of the common arguments for/against AI safety work.)
I personally am very uncertain about whether to expect a singularity/fast take-off (I think it is plausible but far from certain). Some reasons that I am still very interested in AI safety are the following:
I think AI safety likely involves solving a number of difficult conceptual problems, such that it would take >5 years (I would guess something like 10-30 years, with very wide error bars) of research to have solutions that we are happy with. Moreover, many of the relevant problems have short-term analogues that can be worked on today. (Indeed, some of these align with your own research interests, e.g. imputing value functions of agents from actions/decisions; although I am particularly interested in the agnostic case where the value function might lie outside of the given model family, which I think makes things much harder.)
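To make "imputing value functions from actions/decisions" concrete, here is a minimal sketch (all names, numbers, and the choice model are illustrative assumptions, not anything from the thread): the agent is assumed Boltzmann-rational, picking option a with probability proportional to exp(u(a)), and we fit a parametric utility by maximum likelihood. The true utility is deliberately placed outside the fitted model family, to hint at why the agnostic case is harder.

```python
# Hypothetical sketch: impute an agent's value function from observed choices,
# assuming a Boltzmann-rational choice model (P(a) proportional to exp(u(a))).
import math

options = [0.0, 1.0, 2.0, 3.0]  # feature value of each option (illustrative)

def true_utility(x):
    # The agent's actual values: quadratic, peaked near x = 2.2.
    return -(x - 2.2) ** 2

def choice_probs(utils):
    # Softmax over utilities: the assumed decision rule.
    z = sum(math.exp(u) for u in utils)
    return [math.exp(u) / z for u in utils]

# Observed choice frequencies implied by the true (hidden) utility.
observed = choice_probs([true_utility(x) for x in options])

# Fit a *linear* value model u(x) = w * x by grid search over w,
# minimizing negative log-likelihood of the observed choices.
def nll(w):
    p = choice_probs([w * x for x in options])
    return -sum(o * math.log(q) for o, q in zip(observed, p))

best_w = min((w / 100 for w in range(-500, 501)), key=nll)
print(f"fitted linear weight: {best_w:.2f}")

# The true utility is quadratic, so it lies outside the linear model family:
# this is the "agnostic" case, where the best fit in the family can still be
# systematically wrong about the agent's values.
```

The grid search stands in for a proper optimizer; the point is only that when the model family is misspecified, maximum likelihood converges to the best in-family approximation rather than the agent's true values.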
I suppose the summary point of the above is that even if you think AI is a ways off (my median estimate is ~50 years, again with wide error bars), research is not something that can happen instantaneously, and conceptual research in particular can move slowly because it is harder to work on and parallelize.
While I am uncertain about fast take-off, there is still some probability that it will happen, and in that world the problem is important enough to be worth thinking about. (It is also very worthwhile to think about the probability of fast take-off itself, as better estimates would help direct resources even within the AI safety space.)
Finally, I think there are a number of important safety problems even from sub-human AI systems. Tech-driven unemployment is, I guess, the standard one here, although I spend more time thinking about cyber-warfare and autonomous weapons, as well as changes in the balance of power between nation-states and corporations. These are not as clearly an existential risk as unfriendly AI, but I think some forms would qualify as a global catastrophic risk; on the other hand, I would guess that most people who care about AI safety (at least on this website) do not care about it for this reason, so this is more idiosyncratic to me.
Happy to expand on/discuss any of the above points if you are interested.

Best,
Jacob
Good points all; these are good reasons to work on AI safety (and of course, as a theorist, I'm very happy to think about interesting problems even if they don't have immediate impact :-)). I'm definitely interested in the short-term issues, and have been spending a lot of my research time lately thinking about fairness/privacy in ML. Inverse-RL/revealed-preference learning is also quite interesting, and I'd love to see more theory results in the agnostic case.