I read the full report, excluding the appendices. I’m a layperson. That is, I don’t expect I’ll ever do direct work on existential risk mitigation, or engineering safety into machine intelligence. Further, I’m not well-versed in the technicalities of either Friendliness theory from the MIRI, or literature from academic studies of artificial intelligence.
As a layperson, I don’t know how to assess how much labor is assigned to AI-soon vs. AI-later outcomes. What does AI-soon work look like? What are its features or qualities? Do you consider the MIRI’s current research as ‘AI-soon’ labor?
I don’t have the source now, but Robin Hanson mentioned in a couple of past blog posts on Overcoming Bias that work on existential risk reduction isn’t being done, because nobody knows how to do it. This is a rather cynical perspective, as it would not count the work of the FHI, or the MIRI, as being on existential risk reduction. I believe Dr. Hanson meant that it appears no direct, or object-level, work is being done. These posts were from a couple of years ago, so his position may be different now. The landscape for AI safety has changed dramatically in the last two years. Still, I wonder if the reason it’s hard to tell how much, or which, research is oriented toward “sooner” rather than “later” outcomes is that we don’t know how to do that (object-level) work.
This is a good question. To some extent I didn’t want to take a position on exactly which work is appropriate for this, as that’s independent of the rest of the analysis (although it obviously feeds into the model’s parameter estimates).
Something which would definitely help would be just to systematically review what might be useful for AI-soon outcomes.
Possibilities include: working to study the architecture of the more plausible candidates for producing AI; design work on containment mechanisms; producing high-quality data sets of ‘human values’ (in case value-learning is easy). I think those could all turn out to be useless ex post, but they may still be worth trying more for the possibility that they are useful.
There may also be useful lines which are already being pursued to a serious degree as part of cybersecurity.
One application of this might be for the FLI, and where they decide to grant the money they’ve received from Elon Musk. In addition to other considerations, it seems the correct conclusion from your paper would be not to underestimate the value of funding research aimed at AI-soon scenarios, and to fund it partly because it could create a research environment that enables a greater quantity and quality of research even on AI-later scenarios. Whatever ratio of funding between the two scenarios they decide on isn’t as useful if nobody can discern what counts as AI-soon vs. AI-later research.