Here is my takeaway from the report. It is not a summary, and some of the implications are mine.
The 4 Theses (conjectures, really):
1. Inevitability of an Intelligence Explosion due to recursive self-improvement (intelligence being defined as cross-domain optimization power); a toy illustration follows below
2. Orthogonality (intelligence/rationality is preference-independent)
3. Instrumental Convergence (most optimizers compete for the same resources)
4. Complexity of Value (values are not easily formalizable; there are no Three Laws of Robotics)

If true, these imply that AGI is an x-risk, because an AGI emerging in an ad hoc fashion will compete with humans and inevitably win.
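As a toy illustration of thesis 1 (my own gloss, not a model from the report): suppose a system's intelligence I feeds back into its own rate of improvement with some returns exponent k. Then the qualitative behavior hinges entirely on k:

```
% Toy recursive self-improvement model (illustrative only, not from the report).
% I(t) = intelligence (cross-domain optimization power), c > 0, k = returns exponent.
\frac{dI}{dt} = c\, I^{k}
\quad\Longrightarrow\quad
\begin{cases}
\text{polynomial growth} & k < 1 \ \text{(diminishing returns, no explosion)}\\[4pt]
I(t) = I_0\, e^{c t} & k = 1 \ \text{(exponential, steady takeoff)}\\[4pt]
I(t) \to \infty \ \text{at finite } t^{\ast} = \dfrac{I_0^{\,1-k}}{c\,(k-1)} & k > 1 \ \text{(finite-time blowup, i.e. FOOM)}
\end{cases}
```

In this caricature, the whole hard-takeoff debate reduces to whether the returns on cognitive reinvestment behave more like k > 1 or k <= 1.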
The difference between theses 1-3 and thesis 4 is that 1-3 are outside of human control, but there is hope of solving 4, hence FAI research.
There are a few outs, which the report considers unlikely:
If an Intelligence Explosion is not inevitable, preventing it may avert the x-risk; if it is impossible, we are in the clear as well.
If all sufficiently advanced agents tend to converge on “humane” values, à la David Pearce, we have nothing to worry about.
If powerful optimizers are likely to find their own resources, they might leave us alone.
Given the above, the obvious first move is to formalize each of theses 1-3 as a step toward evaluating their validity. The report outlines potential steps toward formalizing thesis 1, the Intelligence Explosion (IE):
Step 1 is basically categorizing the existing positions on IE and constructing explicit, hopefully somewhat formal, models for them, checking them for self-consistency and comparing them with past precedents where possible.
Step 2 is comparing the models formalized in Step 1 by constructing a common theory of which they would be sub-cases (I am not at all sure this is what Eliezer means).
Step 3 is constructing a model that is likely to contain “reality” and can eventually answer the question “AI go FOOM?” with a probability it is confident in (a toy sketch of such a model comparison follows below).
The answer to this last question would then determine the direction of the FAI effort, if any.
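To make Steps 1 and 3 concrete, here is a minimal sketch of what “explicit models plus a probability-weighted answer” could look like. It is mine, not MIRI's, and every number in it is made up: two toy growth curves stand in for the “soft takeoff” and “hard takeoff” positions, get fit to a hypothetical capability series, and are combined into a rough P(FOOM).

```python
# Minimal sketch (mine, not from the report; all numbers made up) of
# Steps 1 and 3: cast two informal takeoff positions as explicit growth
# models, fit them to a hypothetical capability series, and combine the
# fits into a rough P(FOOM).
import numpy as np

# Hypothetical proxy for "cross-domain optimization power" over time.
t = np.arange(10)
capability = np.array([1.0, 1.1, 1.3, 1.4, 1.7, 1.9, 2.3, 2.6, 3.1, 3.5])

def loglik(pred, obs, sigma=0.3):
    """Gaussian log-likelihood of the observed series under a model's predictions."""
    return -0.5 * np.sum(((obs - pred) / sigma) ** 2)

# Position A ("soft takeoff"): roughly linear returns, capability grows additively.
def soft(t, a, b):
    return a + b * t

# Position B ("hard takeoff"): capability compounds on itself, superlinear growth.
def hard(t, a, r):
    return a * np.exp(r * t)

# Crude grid search for the best-fitting parameters of each model
# (a real analysis would integrate over parameters, not maximize).
ll_soft = max(loglik(soft(t, a, b), capability)
              for a in np.linspace(0.5, 1.5, 21)
              for b in np.linspace(0.1, 0.5, 21))
ll_hard = max(loglik(hard(t, a, r), capability)
              for a in np.linspace(0.5, 1.5, 21)
              for r in np.linspace(0.05, 0.3, 26))

# Equal priors on the two positions; posterior weights from the fits.
m = max(ll_soft, ll_hard)
w_soft = np.exp(ll_soft - m)
w_hard = np.exp(ll_hard - m)
p_soft = w_soft / (w_soft + w_hard)
p_hard = 1.0 - p_soft

# Made-up conditional judgments: FOOM is unlikely under the soft model,
# likely under the hard one.
p_foom = 0.1 * p_soft + 0.8 * p_hard
print(f"P(soft model) = {p_soft:.2f}, P(hard model) = {p_hard:.2f}")
print(f"Rough P(AI goes FOOM) = {p_foom:.2f}")
```

The point is only the shape of the exercise; the report's actual models would be far richer (returns on cognitive reinvestment, hardware overhang, and so on), and the final probability is supposed to come from a model believed to “contain reality”, not from two toy curves.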
This was basically the content of the first 4 chapters, as far as I can tell. (Chapter 2 advocates the outside view and Chapter 3 mostly advocates a hard take-off.) Chapters 5 and 6 are a mix of open questions in IE relevant to Step 3 above, some speculation, MIRI policy arguments, and some musings about the scope/effort/qualifications required to tackle the problem.