My main concern about sequences of questions like this (leaving aside specific nitpicks like whether we should try to minimize the bad or realize the good) is that they don't account for generalization on later questions.
E.g., if certain bad things happen across many different architectures, this can be a powerful clue about how we should think of the problem. Such a generalization violates the hierarchy by telling us about problems without automatically providing a reduction to the lower level of AI architecture.
So if you use this sequence of questions as a strict roadmap for research, you’ll miss opportunities for generalization.
Thanks for your comment—I entirely agree with this. In fact, most of the content of this sequence represents an effort to spell out these generalizations. (I note later that, e.g., the combinatorics of specifying every control proposal to deal with every conceivable bad outcome from every learning architecture is obviously intractable for a single report; this is a “field-sized” undertaking.)
I don’t think this is a violation of the hierarchy, however. It seems coherent to both claim (a) given the field’s goal, AGI safety research should follow a general progression toward this goal (e.g., the one this sequence proposes), and (b) there is plenty of productive work that can and should be done outside of this progression (for the reason you specify).
I look forward to hearing if you think the sequence walks this line properly!