One can trivially get (some variants of) HCH to do things by e.g. having each human manually act like a transistor, arranging them like a computer, and then passing in a program for an AGI to the HCH-computer. This, of course, is entirely useless for alignment.
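To make the construction concrete, here's a minimal sketch (the function names, circuit encoding, and XOR example are my own illustration, not anything from an actual HCH proposal): each simulated "human" answers only the trivial question "what does a NAND gate output on these two bits?", and a chain of such queries evaluates an arbitrary boolean circuit. Since any program, including one implementing an AGI, compiles down to such a circuit, the humans end up contributing nothing alignment-relevant.

```python
# Sketch of the "HCH-computer" construction described above.
# All names here are hypothetical illustrations, not part of any real HCH spec.

def human_as_nand(a: int, b: int) -> int:
    """One simulated 'human' in the tree, acting purely as a NAND gate."""
    return 1 - (a & b)

def hch_evaluate(circuit: list[tuple[str, str]], inputs: dict[str, int]) -> int:
    """Evaluate a boolean circuit by delegating each gate to a 'human' subcall.

    A circuit is a list of gates; each gate names its two inputs, which are
    either primary inputs ("x0", "x1", ...) or earlier gate outputs ("g0", ...).
    """
    wires = dict(inputs)
    for i, (a, b) in enumerate(circuit):
        wires[f"g{i}"] = human_as_nand(wires[a], wires[b])
    return wires[f"g{len(circuit) - 1}"]

# Example: XOR built from four NAND gates.
xor_circuit = [("x0", "x1"), ("x0", "g0"), ("x1", "g0"), ("g1", "g2")]
assert hch_evaluate(xor_circuit, {"x0": 1, "x1": 0}) == 1
assert hch_evaluate(xor_circuit, {"x0": 1, "x1": 1}) == 0
```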
There’s a whole class of problem factorization proposals which fall prey to that sort of issue. I didn’t talk about it in the post, but it sounds like it probably (?) applies to the sort of thing you’re picturing.
It’s not useless, but it is definitely risky, and the requirements for doing it safely mean distillation would have to be very cheap. And here we come to the question “How doomed by default are we if AGI is created?” If the chance of doom is low, I agree it’s not a risk worth taking. If it’s high, then you’ll probably have to take it. The more doomed by default you think creating AGI is, the more risk you should accept, especially with short timelines. So MIRI would probably want to do this, given their atypically high levels of doominess, but most other organizations probably won’t, since they expect fairly low risk from AGI.
Yeah bio analogies! I love this comment.