I think people who value empirical alignment work now probably think that (to some extent) we can predict at a high level what future problems we might face (contrasting with “there’d have been no hope whatsoever of identifying all the key problems in advance just based on theory”). Obviously this is a spectrum, but I think the chip fab analogy sits further towards believing there are unknown unknowns in the problem space than people at OpenAI do (e.g. OpenAI people possibly think outer alignment and inner alignment capture all of the kinds of problems we’ll face).
However, they probably don’t believe you can work on solutions to those problems without being able to empirically demonstrate the problems and hence iterate on them (and again one could probably appeal to a track record here: most proposed solutions to problems don’t work unless they were developed by iterating on the actual problem). We can maybe vaguely postulate what the solutions could look like (they would say), but it’s going to be much better to actually implement solutions on versions of the problem we can demonstrate, and iterate from there. (Note that they probably also try to produce demonstrations of the problems so that they can then work on solutions, but this is still all empirical.)
Otherwise your ITT seems reasonable to me, although I don’t think I’d put myself in the class of people you’re trying to ITT, so that’s not much evidence.