I don’t expect everyone to disregard the danger; I do expect most people building capable AI systems to continue to hide hard problems. Hiding the hard problems is much easier than solving them, but, I suspect, it produces plausible-sounding solutions just as well.
Roughly human-level humans don’t contribute significantly to AI alignment research and can’t be pivotally used. So I don’t think you believe that a roughly human-level AI system can contribute significantly to AI alignment research either. Maybe you (as many seem to) think that if someone runs not-that-superhuman language models with clever prompt engineering, fine-tuning, and systems around them, then the whole system can solve alignment or be pivotally used. But the point of the post is that if the whole system is capable enough to solve alignment or be pivotally used, it is superhuman, not roughly human-level. You need to direct the whole system somewhere, and unless you made the whole system optimize for something you actually want, it probably kills you before it solves alignment.