paulfchristiano comments on Learning the prior

paulfchristiano Jun 14, 2022, 8:22 PM
2 points
Yes. We’re just aiming for a distillation of the overseer’s judgments, but that’s what existing imagenet models are anyway, so we’ll be vulnerable to adversarial examples for the same reason.