Makes sense— I think the thing I’m trying to point at is “what do you think better safety research actually looks like?”
I suspect there’s some risk that, absent some sort of pre-registrarion, your definition of “good safety research” ends up gradually drifting to be more compatible with the kind of research Anthropic does.
Of course, not all of this will be a bad thing— hopefully you will genuinely learn some new things that change your opinion of what “good research” is.
But the nice thing about pre-registration is that you can be more confident that belief changes are stemming from a deliberate or at least self-aware process, as opposed to some sort of “maybe I thought this all along//i didn’t really know what i believed before I joined” vibe. (and perhaps this is sufficiently covered in your doc)
Makes sense— I think the thing I’m trying to point at is “what do you think better safety research actually looks like?”
I suspect there’s some risk that, absent some sort of pre-registrarion, your definition of “good safety research” ends up gradually drifting to be more compatible with the kind of research Anthropic does.
Of course, not all of this will be a bad thing— hopefully you will genuinely learn some new things that change your opinion of what “good research” is.
But the nice thing about pre-registration is that you can be more confident that belief changes are stemming from a deliberate or at least self-aware process, as opposed to some sort of “maybe I thought this all along//i didn’t really know what i believed before I joined” vibe. (and perhaps this is sufficiently covered in your doc)