Theoretically yes, but the field of machine learning is not steered by what can be theoretically deduced from all available information. My guess is that not publishing the first paper with impressive results is pretty important, and there are ways to get feedback short of that, like talking to people, or writing an Alignment Forum post for less dangerous ideas, as we did once.
The full quote is also "object level infohazards could be limited"; subtle mindset infohazards likely present a trickier balance to strike, and we'd have to think about them more.
That is not how information theory works!