Peter S. Park comments on Private alignment research sharing and coordination

Peter S. Park 5 Sep 2022 22:52 UTC
4 points
2
In general, it is much easier to keep potentially concerning material out of the AGI’s training set if it’s already a secret rather than something that’s been published on the Internet. This is because there may be copies, references, and discussions of the material elsewhere in the training set that we fail to catch.

If it’s already posted on the Internet and it’s too late, we should of course still try our best to keep it out of the training set.

As for the question of “should we give up on security after AGI attains high capabilities?” we shouldn’t give up as long as our preparation could non-negligibly increase our probability of escaping doom, even if the probability increase is small. We should always maximize expected utility, even if we are probably doomed.