porby comments on Private alignment research sharing and coordination

porby 5 Sep 2022 19:53 UTC
3 points
1
I suppose that’s an additional consideration. Keeping potentially concerning material out of trivially scraped training sets is pretty low cost and worth it.
I wouldn’t want to sacrifice much usability beyond standard security measures to focus on that angle, though. Doing so would mean trying to directly fight a threat that is (1) already able to misuse observed research, (2) already able to socially or technically engineer its way into accessing that research, and (3) somehow not already massively lethal without that research.
- Peter S. Park 5 Sep 2022 22:52 UTC
  4 points
  2
  In general, it is much easier to keep potentially concerning material out of the AGI’s training set if it’s already a secret rather than something that’s been published on the Internet. This is because there may be copies, references, and discussions of the material elsewhere in the training set that we fail to catch.
  
  If it’s already posted on the Internet and it’s too late, we should of course still try our best to keep it out of the training set.
  
  As for the question “should we give up on security after AGI attains high capabilities?”: we shouldn’t, as long as our preparation could non-negligibly increase our probability of escaping doom, even if the increase is small. We should always maximize expected utility, even if we are probably doomed.