ML still uses plenty of reinforcement learning, and plenty of systems that apply straightforward optimization pressure. More recently we've also built a few systems that do something closer to recreating samples from a distribution, but that doesn't actually help you improve on (or even achieve) human-level performance. To improve on human-level performance, you either have to hand-code ontologies (by plugging multiple simulator systems together in a CAIS fashion), or do something like reinforcement learning, which then very quickly displays the failure modes everyone is talking about.
Current systems do not lack edge-instantiation behavior. Some of them seem more robust, but the ones that do also seem fundamentally limited (and they will likely still show edge-instantiation for their inner objective, though that's harder to talk about).
Also, to make a very concrete point: Katja linked to a bunch of faces generated by a GAN, which straightforwardly has the problems people are talking about, so there really is no mismatch between the kinds of things Katja is talking about and the kinds of things Nate is talking about. We could perform a more optimized search for things that are definitely faces according to the discriminator, and we would likely get something horrifying.
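To make that search concrete, here is a minimal sketch (PyTorch; `G`, `D`, and all the hyperparameters are hypothetical stand-ins, not anything from Katja's post) of gradient ascent on the latent to maximize the discriminator's face score. Pushing far past typical samples like this is exactly where edge-instantiation shows up.

```python
import torch

def maximize_face_score(G, D, latent_dim=128, steps=500, lr=0.05):
    # G: latent -> image. D: image -> "how face-like is this?" score.
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -D(G(z)).mean()  # maximize the discriminator's score
        loss.backward()
        opt.step()
    return G(z).detach()        # typically adversarial junk, not a nicer face
```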
> We could perform a more optimized search for things that are definitely faces according to the discriminator, and we would likely get something horrifying.
Sure, you could do that, but people usually don't—unless they intentionally want something horrifying. So if your argument is now "sure, new ML systems totally can solve the faciness problem, but only if you choose to use them correctly"—then great, finally we agree.
Interestingly enough, in diffusion planning models, if you crank up the discriminator you get trajectories that are higher-utility but increasingly unrealistic; crank it down and you get lower-utility (but more realistic) trajectories.
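A rough sketch of that knob, in the spirit of value-guided diffusion planners like Diffuser (everything here is a hypothetical simplification: `denoiser`, `value_model`, and the noise schedule, which is omitted entirely). The single `guidance_scale` parameter is the "cranking" described above: zero gives plain sampling, large values trade realism for predicted utility.

```python
import torch

def guided_sample(denoiser, value_model, shape, timesteps=100, guidance_scale=1.0):
    x = torch.randn(shape)  # start the trajectory from pure noise
    for t in reversed(range(timesteps)):
        x_in = x.detach().requires_grad_(True)
        value = value_model(x_in, t)                      # predicted utility of the trajectory
        grad = torch.autograd.grad(value.sum(), x_in)[0]  # direction of higher predicted utility
        # guidance_scale = 0 recovers plain (realistic) sampling;
        # large values push trajectories off the data manifold.
        x = denoiser(x.detach(), t) + guidance_scale * grad
    return x
```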
Clarifying questions, either for you or for someone else, to resolve my own confusion:
What does “applying optimization pressure” mean? Is steering random noise into the narrow part of configuration space that contains plausible images-of-X (the thing DDPMs and GAN generators do) a straightforward example of it?
EDIT: Split up above question into two.
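For concreteness, the distinction these questions are probing could be drawn as follows (a toy sketch of my own framing, not anything from the thread; `G` and `score` are hypothetical stand-ins): a generator applies its steering once, through a map learned during training, while optimization pressure in the stronger sense keeps pushing on an explicit score at use time.

```python
import torch

def generate(G, latent_dim=128):
    # One forward pass: noise in, plausible sample out. All the steering was
    # baked in during training; no further pressure is applied at sample time.
    return G(torch.randn(1, latent_dim))

def optimize(score, shape, steps=1000, lr=0.1):
    # Iterated pressure: keep climbing an explicit score at use time. This is
    # the regime where Goodhart / edge-instantiation failures tend to appear.
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        (-score(x).sum()).backward()
        opt.step()
    return x.detach()
```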