Dakara comments on If we solve alignment, do we die anyway?

Dakara 22 Nov 2024 17:31 UTC
1 point
0
After some thought, I think this is potentially a very large issue, and I don’t know how we can even begin to solve it. An AI can be perfectly aligned and still be aligned with someone who wants to create bioweapons. Is there anything being done (or anything that can be done) to prevent that?
- Noosphere89 22 Nov 2024 17:40 UTC
  3 points
  1
  There are actually two answers to this question:
  1. This is why I expect we will eventually have to fight to ban open-source AI, and we will need to build the political will to ban both open-source and open-weights AI.
  2. This is where the field of unlearning comes in. If we could make an AI unlearn dangerous knowledge, such as how to build nuclear weapons, we could possibly distribute AI safely without enabling novices to create dangerous things.
  More here:
  
  https://www.lesswrong.com/posts/mFAvspg4sXkrfZ7FA/deep-forgetting-and-unlearning-for-safely-scoped-llms
  
  https://www.lesswrong.com/posts/9AbYkAy8s9LvB7dT5/the-case-for-unlearning-that-removes-information-from-llm
  
  But these solutions deliberately aim to make AI safe without relying on alignment.