Well, an architecture-specific result is still something: maybe some architectures other than LLMs/ANNs are more amenable to alignment, and that's that. Or it could be a more general result about e.g. what can be achieved with SGD. Though I expect there may also be a fully general proof, akin to the undecidability of the halting problem.
Yes, I think there is a more general proof available. This proof form would combine limits to predictability (and the like) with a lethal dynamic that falls outside those limits.