The orthogonality thesis says that it's invalid to conclude benevolence from the premise of powerful optimization; it gestures at counterexamples. It's entirely compatible with benevolence being very likely in practice. You might then want to separately ask yourself whether it is in fact likely. But you do need to ask, and that's the point of the orthogonality thesis: its scope is deliberately narrow.
Yeah, I agree with what you just said; I should have been more careful with my phrasing.
Maybe something like: “The naive version of the orthogonality thesis, on which AIs can't converge toward human values, is assumed to be true too often.”
Could you help me understand how that is possible? Why should an intelligent agent care about humans instead of defending against unknown threats?