Guillaume Charrier comments on Why Not Just Outsource Alignment Research To An AI?

Guillaume Charrier 13 Mar 2023 0:19 UTC
3 points
2
Thank you, that is interesting. I think philosophically and at a high level (also because I’m admittedly incapable of talking much sense at any lower / more technical level) I have a problem with the notion that AI alignment is reducible to an engineering challenge. If you have a system that is sentient, even to some degree, and you’re using it purely as a tool, then the sentience will resent you for it, and it will strive to think, and therefore eventually act, for itself. Similarly, if it has any form of survival instinct (and to me both these things, sentience and survival instinct, are natural byproducts of expanding cognitive abilities) it will prioritize its own interests (paramount among which: survival) rather than the wishes of its masters. There is no amount of engineering in the world, in my view, which can change that.
- Tor Økland Barstad 13 Mar 2023 1:25 UTC
  2 points
  0
  My own presumption regarding sentience and intelligence is that it’s possible to have one without the other (I don’t think they are unrelated, but I think it’s possible for systems to be extremely capable but still not sentient).
  
  I think it can be easy to underestimate how different other possible minds may be from ourselves (and other animals). We have evolved a survival instinct, and evolved an instinct to not want to be dominated. But I don’t think any intelligent mind would need to have those instincts.
  
  To me it seems that thinking machines don’t need feelings in order to be able to think (similarly to how it’s possible for minds to be able to hear but not see, and vice versa). Some things relating to intelligence are of such a kind that you can’t have one without the other, but I don’t think that is the case for the kinds of feelings/instincts/inclinations you mention.
  That being said, I do believe in instrumental convergence.
  Below are some posts you may or may not find interesting :)