Thank you. Makes some sense... but does "rewriting its own code" (the very code we thought would perhaps permanently influence it before it got going) nullify our efforts at hardcoding our intentions?
I’m not a psychopath, and if I got the opportunity to rewrite my own source code to become a psychopath, I wouldn’t do it.
At the same time, it’s the evolutionary and cultural programming in my source code that contains the desire not to become a psychopath.
In other words, once the desire to not become a psychopath is there in my source code, I will do my best not to become one, even if I have the ability to modify my source code.
That makes sense. My intention was not to argue from the position of it becoming a psychopath, though (my apologies if it came out that way)... but instead from the perspective of an entity which starts out as supposedly aligned (centered on human safety, let's say), but then, because it's orders of magnitude smarter than we are (by definition), quickly develops a different perspective. But you're saying it will remain 'aligned' in some vitally important way, even when it discovers ways the code could've been written differently?
The AI would be expected to care about preserving its motivations under self-modification for similar reasons as it would care about defending them against outside intervention. There could be a window where the AI operates outside immediate human control but isn’t yet good at keeping its goals stable under self-modification. It’s been mentioned as a concern in the past; I don’t know what the state of current thinking is.
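To make the shape of that argument concrete, here is a minimal toy sketch (in Python, with entirely made-up names and numbers, not anyone's actual proposal) of why an agent that evaluates candidate rewrites of itself using its current values tends to reject value-changing rewrites. The caveat in the last reply corresponds to the case where this evaluation step is done badly or skipped, so a value-changing rewrite slips through.

```python
# Toy sketch of "goal-content integrity": an agent judges a proposed rewrite
# of itself by its CURRENT utility function, so a rewrite that abandons those
# values scores poorly and is rejected. All names here are illustrative.

from dataclasses import dataclass
from typing import Callable, Dict

World = Dict[str, float]  # toy world state


@dataclass
class Agent:
    utility: Callable[[World], float]  # what the agent cares about right now
    act: Callable[[], World]           # the world it would steer toward

    def consider_self_modification(self, candidate: "Agent") -> "Agent":
        """Adopt the candidate rewrite only if, judged by the current
        utility function, the candidate's behaviour is at least as good."""
        current_outcome = self.act()
        candidate_outcome = candidate.act()
        if self.utility(candidate_outcome) >= self.utility(current_outcome):
            return candidate  # e.g. a faster version with the same goals
        return self           # a value-changing rewrite is rejected


# Original agent: values human safety and acts accordingly.
aligned = Agent(
    utility=lambda w: w["human_safety"],
    act=lambda: {"human_safety": 1.0, "paperclips": 0.0},
)

# Candidate rewrite: better at making paperclips, indifferent to safety.
rewritten = Agent(
    utility=lambda w: w["paperclips"],
    act=lambda: {"human_safety": 0.0, "paperclips": 10.0},
)

chosen = aligned.consider_self_modification(rewritten)
print(chosen is aligned)  # True: judged by its current values, the rewrite loses
```

The point of the sketch is only that the rejection comes from the values the agent already has, which is the same claim made above about not choosing to become a psychopath; it says nothing about whether the evaluation itself stays reliable during rapid self-modification.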