Here’s a distinction you could make: an AI is self-modifying if it is effectively capable of making any change to its source code at any time, and non-self-modifying if it is not. (The phrase “capable of” is vague, of course.)
I can imagine non-self-modifying AI having an advantage over self-modifying AI, because it might be possible for an NSM AI to be protected from its own stupidity, so to speak. If the AI were to believe that overwriting all of its beliefs with the digits of pi is a good idea, nothing bad would happen, because it would be unable to do that. Of course, these same restrictions that make the AI incapable of breaking itself might also make it incapable of being really smart.
I believe I’ve heard someone say that any AI capable of being really smart must be effectively self-modifying, because being really smart involves the ability to make arbitrary calculations, and if you can make arbitrary calculations, then you’re not restricted. My objection is that there’s a big difference between making arbitrary calculations and running arbitrary code; namely, the ability to run arbitrary code allows you to alter other calculations running on the same machine.
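To make that distinction concrete, here's a tiny sketch (my own illustration, with everything in it invented for the purpose): an evaluator that can perform arbitrary arithmetic calculations on an expression handed to it, but has no mechanism for reaching out and altering anything else in the process, which is exactly what running arbitrary code would allow.

```python
# Purely illustrative: "arbitrary calculation" without "arbitrary code".
# This evaluator computes any pure arithmetic expression it is given, but it
# cannot touch names, call functions, or modify the rest of the program.
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg}

def calculate(expr: str) -> float:
    """Evaluate a pure arithmetic expression: no names, no calls, no side effects."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("not a pure calculation")
    return walk(ast.parse(expr, mode="eval"))

print(calculate("2 ** 10 - 1"))   # 1023: an arbitrary calculation, safely
# exec(...), by contrast, could rewrite any state in the process.
```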
Lemme expand on my thoughts a little bit. I imagine a non-self-modifying AI to be made of three parts: a thinking algorithm, a decision algorithm, and a belief database. The thinking and decision algorithms are immutable, and the belief database is (obviously) mutable. The supergoal is coded into the decision algorithm, so it can’t be changed. (Problem: the supergoal only makes sense in the context of certain beliefs, and beliefs are mutable.) The contents of the belief database influence the thinking algorithm’s behavior, but they don’t determine its behavior.
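Here's a rough sketch of that three-part layout. It's my own guess at one possible arrangement, and every name and detail in it is invented for illustration: the belief database is the only mutable piece, the supergoal is a constant baked into the decision algorithm, and the thinking algorithm consults beliefs without its own code ever changing.

```python
# Illustrative only: one way to arrange the three parts described above.
from dataclasses import dataclass, field

@dataclass
class BeliefDatabase:
    """The only mutable component: topic -> (content, confidence) records."""
    beliefs: dict = field(default_factory=dict)

    def update(self, topic, content, confidence):
        self.beliefs[topic] = (content, confidence)

    def lookup(self, topic):
        return self.beliefs.get(topic)

# The supergoal lives in the immutable decision algorithm, not in the database.
SUPERGOAL = "the hard-coded supergoal goes here"

def think(db: BeliefDatabase, observation: dict) -> None:
    """Immutable thinking algorithm: beliefs influence what it concludes,
    but the procedure itself is fixed and cannot be rewritten from inside."""
    prior = db.lookup(observation["topic"])
    confidence = 0.5 if prior is None else min(1.0, prior[1] + 0.1)
    db.update(observation["topic"], observation["content"], confidence)

def decide(db: BeliefDatabase, options: list) -> str:
    """Immutable decision algorithm: picks the option the current beliefs
    rate most highly. Only the beliefs it reads ever change."""
    def score(option):
        belief = db.lookup(option)
        return 0.0 if belief is None else belief[1]
    return max(options, key=score)

db = BeliefDatabase()
think(db, {"topic": "door A", "content": "probably leads outside"})
print(decide(db, ["door A", "door B"]))   # -> door A
```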
The ideal possibility is that we can make the following happen:
1. The belief database is flexible enough that it can accommodate all types of beliefs from the very beginning. (If the thinking algorithm is immutable, it can’t be updated to handle new types of beliefs.) One illustrative sketch of what this could look like follows this list.
2. The thinking algorithm is sufficiently flexible that the beliefs in the belief database can lead the algorithm in the right directions, producing super-duper intelligence.
3. The thinking algorithm is sufficiently inflexible that the beliefs in the belief database cannot cause the algorithm to do something really bad, i.e., go insane.
4. The supergoal remains meaningful in the context of the belief database regardless of how the thinking algorithm ends up behaving.
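To make point 1 a bit more concrete, here is one purely illustrative way a belief database could accommodate kinds of beliefs nobody anticipated, without the immutable thinking algorithm needing new per-kind code. Every name here is made up for the example.

```python
# Illustrative only: beliefs as generic tagged records, so new kinds of
# beliefs can be represented without changing any immutable code.
from dataclasses import dataclass
from typing import Any, Optional

@dataclass(frozen=True)
class Belief:
    kind: str            # open-ended tag: "fact", "prediction", "self-model", ...
    subject: Any         # what the belief is about
    content: Any         # the belief itself, in whatever structure the kind needs
    confidence: float    # how strongly it is held

class BeliefStore:
    def __init__(self):
        self._records: list[Belief] = []

    def add(self, belief: Belief) -> None:
        self._records.append(belief)

    def query(self, kind: Optional[str] = None) -> list[Belief]:
        """The thinking algorithm filters by tag; it needs no per-kind code."""
        return [b for b in self._records if kind is None or b.kind == kind]

store = BeliefStore()
store.add(Belief("fact", "sky", "blue", 0.99))
store.add(Belief("prediction", "weather tomorrow", {"rain": 0.3}, 0.6))
# A kind of belief unanticipated when the store was written:
store.add(Belief("self-model", "thinking algorithm", "immutable", 1.0))
print(store.query("prediction"))
```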
(My ideas haven’t been taken seriously in the past, and I have no special knowledge in this area, so it’s likely that my ideas are worthless. They feel valuable to me, however.)