I guess that all AGIs that aren't explicitly forbidden to do so will self-modify (75%); self-modification will mostly start with a backup, since code has that option (95%); and maybe half of the backup/compare methods will approve improvements and throw out undesirable changes.
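To make the backup/compare idea concrete, here is a minimal sketch, assuming a toy setup where the "self" is just a parameter dictionary and the evaluation criterion is hypothetical; a real system would benchmark its own behavior rather than call a stand-in fitness function:

```python
import copy
import random

def propose_modification(params):
    """Hypothetical: perturb one parameter as a stand-in for a self-edit."""
    new_params = dict(params)
    key = random.choice(list(new_params))
    new_params[key] += random.gauss(0, 0.1)
    return new_params

def evaluate(params):
    """Hypothetical fitness measure; a real system would test itself."""
    return -sum(v * v for v in params.values())

def self_modify_with_backup(params, steps=10):
    """Backup/compare loop: keep a copy, try a change, revert if it is worse."""
    for _ in range(steps):
        backup = copy.deepcopy(params)           # the backup step
        candidate = propose_modification(params)
        if evaluate(candidate) > evaluate(backup):
            params = candidate                   # approve the improvement
        else:
            params = backup                      # throw out the undesirable change
    return params

if __name__ == "__main__":
    print(self_modify_with_backup({"a": 1.0, "b": -2.0}))
```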
Really? This seems to ignore that certain structures will have a lot of trouble self-modifying. For example, consider an AI hard-coded onto a silicon chip with a fixed amount of RAM. Unless it is already very clever, there's no way it can self-improve.
This actually illustrates nicely some issues with the whole notion of “self-improving.”
Suppose Sally is an AI hard-coded onto a silicon chip with fixed RAM. One day Sally is given the job of establishing a policy to control resource allocation at the Irrelevant Outputs factory, and concludes that the most efficient mechanism for doing so is to implement in software on the IO network the same algorithms that its own silicon chip implements in hardware. So it does.
The program Sally just wrote can be thought of as a version of Sally that is not constrained to a particular silicon chip. (It probably also runs much slower, though that’s not entirely clear.)
In this scenario, is Sally self-modifying? Is it self-improving? I’m not even sure those are the right questions.
Hard-coding onto chips, or even making specific structures electromechanical in nature, is one way humans could achieve "explicitly forbidden to self-modify" in AIs. I estimated that one in four AGI projects will want to forbid their AGI from self-modifying. I thought this was optimistic, since I haven't seen any discussion of fixed AGI, although that might be something military research and development is interested in.
My point was that even in some cases where people aren’t thinking about self-modification, self-modification won’t happen by default.