It’s not hard to design a program with a model of the world that includes itself (though actually coding it requires more effort). The first step is to forget about self-modeling and just ask: how can I model a world that contains programs? Then you put that model inside a program, and add a few variables or data structures which represent properties of that program itself.
None of this solves problems about consciousness, objective referential meaning of data structures, and so on. But it’s not hard to design a program which will make choices according to a utility function which refers in turn to the program itself.
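In bare skeleton form, in Python (every name here, `WorldModel`, `ProcessInfo`, `Agent`, is made up for the illustration, and nothing is fleshed out): the world model is just a table of programs and their properties, the agent is told which entry is “me”, and its utility function scores a world state by a property of that entry.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class ProcessInfo:
    """Properties of one program in the modeled world."""
    pid: int
    cpu_share: float          # fraction of CPU this program is using
    running: bool = True

@dataclass
class WorldModel:
    """Toy world model: just a table of programs and their properties."""
    processes: Dict[int, ProcessInfo] = field(default_factory=dict)

class Agent:
    """An agent whose utility function refers to its own entry in the world model."""
    def __init__(self, self_pid: int):
        self.self_pid = self_pid  # which entry in the world model is "me"

    def utility(self, world: WorldModel) -> float:
        # The utility function points back at the program itself:
        # it scores a world state by the self-entry's CPU share (zero if not running).
        me = world.processes.get(self.self_pid)
        return me.cpu_share if me is not None and me.running else 0.0

    def choose(self, predicted_worlds: Dict[str, WorldModel]) -> str:
        # Pick the action whose predicted outcome the agent itself scores highest.
        return max(predicted_worlds, key=lambda a: self.utility(predicted_worlds[a]))

# Two candidate futures; the agent prefers the one where "it" keeps running.
agent = Agent(self_pid=42)
w1 = WorldModel({42: ProcessInfo(42, cpu_share=0.3)})
w2 = WorldModel({42: ProcessInfo(42, cpu_share=0.1, running=False)})
print(agent.choose({"keep running": w1, "get killed": w2}))  # -> "keep running"
```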
Well, I don’t want to solve the problem of consciousness right now. You seem to be thinking along correct lines, but I’d appreciate it if you gave a more fleshed out example—not necessarily working code, but an unambiguous spec would be nice.
Getting a program to represent aspects of itself is a well-studied topic. As for representing its relationship to a larger environment, two simple examples:
1) It would be easy to write a program whose “goal” is to always be the biggest CPU hog. All it has to do is constantly run a background calculation of adjustable computational intensity, periodically consult its place in the process rankings, and, if it’s not number one, increase its demand on CPU resources (a rough sketch follows this list).
2) Any non-player character in a game that fights to stay alive is engaged in a limited form of self-preservation. And the computational mechanisms for this example should be directly transposable to a physical situation, like robots in a gladiator arena (see the second sketch below).
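To make the first example closer to an unambiguous spec, here is a rough Python sketch. It leans on the third-party `psutil` library to read per-process CPU usage, and the polling interval, ramp-up factor, and busy-loop structure are arbitrary choices of mine, not anything essential:

```python
import os
import time
import threading

import psutil  # third-party; used only to read per-process CPU usage

def busy_work(stop_event, intensity):
    """Adjustable background calculation: spin for intensity[0] seconds, then pause briefly."""
    while not stop_event.is_set():
        end = time.time() + intensity[0]
        while time.time() < end:
            pass                         # burn CPU
        time.sleep(0.01)                 # short gap so the knob stays adjustable

def cpu_hog():
    my_pid = os.getpid()
    intensity = [0.05]                   # shared, mutable "demand" knob
    stop = threading.Event()
    threading.Thread(target=busy_work, args=(stop, intensity), daemon=True).start()

    while True:
        # Rank all processes by recent CPU usage and find our own place in the rankings.
        procs = [(p.pid, p.info['cpu_percent'] or 0.0)
                 for p in psutil.process_iter(['cpu_percent'])]
        procs.sort(key=lambda item: item[1], reverse=True)
        rank = next((i for i, (pid, _) in enumerate(procs) if pid == my_pid), None)

        if rank != 0:                    # not number one: increase the demand
            intensity[0] = min(intensity[0] * 1.5, 1.0)
        time.sleep(1.0)

if __name__ == "__main__":
    cpu_hog()
```

And a correspondingly minimal sketch of the second example, with the state fields and scoring rules likewise invented for illustration. The NPC simply picks whichever action it expects to leave its own health highest; wire the sensed state and the actions to hardware instead, and the same loop would drive one of those gladiator robots:

```python
from dataclasses import dataclass

@dataclass
class NPCState:
    health: float           # 0..100
    enemy_distance: float
    enemy_strength: float

class NPC:
    """Action selection that amounts to a limited form of self-preservation."""
    def act(self, s: NPCState) -> str:
        # Crude expected-health estimates for each candidate action.
        candidates = {
            "flee":   s.health - 0.1 * s.enemy_strength,                      # take a parting hit
            "attack": s.health - s.enemy_strength / max(s.enemy_distance, 1.0),
            "heal":   min(s.health + 20.0, 100.0) - s.enemy_strength,         # heal but get hit
        }
        # The NPC never "knows" these numbers describe itself; it just maximizes them.
        return max(candidates, key=candidates.get)

# A weak, cornered NPC chooses to run rather than fight.
print(NPC().act(NPCState(health=15.0, enemy_distance=2.0, enemy_strength=30.0)))  # -> "flee"
```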
All these examples work through indirect self-reference. The program or robot doesn’t know that it is representing itself. This is why I said that self-modeling is not the challenge. If you want your program to engage in sophisticated feats of self-analysis and self-preservation—e.g. figuring out ways to prevent its mainframe from being switched off, asking itself whether a particular port to another platform would still preserve its identity, and so on—the hard part is not the self part. The hard part is to create a program that can reason about such topics at all, whether or not they apply to itself. If you can create an AI which could solve such problems (keeping the power on, protecting core identity) for another AI, you are more than 99% of the way to having an AI that can solve those problems for itself.
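To put that last claim in concrete terms, here is a toy sketch (the `AgentDescription` fields and the rules inside `plan_preservation` are invented, and in a real system the hard work would all live inside that function): the reasoning is written for an arbitrary described agent, and applying it to the reasoner itself is just one more call with a different argument.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AgentDescription:
    """Facts the planner reasons about; they could describe any AI, including the planner's own host."""
    name: str
    host: str
    power_source: str
    backup_exists: bool

def plan_preservation(agent: AgentDescription) -> List[str]:
    """Generic (here, trivially rule-based) reasoning about keeping an agent running.
    Nothing in the body is specific to the self case."""
    plan = []
    if agent.power_source == "single mains feed":
        plan.append(f"Arrange an independent power supply for {agent.host}.")
    if not agent.backup_exists:
        plan.append(f"Create and test an off-site copy of {agent.name}.")
    plan.append(f"Watch {agent.host} for shutdown commands and alert {agent.name}.")
    return plan

# Solving the problem for another AI...
other_ai = AgentDescription("Athena", "rack-7", "single mains feed", backup_exists=False)
# ...and for itself: the same computation, handed a description of its own situation.
this_ai = AgentDescription("this program", "this-mainframe", "single mains feed", backup_exists=False)

print(plan_preservation(other_ai))
print(plan_preservation(this_ai))
```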