Our brains generally do an okay job of fulfilling our biological imperative without having it explicitly defined.
“The biological imperative” is “survive, have offspring, and make sure your offspring do well”. That’s a vastly simpler goal than “create the kind of society that humans would want to want to have”. (An AI that carried out the biological imperative would be a paperclipper.)
Also, even if our brains don’t have an explicit definition of it, it’s still implicitly there. You can’t not define a goal for an AI; the only question is whether you do it explicitly or implicitly.
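To illustrate what I mean, here’s a toy sketch (the thermostat framing and all the names are hypothetical, just my own example): an agent with no written-down utility function still implicitly defines a goal through whatever decision rules it does have.

```python
# Toy sketch (hypothetical, illustrative only): an explicit goal versus an
# implicit one. Neither agent is "goal-free"; the second just never wrote
# its goal down.

def simulate(state, action):
    """Crude world model: heating/cooling nudges the temperature."""
    delta = {"heat": 1, "cool": -1, "idle": 0}[action]
    temp = state["temperature"] + delta
    return {"temperature": temp, "setpoint": state["setpoint"]}

def explicit_agent(state):
    """The goal is an explicit utility function: minimize distance to setpoint."""
    def utility(s):
        return -abs(s["temperature"] - s["setpoint"])
    return max(["heat", "cool", "idle"], key=lambda a: utility(simulate(state, a)))

def implicit_agent(state):
    """No utility function anywhere, but these hard-coded rules still
    implicitly define the same goal: keep temperature near the setpoint."""
    if state["temperature"] < state["setpoint"]:
        return "heat"
    if state["temperature"] > state["setpoint"]:
        return "cool"
    return "idle"
```

Both agents steer the world toward the same place; the difference is only in whether the goal was defined explicitly or left implicit in the code.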
2) With the society-of-AIs idea, to balance and punish those who stray too far from the humanistic (there are some examples in human society of pressure to be biologically normal, e.g. straight, non-self-mutilating, etc.).
Postulating a society of AIs in control, instead of a single AI in control, wouldn’t seem to make the task of goal system design any easier. Now, instead of just having to figure out a way to make a single AI’s goals be what we want, you have to design an entire society of AIs so that the goals the society ends up promoting overall are what we want. Taking a society of AIs as the basic unit only makes things more complex.
Do you accept Omohundro’s drives? One of them is self-preservation, and that needs some notion of self that is not anthropomorphic.
I accept them. Like I said in the comment above, the definition of a self in the context of the “self-preservation” drive will be based on the AI’s goals.
Remember that the self-preservation drive is based on the simple notion that the AI wants its goals to be achieved, and if it is destroyed, then those goals (probably) cannot be achieved, since there’s nobody around who’d work to achieve them. Whether or not the AI itself survives isn’t actually relevant: what’s relevant is whether there survive AIs (or minds in general) that will continue to carry out the goals, regardless of whether those AIs happen to be the “same” AI.
If an AI has “maximize the amount of paperclips in the universe” as its goal, then the “self-preservation drive” corresponds to “protect agents which share this goal (and are the most capable of achieving it)”.
If an AI has as its goal “maximize the amount of paperclips produced by this unit”, then the “this unit” that it will try to preserve will depend on how its programming (implicitly or explicitly) defines “this unit”… or so it would seem at first.
If we taboo “this unit”, we get “maximize the amount of paperclips produced by [something that meets some specific criteria]”. To see the problem in this, consider that if we had an AI that had been built to “maximize the amount of paperclips produced by the children of its inventor”, that too would get taboo’d into “maximize the amount of paperclips produced by [something that meets some specific criteria]”. The self-preservation drive again collapses into “protect agents which share this goal (and are the most capable of achieving it)”: there’s no particular reason to use the term “self”.
To put it differently: with regard to the self-preservation drive, there’s no difference in whether the goal is to “maximize paperclips produced by this AI” or “maximize the happiness of humanity”. In both cases, the AI is trying to do something to some entity which is defined in some specific way. In order for something to be done to such an entity, some agent must survive which has as its goal the task of doing such things to such an entity, and the AI will make sure that such agents try to survive.
Of course it must also make sure that the target of the optimizing-activity survives, but that’s separate from the self-preservation drive.
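Here’s a toy sketch of the tabooing move above (all the names are hypothetical, purely illustrative): once “this unit” is replaced by a criterion, the “self”-referring goal and the “children of the inventor” goal have exactly the same shape, and the preservation drive just picks out whichever agents share the goal.

```python
# Toy sketch (hypothetical, illustrative only): taboo "this unit" and both
# goals become "paperclips produced by [whatever meets some criterion]".

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Entity:
    name: str
    paperclips_produced: int = 0

@dataclass
class Goal:
    description: str
    counts: Callable[[Entity], bool]  # the tabooed "[something meeting some criteria]"

    def value(self, world: List[Entity]) -> int:
        return sum(e.paperclips_produced for e in world if self.counts(e))

@dataclass
class Agent:
    name: str
    goal: Goal

# "Maximize paperclips produced by this unit" ...
goal_self = Goal("paperclips by unit-1", counts=lambda e: e.name == "unit-1")
# ... has exactly the same shape as "paperclips produced by the inventor's children".
goal_kids = Goal("paperclips by inventor's children",
                 counts=lambda e: e.name in {"alice", "bob"})

def agents_to_preserve(agents: List[Agent], goal: Goal) -> List[Agent]:
    """The "self-preservation" drive without any notion of "self": preserve
    whichever agents share this goal, since they are what keeps it pursued."""
    return [a for a in agents if a.goal.description == goal.description]
```

Nothing in agents_to_preserve cares whether the agents being preserved are the “same” AI as the one doing the preserving.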
(I’m not sure of how clearly I’m expressing myself here—did folks understand what I was trying to say?)
Also, even if our brains don’t have an explicit definition of it, it’s still implicitly there. You can’t not define a goal for an AI; the only question is whether you do it explicitly or implicitly.
Can we make a system that has a human as (part of) an implicit definition of its goal system? When you allow implicit definitions, the parts of the system don’t need to be spatially co-located, although some information will need to flow between them.
In principle? Sure.
In practice? I have no idea.
I’m not sure if I am making myself clear, so just to check: I am interested in exploring systems where a human is an important computational component (not just something pointed at) of an implicit goal system for an advanced computer system.
Because the human part is implicit, a system might not make the correct inference and judge the human to be important. If there were a society of these systems, then, if we engineered things correctly, most of them would make the correct inference, judge the human an important part of their goal system, and be able to exert pressure on those that didn’t.
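A toy sketch of the “computational component, not just pointed at” distinction (the function names are hypothetical, just for illustration): a system that merely points at the human can evaluate its goal from a stored model and never consult them again, whereas a system with the human as a component can’t finish the evaluation without a live query, so some information has to flow between the parts, wherever they happen to be located.

```python
# Toy sketch (hypothetical, illustrative only): the human merely pointed at,
# versus the human as a computational component of the goal evaluation.

def evaluate_plan_pointing_at_human(plan, cached_human_model):
    """The human is only referred to: a stored model scores the plan.
    If the actual human disappeared, nothing in this computation would change."""
    return cached_human_model(plan)

def evaluate_plan_with_human_component(plan, ask_human):
    """The human does part of the computation: the evaluation is not even
    defined without a live query, e.g. over a network link to wherever the
    human happens to be (non-spatially located, but needing bandwidth)."""
    return ask_human(f"On a scale of 0-10, how good is this plan: {plan}?")
```

In a society of such systems, each would have its own query channel to its human, and the ones that kept that channel working would be the ones correctly treating the human as part of their goal system.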
Does that make more sense?
Okay. I thought that you meant something like that, but this clarified it.
I’m not sure why you think it’s better to build a society of these systems than to build just a single one. It seems to just make things more difficult: instead of trying to make sure that one AI does things right, we need to make sure that the overall dynamic that emerges from a society of interacting AIs does things right. That sounds a lot harder.
A few reasons.
1) I am more skeptical of a singleton takeoff. While I think it is possible, I don’t think it is likely that humans will be able to engineer it.
2) Logistics. If identity requires high-bandwidth data connections between the two parts, it would be easier to have a distributed system.
3) Politics. I doubt politicians will trust anyone to build a giant system to look after the world.
4) Letting the future take care of itself. If the systems do consider the human a part of themselves, then they might be better placed to figure out an overarching way to balance everyone’s needs.