an AI (even if not in a society of AIs) could either have mistakes built into its structure or make mistakes when changing itself.
I am not very well versed in AI at all. But reading this, my automatic response is to question how an emotional response is different from any other response, for an AI.
I understand that emotional responses differ in complexity from trivial responses. But I think of emotional responses (for an AI) as fitting somewhere on the fairly straightforward continuum between “what color should my desktop be” and “how do I judge the validity of a moral structure to apply to humans when they cannot agree on any meaningful criteria themselves”. I would assume that AI emotional ecology sits closer to the complex end of that spectrum, but it seems like a problem that should be fully open to internal inspection and modification by the AI—or, if it is limited, at least no more difficult to adjust than any equally important calculation.
Building an AI with a hidden subconscious seems like an unfortunate combination of stupid and malicious. The most likely reason I can think of for such a thing to exist is as a hidden backdoor that lets humans manipulate the AI without it knowing what is going on, but inducing schizophrenia-like symptoms is probably not a sane way to control our constructs.
But I may be under-applying important concepts—particularly, I may be underestimating the importance of emergent properties, especially in a hard takeoff scenario.
I brought up emotional healing because I’d recently read a strong example of it, and because, as I said, people seem to have a base state of emotional health—a system 1 which is compatible with living well. But you raise a bunch of interesting points. It seems as though people have some capacity for improving their system 1 reactions, though it tends to be slow and difficult.
Let’s see if I can generalize healing for AIs. I’m inclined to think that AIs will have something like a system 1 / system 2 distinction—subsystems for faster/lower cost reactions and ways of combining subsystems for slower/higher cost reactions. This will presumably be more complex than the human system, but I’m not sure the difference matters for this discussion.
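To make that split concrete, here’s a toy Python sketch of the sort of thing I mean: a cheap cached path for familiar stimuli and a slower deliberative path that only runs when the cheap path has nothing, with the deliberated result fed back into the fast layer. The class, the caching step, and the placeholder scoring are all inventions for illustration, not a claim about how an AI would actually be organized.

```python
import random

class TwoTierResponder:
    """Toy 'system 1 / system 2' split: fast cached reactions, slow deliberation."""

    def __init__(self):
        self.cached_reactions = {}  # stimulus -> reaction (fast, low cost)

    def evaluate(self, stimulus, option):
        # Placeholder for a costly evaluation of an option's consequences.
        return random.random()

    def deliberate(self, stimulus, options):
        # Slow path: score every option explicitly, pick the best,
        # and cache it so the fast path can handle this stimulus next time.
        scores = {opt: self.evaluate(stimulus, opt) for opt in options}
        best = max(scores, key=scores.get)
        self.cached_reactions[stimulus] = best
        return best

    def respond(self, stimulus, options):
        if stimulus in self.cached_reactions:       # system 1: cheap and fast
            return self.cached_reactions[stimulus]
        return self.deliberate(stimulus, options)   # system 2: slow and costly
```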
I think an AI wouldn’t need to have emotions, but if it’s going to be useful, it needs to have drives—to take care of itself, to take care of people (if FAI), to explore not-obviously useful information.
It wouldn’t exactly have a subconscious in the human sense, but I don’t think it can completely keep track of itself—that would take its whole capacity and then some.
What is a good balance between the drives? To analogize a human problem, suppose that an FAI starts out having to fend off capable UFAIs. It’s going to have to do extensive surveillance, which may be too much under other circumstances—a waste of resources. How does it decide how much is too much?
This one isn’t so much about emotional healing, though emotions are part of how people tell how their lives are going. Suppose it makes a large increase in its capacity. How can it tell whether or not it’s made an improvement? Or a mistake? How does it choose what to go back to, when it may have changed its standards for what’s an improvement?
I don’t think it can completely keep track of itself—that would take its whole capacity and then some.
I have a different view of AI (I do not know if it is better or more likely). I would see the AI as a system almost entirely devoted to keeping track of itself. The theory behind a hard takeoff is that we already have pretty much all the resources to do the tasks required for a functional AI; all that is missing is the AI itself. The AI is the entity that organizes and develops the existing resources into a more useful structure. This is not a trivial task, but it is founded on drives and goals. Assuming that we aren’t talking about a paperclip maximizer, the AI must have an active and self-modifying sense of purpose.
Humans got here the messy way—we started out as wiggly blobs wanting various critical things (light/food/sex), and it made sense to be barely better than paperclip maximizers. In the last million years we started developing systems in which maximizing the satisfaction of drives stopped being an effective strategy. We have a lot of problems with mental ecology that probably derive from that.
It’s not obvious what the fundamental drives of an AI would be—it is arguable that ‘fundamental’ just doesn’t mean the same thing to an AI as it does to a biological being… except in the unlucky case that AIs are essentially an advanced form of computer virus, gobbling up all the processor time they can. But it seems that any useful AI—the kind whose mental/emotional healing we care about—would have to be first and foremost a drive/goal-tuning agent, and only after that a resource-management agent.
This almost has to be the case, because AIs that are driven first by output and only second by goal-tuning are going to fall into one of three groups: paperclip maximizers (their mental economy may be complex, but conflict will almost always be solved by the simple question “what makes more paperclips?”); the insane (having multiple conflicting primary drives, each more compelling than the drive to correct the conflict, seems to fall entirely within what we would consider insane, even under particularly strict definitions); or those below the threshold for general AI (although I admit this depends on how pessimistic your view of humans is).
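If it helps, here is a deliberately crude Python sketch of what “goal tuning first, resource management second” could look like as a loop. The drive names, the dampen-on-conflict tuning rule, and the proportional split are all assumptions made for the sketch, not a design I am proposing.

```python
class SelfTuningAgent:
    """Toy loop: adjust drive weights first, then allocate resources under them."""

    def __init__(self, drives):
        # drives: drive name -> weight, e.g. {"self_care": 1.0, "care_for_people": 1.0}
        self.drives = drives

    def tune_drives(self, observations):
        # Primary step: re-weight drives in light of what happened last cycle.
        # Here: dampen any drive whose pursuit produced conflict.
        for drive in observations.get("conflicting_drives", []):
            if drive in self.drives:
                self.drives[drive] *= 0.9

    def allocate_resources(self, total):
        # Secondary step: split resources in proportion to the current weights.
        weight_sum = sum(self.drives.values()) or 1.0
        return {name: total * w / weight_sum for name, w in self.drives.items()}

    def step(self, observations, total_resources):
        self.tune_drives(observations)                    # goal tuning comes first...
        return self.allocate_resources(total_resources)   # ...then resource management
```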
Suppose it makes a large increase in its capacity. How can it tell whether or not it’s made an improvement? Or a mistake?
These are complex decisions, but not particularly damaging ones. I can’t think of any problem in this area that an AI should find inherently unhealthy. Some matters may be hard, or indeterminate, or undetermined, but it is simply a fact about living in the universe that an effective agent will have to have the mental framework for making educated guesses (and sometimes uneducated ones) and for processing the consequences without a mental breakdown.
The simple case would be having an AI predict the outcome of a coin flip without going insane—too little information, a high failure rate, and no improvement over time could drive a person insane if they did not have the mental capacity to understand that this is simply a situation that is not under their control. Any functional AI has to be able to judge when a guess is necessary and to deal with that. Likewise, it has to know its own capacity for processing outcomes, and not break down when faced with an outcome that is not what it wanted, that requires a change in thought processes, or that simply cannot be interpreted with the current information.
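As a toy illustration of “judge when a guess is necessary”: a predictor that checks whether past outcomes show any exploitable signal and, if they don’t, guesses and moves on instead of treating the situation as a failure of its reasoning. The 0.05 threshold and the whole heuristic are arbitrary choices for the sketch.

```python
import random

def predict_flip(history):
    """history: list of past outcomes ("H"/"T")."""
    p_heads = history.count("H") / len(history) if history else 0.5
    signal = abs(p_heads - 0.5)
    if signal < 0.05:
        # No exploitable pattern: recognize that this is not under our control,
        # guess, and move on rather than searching for structure that isn't there.
        return random.choice(["H", "T"]), "guess (no exploitable signal)"
    # Otherwise, bet on the apparent bias.
    return ("H" if p_heads > 0.5 else "T"), "prediction (apparent bias)"
```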
There are certainly examples of hard problems (most of Asimov’s stories about robots involve questions that are hard to resolve under a simple rule system), and his robots do have nervous breakdowns… but you and I would have no trouble giving rules that would prevent a nervous breakdown. In fact, usually the rule is something simple like “if you can’t make a decision that is clearly best, rank the tied options as equal, and choose randomly”. We just don’t want to recommend that rule to beings that have the power to randomly ruin our lives—but that only becomes a problem if we are the ones setting the rules. If the AI has power over its own rule set, the problem disappears.
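That rule is simple enough to write down directly. A minimal Python version, where the `score` function and the `margin` that defines “clearly best” are parameters I am inventing for the sketch:

```python
import random

def choose(options, score, margin=1e-6):
    """Pick the clearly-best option if there is one; otherwise rank the
    top-scoring options as tied and choose among them at random."""
    ranked = sorted(options, key=score, reverse=True)
    best_score = score(ranked[0])
    tied = [opt for opt in ranked if best_score - score(opt) <= margin]
    if len(tied) == 1:
        return tied[0]          # one option is clearly best
    return random.choice(tied)  # tied options are ranked equal; break randomly
```

For example, `choose(["left", "right"], score=lambda o: 0.0)` just flips a coin between equally ranked options.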
To analogize a human problem, suppose that an FAI starts out having to fend off capable UFAIs. It’s going to have to do extensive surveillance, which may be too much under other circumstances—a waste of resources. How does it decide how much is too much?
This is a complex question, but it is also the sort of question that breaks down nicely (a rough sketch of the resulting loop follows the list):
How big a threat is this? (The best guess may be not-so-good, but if the AI cannot handle not-so-good guesses, it will have a massive nervous breakdown early on and will no longer concern us.)
How much of my resources should I devote to a problem that big?
What is the most effective way (or ways) to apply those resources to that problem?
Do that thing.
Move on to the next problem.
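Spelled out as code, that loop might look something like the sketch below, where `assess_threat`, `best_response`, and the allocate-in-proportion-to-threat rule are stand-ins I am assuming for illustration, not a worked-out design.

```python
def handle_problems(problems, total_resources, assess_threat, best_response):
    """Run the breakdown above over a queue of problems.

    assess_threat(p) returns a rough 0-1 severity estimate (a best guess);
    best_response(p, budget) returns the chosen way to spend that budget.
    """
    results = []
    for problem in problems:
        threat = assess_threat(problem)        # 1. how big a threat is this?
        budget = total_resources * threat      # 2. resources in proportion to its size
        plan = best_response(problem, budget)  # 3. most effective way to apply them
        results.append((problem, plan))        # 4. do that thing (here: record the plan)
        # 5. move on to the next problem
    return results
```

A not-so-good guess just shows up as a noisy `assess_threat`; the loop carries on regardless, which is the point.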
As I write this out, I see that a large part of my argument is that AIs that do not have good mental ecology with a foundation of self-monitoring and goal/drive analysis will simply die out or go insane (or go paperclip) rather than become a healthy, interesting, and useful agent. So really, I agree that mental health is critically important, I just think that it is either in place from the start, or we have an unfriendly AI on our hands.
I realize that I may be shifting the goal posts by focusing on general AI. Please shift them back as appropriate.
I don’t think it can completely keep track of itself—that would take its whole capacity and then some.
I have a different view of AI (I do not know if it is better or more likely). I would see the AI as a system almost entirely devoted to keeping track of itself.
You’re probably right about the trend. I’ve heard that lizards do a lot less processing of their sensory information than we do. It’s amusing that people put in so much work through meditation to be more like lizards. This is not to deny that meditation is frequently good for people.
However, an AI using a high proportion of its resources to keep track of itself is not the same thing as it being able to keep complete track of itself.
In re the possibly over-vigilant/overreactive AI: my notion is that its ability to decide how big a threat is will be affected by its early history.
As I write this out, I see that a large part of my argument is that AIs that do not have good mental ecology with a foundation of self-monitoring and goal/drive analysis will simply die out or go insane (or go paperclip) rather than become a healthy, interesting, and useful agent. So really, I agree that mental health is critically important, I just think that it is either in place from the start, or we have an unfriendly AI on our hands.
That’s where I started. We have an evolved capacity to heal. Designing the capacity to heal for a very different sort of system is going to be hard if it’s possible at all.