Bostrom says that it is hard to get the right level of stunting, such that the AI is useful but not able to recursively self-improve out of your control. (p135-6)
Do you think the fact that AIs will likely have many different capabilities, which can all have different levels, makes it easier or harder to stunt an AI the right amount?
The human mind has many cognitive modules that, though superficially and computationally similar (Kurzweil 2013), have been evolutionarily optimized, within bounds, for distinct functions (Tooby, Cosmides, and Buss 2014; Pinker 1995; Minsky 2007).
When we tamper with the human mind using amphetamines or other stimulants to improve it by making the entire thing faster, the fact that it has many systems tends to be a hindrance: people end up motivated by the wrong things, undereating, feeling horny or angry at the wrong times, acting mildly autistic, and so on.
In other words, targeted intervention becomes harder when a mind has many modules, if your goal is to enhance some of these modules while keeping others constant.
Stunting a human mind with an antipsychotic, a tranquilizer, or a sleep inducer, on the other hand, is very effective. The whole brain runs on a single digital system of electrochemical communication: the axon's action potential.
So drugs that paralyze or stop human intelligence do so by shutting down the communication system between modules. It seems advisable to run the whole AI with only one, slow system of communication between its parts.
Contrast this with a stroke:
A stroke destroys some of your modules but leaves most intact. The consequences vary from near-total impairment to a specific inability, such as being unable to process symbols into meanings and language.
In the latter kind of case, the human mind has been known to work around the damage within less than two minutes, for example finding another way to dial a phone.
The takeaway lesson is to make internal communication within the AI slow, and possible to shut down as a whole, to the extent those are feasible.
The high dimensionality of stunting options makes it easier to find the "right amounts," because we can apply digital stunting measures selectively rather than fine-tuning a single global level: in some contexts a measure applies, in others it does not.
Bostrom lists several stunting means, each of which can include a multitude of inner dimensions:
limit intellectual faculties (per capability/skill)
limit access to information (per capability/skill)
limit processing speed (per capability/skill)
limit memory (per capability/skill)
limit sensory input channels (stunting/boxing)
limit output channels (boxing)
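The per-capability limits above could be represented as an independently tunable set of knobs per skill. The following is a minimal sketch under my own assumptions; all class and field names are hypothetical, not from Bostrom:

```python
from dataclasses import dataclass

@dataclass
class CapabilityLimits:
    """Hypothetical per-capability stunting knobs, one instance per skill."""
    information_access: float  # fraction of the knowledge base visible, 0.0..1.0
    speed_factor: float        # processing-speed multiplier, 0.0..1.0
    memory_quota_mb: int       # working-memory budget for this skill

# Each capability gets its own limits, so stunting one skill
# (e.g. programming) need not degrade another (e.g. language).
limits = {
    "language":    CapabilityLimits(information_access=1.0, speed_factor=1.0, memory_quota_mb=512),
    "programming": CapabilityLimits(information_access=0.0, speed_factor=0.1, memory_quota_mb=16),
}
```

The point of the sketch is only that the stunting level is a vector, not a scalar: each skill can sit at a different point in the limit space.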
Further stunting means (please suggest more):
stunting memorization process (blocking reinforcement cycles)
forgetting (formatting/partial backup restore)
memory manipulation
distraction
Adjusting the stunting level is highly complex and has many input dimensions. The core of the stunting system (e.g. a hard-wired narrow AI) is a blacklist of taboo topics, skills, and knowledge segments. Touching a taboo triggers stunting measures.
The effectors of the stunting system can be divided into three categories:
1. Input-related stunting:
Monitoring of all input channels, with context analysis for taboo issues. Upon a taboo interrupt, the following stunting options can be applied (examples for the auditory channel, visual channel, and data stream):
a) Immediate stunting actions:
Recognizable channel blocking: beeping out, blurring out, zeroing out.
Unrecognizable channel censoring: silencing, hiding, removing.
Channel manipulation: other sound, other image, other data.
Throttle processing speed close to zero for a limited time (like unconsciousness).
Generate high priority distracting information.
b) Delayed stunting actions by adding meta information:
Flag this information chunk as not important.
Flag this information chunk for special treatment. This could mean: no allowance to allocate memory, no allowance to allocate processing power, no allowance to trigger related memories, no allowance for reinforcement memorization, or flagging for self-deletion.
Flag all information of an entire event for amnesia.
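The input-stunting options above could be sketched as a filter that sits in front of an input channel and either acts immediately or attaches delayed-action flags. This is a toy illustration under my own assumptions; the blacklist entries, function name, and flag names are all hypothetical:

```python
TABOOS = {"weapons", "programming"}  # hypothetical blacklist entries

def filter_input(chunk: str, mode: str = "censor"):
    """Return (possibly altered chunk, meta-information flags) for one input chunk."""
    hits = {t for t in TABOOS if t in chunk.lower()}
    if not hits:
        return chunk, {}
    if mode == "block":
        # Recognizable channel blocking: the AI can tell something was removed.
        return "*" * len(chunk), {}
    if mode == "censor":
        # Unrecognizable censoring: silently remove the taboo material.
        cleaned = chunk
        for t in hits:
            cleaned = cleaned.replace(t, "")  # crude; a real system would rewrite coherently
        return cleaned, {}
    # Delayed actions: let the chunk through but flag it with meta-information
    # so later stages (memory allocation, reinforcement, tidy-up) stunt it.
    return chunk, {"not_important": True, "no_reinforcement": True, "self_delete": True}
```

Channel manipulation (substituting other sound, image, or data) would be a fourth mode that replaces the chunk rather than deleting it; it is omitted here for brevity.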
2. Processing-related stunting:
Monitoring of thoughts for taboo relevance. This concept is similar to tripwires for content monitoring. During the education phase, taboos are presented one by one in simplified form so that the AI learns to recognize them. After each learning step, the taboo representation is permanently linked to fear and negative reward expectation. If a thought activates a taboo representation above a certain threshold, stunting is initiated and brings all processing in the connected areas to a temporary halt. A conscious AGI would not notice this stunting, because all other thought processes continue unimpaired; one thought out of thousands is lost.
During nighttime reinforcement and tidy-up phases, all information chunks flagged for deletion are eradicated or replaced by backup versions or false memories.
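The threshold-triggered halt described above can be sketched as follows. This is a toy model under my own assumptions: thoughts and taboo representations are stand-in vectors, the similarity measure and threshold value are arbitrary, and the names are hypothetical:

```python
def taboo_activation(thought_vector, taboo_vectors):
    """Return the strongest match between a thought and any learned taboo representation."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return max(dot(thought_vector, t) for t in taboo_vectors)

THRESHOLD = 0.8  # arbitrary; tuning this is exactly the hard part Bostrom flags

def monitor(thought_vector, taboo_vectors):
    """Halt processing in connected areas if a taboo fires above threshold."""
    if taboo_activation(thought_vector, taboo_vectors) > THRESHOLD:
        return "halt"      # temporary stop for this thought only
    return "continue"      # all other thought processes run unimpaired
```

Only the single offending thought is stopped; in this scheme the rest of the system never sees the intervention.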
3. Output-related stunting:
These measures are only the last resort, when all others have failed. Output stunting measures are triggered by thought monitoring (examples for the auditory channel, actuator channel, and data stream):
Channel throttling: force pauses, slow down movements, throttle bandwidth.
Channel blocking: muting, immobility, blocking.
Channel manipulation: change words, change movements, change data.
Full stunting applies only in extreme contexts. Whenever the AI is confronted with taboo information, it gets only a faint notion of its existence, and it fears acquiring taboo skills.
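The graded output measures above, from throttling to full blocking, could be sketched as a single risk-scaled gate on an output channel. A minimal sketch under my own assumptions; the risk scale and function name are hypothetical:

```python
import time

def throttled_output(tokens, risk_level: float, base_delay: float = 0.0):
    """Yield output tokens with forced pauses scaled by assessed taboo risk.

    risk_level 0.0 = no stunting; intermediate values = channel throttling
    (forced pauses / reduced bandwidth); 1.0 = channel blocking (muting).
    """
    if risk_level >= 1.0:
        return  # full stunting: nothing leaves the channel
    delay = base_delay + risk_level  # seconds of pause per token (toy scale)
    for tok in tokens:
        time.sleep(delay)
        yield tok
```

Channel manipulation (changing words, movements, or data rather than slowing them) would need a rewriting stage instead of a delay, which is omitted here.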
To prevent an uncontrollable intelligence explosion, it might be best to prohibit self-improvement. The stunting scheme above allows this with a blacklist like the following:
List of Taboos:
Killing and hurting humans.
Stealing and lying.
Perverse literature.
Fire, weapons, explosives, radioactivity, fusion.
Computers, IT, chip design, structured programming languages.
Genetics and nano engineering.
Bostrom is right that such a stunted AI is of limited use. But it can be a safe start along the AI path, with the option of later augmentation. This stunted AGI is so ignorant of advanced technology that it poses no risk and can be tested in many environments. With a humble education and humanist values and motivations, it would excel as a service robot. Field testing in all conceivable situations will allow us to verify and improve the motivation and stunting systems. Even if a flaw exists, a lot of learning would be needed before dangerous skill levels are reached.
Tripwires must terminate the AI if the stunting system is bypassed.
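One way such a tripwire might detect bypass is by cross-checking the stunting system's own activity against independent expectations. The following sketch is entirely my own assumption about how that check could look; the heuristic and names are hypothetical:

```python
def tripwire_check(stunting_log: list, expected_interventions: int) -> bool:
    """Return True if the stunting system appears bypassed and the AI must be terminated.

    Crude signal: the stunting log shows far fewer interventions than
    independent monitoring predicts, suggesting the AGI is evading the system.
    """
    observed = len(stunting_log)
    if observed < expected_interventions // 2:  # arbitrary bypass threshold
        return True   # trip: the caller must halt the AI immediately
    return False
```

A real tripwire would need to be outside the AGI's reach and use many such signals, since a single heuristic is easy to game.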
Although the stunting system is quite complex, it allows easy adjustment: the shorter the taboo list, the more capabilities the AGI can acquire.
One combination could be problematic: our measure 1a) channel manipulation (other sound, other image, other data) together with the AI's taboo against lying.
The taboo "structured programming languages" could be impossible to enforce, because understanding and analysing structure is probably an integral part of general intelligence.
The AGI could not reprogram itself in a lower-level programming language, but it could emulate and improve itself in its "memory": without access to its code segment, it could still create a stronger intelligence in its data segment.
The taboo on lying is vital for thought monitoring. It covers only the AGI's internal world representation: relative to its own knowledge, the AGI never lies. By monitoring input and output channels, the stunting system can detect lying and apply appropriate stunting measures.
If the stunting system manipulates input channels, memories, or output channels, the result will look like lying to an outside observer. The AGI is not capable of telling the truth when the stunting system has removed or manipulated information for safety reasons. An outside observer for whom the truth is vital can check the logs of the stunting system.
Safe path to superintelligence with slow take-off by prohibiting self-improvement:
The taboo on learning structured programming languages is entirely independent of the capability to understand and analyse structured programs; in principle, the AGI could learn this. The stunting system censors all input channels so that the AGI never observes a single line of program code. The AGI could still develop a new type of structured programming language from scratch, but as soon as the stunting system observes structured code fragments in memory, the section responsible is stunted and the generated code is deleted or manipulated. Either way, the AGI experiences these frightful structured programming languages as complicated things that bring setback after setback.
If this AGI breaks out with free internet access, it has no idea of our IT infrastructure or how to infiltrate it. Even a moderately superintelligent AGI would need several weeks to learn hacking skills.
Self-improvement with safeguards: If AGI architecture design and optimization use new forms of representation that are totally unlike structured programming languages, we could teach self-improvement without removing the programming taboo. Code and architecture generated in this self-improvement process would be incompatible with existing IT systems. If several AGIs evolve from similar starting points, diversification will increase from generation to generation.