Yes, the team I imagine is doing something dangerous.
No, I am not imagining them using only behavioral testing in their design approach. I am trying to imagine them doing the best job they possibly can, both in design and testing, in order to see how much they can reduce the risk.
They are in the early stages. They are working on partial minds, purposefully handicapped minds, individual AGI components (just the vision system, for example), and so on.
Suppose that they know what they are doing is dangerous, and they are using their substantial brilliance to try to design, not just a boxing approach, but a testing method and roll-out mechanism that mitigates the danger.
They have, at minimum, the following means of testing the AI in handicapped mode (a rough sketch of how a few of these could be enforced in software follows the list):
-Run it on slow hardware or clock down the processors
-Allow it to use no more than (for instance) 1 gig of memory
-Allow it to use no more than 1 trillion instructions, after which the computer shuts off
-Expose it to limited data about the world, or no data about the real world
-Provide it little or no information about its operators or the society they live in
-Provide it little or no information about its own hardware or software architecture
-If it shows evidence of having mastered a valuable sub-task (for example, solving a real-world problem that, as far as we know, can only be solved by an attorney, by differential equations, or by advanced persuasion), turn it off.
-Run it only in simulations and virtual realities
-Set tripwires, connected to honeypots and elsewhere
-Build it off-the-grid
-Give it no objective function
-Give it no access to methods to manipulate physical objects
-All of the staff have gone through personality testing and security clearance.
-Very few, if any, of the staff know how to create all of the parts of the software or hardware.
-No real-time communication with a person.
-All communication with people is logged.
…
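As a rough illustration (not anyone's actual setup), here is a minimal sketch of how a few of the items above could be enforced at the operating-system level on a Linux machine, using only the Python standard library. The entry-point name `candidate_ai.py`, the specific limits, and the log path are assumptions made up for the example.

```python
# Minimal sketch, assuming a Linux host and a hypothetical entry point
# candidate_ai.py.  Limits and paths are illustrative, not recommendations.
import resource
import subprocess

MEMORY_LIMIT_BYTES = 1 * 1024 ** 3   # "no more than 1 gig of memory"
CPU_SECONDS = 60                     # crude stand-in for an instruction budget
WALL_CLOCK_SECONDS = 300             # hard cutoff, after which the run ends
LOG_PATH = "session.log"             # "all communication with people is logged"


def apply_limits():
    """Runs in the child process just before the candidate code starts."""
    resource.setrlimit(resource.RLIMIT_AS, (MEMORY_LIMIT_BYTES, MEMORY_LIMIT_BYTES))
    resource.setrlimit(resource.RLIMIT_CPU, (CPU_SECONDS, CPU_SECONDS))


def run_candidate():
    with open(LOG_PATH, "a") as log:
        try:
            # No stdin, so there is no real-time conversation with an operator;
            # everything the process prints goes to the append-only log.
            completed = subprocess.run(
                ["python3", "candidate_ai.py"],
                preexec_fn=apply_limits,
                stdin=subprocess.DEVNULL,
                stdout=log,
                stderr=log,
                timeout=WALL_CLOCK_SECONDS,
            )
            return completed.returncode
        except subprocess.TimeoutExpired:
            return None  # budget exhausted: the run simply ends


if __name__ == "__main__":
    print("candidate exited with:", run_candidate())
```

Counting actual instructions, as opposed to CPU seconds, would require hardware performance counters or an emulator; the CPU-time limit here is only a crude proxy for that item on the list.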
The team seems to have a lot of tools to continue their work with limited risk. If they deploy all of these, and more, to test AI systems which are not AGI yet, but only possible components, then we start to have a hard time convincing them not to continue their work, even if we do not trust them to use the technology well in the end.
When I read “Oracle AI,” I realized one possibility which is probably still overlooked.
“Dreams”
Erasing data on physical media is not perfect. A zero bit that was previously nonzero may read back slightly more slowly or more quickly. That could allow a superintelligent AI to see shadows of past data, which could lead to a phenomenon similar to dreams.
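To make the remanence worry concrete: suppose, hypothetically, that the AI's scratch space is an ordinary file. Simply deleting or zeroing it does not guarantee the previous physical state is unrecoverable; a common mitigation is to overwrite the same bytes several times before releasing the medium. A minimal sketch, with a made-up file name and pass count:

```python
# Minimal sketch of multi-pass overwriting before deletion.  The file name
# and pass count are assumptions for illustration only.
import os

SCRATCH_FILE = "scratch.bin"
OVERWRITE_PASSES = 3


def scrub(path: str, passes: int = OVERWRITE_PASSES) -> None:
    """Overwrite a file in place with random bytes, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # push each pass out to the device
    os.remove(path)
```

Even this is only a partial answer: on SSDs with wear levelling the overwrites may land on different physical cells than the originals, so full-disk encryption or physical destruction of the medium is the more reliable policy.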