Test harnesses might turn out to be very useful, but building them isn't a trivial task, and I don't think the development and use of such harnesses can be taken for granted. It's not just that the AI must be safely contained; the harness also has to let it interact with the outside world in a way that can't be dangerous, yet is still informative enough to decide whether it's friendly. This seems hard.
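For concreteness, here's roughly the shape I have in mind, as a sketch only (the `Agent` interface and the string-typed observation/action channel are invented for illustration): the agent sees nothing but canned observations, and its proposed actions are recorded for review rather than executed.

```python
# A minimal sketch of the containment idea, not a real safety mechanism.
# Agent and the string-typed I/O channel are hypothetical placeholders.
from dataclasses import dataclass, field
from typing import Protocol


class Agent(Protocol):
    def act(self, observation: str) -> str: ...


@dataclass
class ContainedHarness:
    """Mediates all agent I/O: observations come from a fixed script,
    and proposed actions are logged for human review, never executed."""
    transcript: list[tuple[str, str]] = field(default_factory=list)

    def run_episode(self, agent: Agent, observations: list[str]) -> None:
        for obs in observations:
            action = agent.act(obs)                # agent proposes an action
            self.transcript.append((obs, action))  # record it; do not execute
```

The difficulty is exactly what this sketch exposes: a transcript like this may be too impoverished to judge friendliness, while anything richer starts to look like a real channel to the outside world.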
The original subject of disagreement was "is AI failure the rule or the exception?" This isn't a precisely specified question, but you seemed to be arguing that the "most minds are unfriendly" argument is unimportant because it is either irrelevant or universally understood and accounted for. I think it is not universally understood among those who might design an AI, and that failing to understand it would also mean the AI never gets placed in a suitably secure test harness.
Restore it to factory settings between applications of the test suite.
Not remembering what your actions were should make it “pretty tricky” to link those actions to their consequences.
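In code, the reset might look something like this (a sketch; `initial_agent` and the test-case callables are placeholders, and it assumes the agent's entire state is copyable):

```python
import copy

def run_test_suite(initial_agent, test_cases):
    """Run each test case against a fresh copy restored from the initial
    state, so nothing learned in one run carries over to the next."""
    results = []
    for case in test_cases:
        agent = copy.deepcopy(initial_agent)  # restore "factory settings"
        results.append(case(agent))           # no memory of prior runs
    return results
```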
Making the prisons is the more challenging part of the problem—IMO.