If we were certain a uFAI were going online in a matter of days, it would be everyone’s responsibility to stop it by any means possible. Imminent threat to humanity and all that.
However, the probability that it will ever get to that point is very low, and talking about or endorsing (hypothetical) unethical activity imposes social costs in the meantime. So it’s a net negative to discuss it.
What specifically do you consider low probability? That a uFAI will ever be launched, or that there will be an advance, high-credibility warning?
I’d argue the latter. It’s hard to imagine how you could know in advance that a uFAI has a high chance of working, rather than being one of thousands of ambitious AGI projects that simply fail.
(Douglas Lenat comes to you, saying that he’s finished a powerful fully general self-modifying AI program called Eurisko, which has done very impressive things in its early trials, so he’s about to run it on some real-world problems on a supercomputer with Internet access; and by the way, he’ll be alone all tomorrow fiddling with it, would you like to come over...)
Sorry, I was imprecise. I consider it likely that we’ll eventually be able to make uFAI, but unlikely that any particular project will make one. Moreover, we probably won’t get appreciable warning of uFAI, because if researchers knew they were making a uFAI, they wouldn’t make one.
Thus, we have to adopt a general strategy that can’t target any specific research group. Sabotage does not scale well; it would only drive research underground while imposing social costs on us in the meantime. The best bet, then, is to promote awareness of uFAI risks and try to have friendliness theory completed by the time the first AGI goes online. Not surprisingly, this seems to be what SIAI is already doing. Discussion of sabotage just harms that strategy.