Well, even if you had figured out how to encode ‘solve FAI’, isn’t there still scope for things going wrong?
Sure, in the sense that an alien UFAI could still arrive the next day and wipe us out, or a large asteroid, or any other low-probability catastrophe. Or the FAI could just honestly fail at its goal and produce a UFAI by accident.
There is always scope for things going wrong. However, encoding ‘solve FAI’ turns out to be essentially the same problem as encoding ‘FAI’, because ‘FAI’ isn’t a fixed thing; it’s a complex dynamic. More specifically, an FAI is an AI that creates improved successor versions of itself, so it already has ‘solve FAI’ as part of its description.
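To make the recursive structure concrete, here is a minimal sketch in Python. Everything in it is a hypothetical toy (ToyAgent, is_friendly, design_successor are placeholders, not a real proposal); the only point is structural: the definition of an FAI refers to producing improved successors that are themselves FAIs, so ‘solve FAI’ is already inside the definition of ‘FAI’.

```python
# Purely illustrative toy model, not a real FAI design.
class ToyAgent:
    def __init__(self, capability: int, friendly: bool):
        self.capability = capability
        self.friendly = friendly

    def is_friendly(self) -> bool:
        return self.friendly

    def design_successor(self) -> "ToyAgent":
        # The 'solve FAI' step: build a more capable agent that is
        # (in this toy model, by assumption) still friendly.
        return ToyAgent(self.capability + 1, self.friendly)


def is_fai(agent: ToyAgent, horizon: int = 3) -> bool:
    """An agent only counts as an FAI if it is friendly now AND its
    successors remain FAIs. Checked to a finite horizon here so the
    sketch terminates; the real criterion is open-ended."""
    if horizon == 0:
        return agent.is_friendly()
    return agent.is_friendly() and is_fai(agent.design_successor(), horizon - 1)


print(is_fai(ToyAgent(capability=1, friendly=True)))   # True
print(is_fai(ToyAgent(capability=1, friendly=False)))  # False
```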
Also, as we go through the iterations of the AI, we’d get better and better ideas about how to solve FAI, either ourselves or via the next AI, because we can see where the AI has gone wrong, and how, and prevent that in the next iteration.
Yes: with near certainty, the road to complex AI involves iterative, evolutionary development, like any other engineering field. MIRI seems to want to solve the whole safety issue in pure theory first. Meanwhile, the field of machine learning is advancing rather quickly toward AGI, and in that field progress is driven more by experimental research than by pure theory, as there is only so much one can do with math on paper.
The risk stems from a few considerations: once we have AGI, superintelligence could follow very shortly thereafter, and thus the first AGI to scale to superintelligence could potentially take over the world and prevent any further experimentation with other designs.
Your particular proposal involves constraints on the intelligence of the AGI, a class of techniques discussed in detail in Bostrom’s Superintelligence. The danger there is that any such constraints increase the likelihood that some other, less safe competitor will reach superintelligence first. It would be better to have a design that is intrinsically benevolent/safe and doesn’t need such constraints, if such a thing is possible. The tradeoffs are rather complex.
Alright, what I got from your post is that if you know the definition of an FAI and can instruct a computer to design one, you’ve basically already made one. That is, having the precise definition of the thing massively reduces the difficulty of creating it; for example, when people ask ‘do we have free will?’, defining free will greatly reduces the complexity of the problem. Is that correct?
Alright, what I got from your post is that if you know the definition of an FAI and can instruct a computer to design one, you’ve basically already made one.
Yes. Although to be clear, the most likely path probably involves a very indirect form of specification based on learning from humans.
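A rough way to see why the definition does most of the work: if you could actually write down a checkable predicate for ‘is this design an FAI?’, then in principle building one reduces to searching design space for something the predicate accepts. A minimal sketch, with entirely hypothetical placeholder functions (satisfies_fai_definition and the candidate designs stand in for the unsolved hard parts):

```python
from typing import Callable, Iterable, Optional

def search_for_fai(designs: Iterable[str],
                   satisfies_fai_definition: Callable[[str], bool]) -> Optional[str]:
    """Once the definition is encodable as a predicate, 'build an FAI'
    becomes 'search for a design the definition accepts'."""
    for design in designs:
        if satisfies_fai_definition(design):  # the encoded definition does all the work
            return design
    return None

# Toy usage: the 'definition' here is a trivial stand-in predicate.
candidate_designs = ["design-A", "design-B", "design-C"]
print(search_for_fai(candidate_designs, lambda d: d == "design-C"))  # design-C
```

This is why encoding the definition and encoding the thing itself end up being nearly the same problem: all the difficulty lives inside the predicate.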
Ok. So why could you not replace ‘encode an FAI’ with ‘define an FAI’? And you would place all the restrictions I detailed on that AI. Or is there still a problem?