No, you are assuming that we can build an AI that optimises one specific thing, namely “interpret all directives according to your maker’s intentions”. I’m assuming that we can build an AI that can optimise something, which is a very different claim.
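To make that distinction concrete, here is a toy sketch (an editor's illustration, not from the discussion; the hill-climber and both objective functions are hypothetical): a generic optimiser is indifferent to which objective it is handed, so being able to optimise *something* competently says nothing about whether that something matches the maker’s intentions.

```python
# Toy sketch: a generic optimiser that "can optimise something".
# Both objective functions are hypothetical stand-ins; the point is that the
# optimiser works equally well on either, and cannot tell which one the maker meant.
import random

def hill_climb(objective, x0=0.0, steps=1000, step_size=0.1):
    """Maximise an arbitrary scalar objective by random local search."""
    x, best = x0, objective(x0)
    for _ in range(steps):
        candidate = x + random.uniform(-step_size, step_size)
        value = objective(candidate)
        if value > best:          # keep only improving moves
            x, best = candidate, value
    return x, best

# What the maker intended the system to maximise (hypothetical).
intended_objective = lambda x: -(x - 2.0) ** 2
# A proxy the system actually ends up maximising (hypothetical).
proxy_objective = lambda x: -(x - 7.0) ** 2

print(hill_climb(intended_objective))  # converges near x = 2
print(hill_climb(proxy_objective))     # "works" just as well, but near x = 7
```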
An AI that can self-improve considerably already interprets a vast number of directives according to its maker’s intentions, since self-improvement is itself an intentional feature.
Being able to predict a program’s behavior is a prerequisite if you want the program to work well, since unpredictable behavior tends to be chaotic and detrimental to overall performance. In other words, if you have an AI that does not work according to its maker’s intentions, then you have an AI that does not work at all, or one that is not very powerful.
Gödel machines already specify self-improvement in formal mathematical form. If you can specify human morality in a similarly formal manner, I’ll be a lot more relaxed.
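For reference, here is a rough paraphrase of what that formal specification amounts to (simplified notation loosely following Schmidhuber’s Gödel machine papers, not a verbatim statement): the machine executes a proposed rewrite of its own code only after its proof searcher has proven that doing so increases expected utility.

```latex
% Rough paraphrase of the Goedel machine's self-rewrite criterion
% (after Schmidhuber; notation simplified, not the exact statement).
\[
  % Expected cumulative reward of being in state s within environment Env:
  u(s, \mathit{Env}) \;=\; E_{\mu}\Bigl[\,\sum_{\tau=\mathrm{now}}^{T} r(\tau)
    \;\big|\; s, \mathit{Env}\Bigr]
\]
\[
  % "Target theorem": the proposed rewrite (switchbit set to 1) is executed
  % only once a proof shows it beats keeping the current code (switchbit 0):
  u\bigl(s(t_1) \oplus (\mathit{switchbit}=1), \mathit{Env}(t_1)\bigr)
  \;>\;
  u\bigl(s(t_1) \oplus (\mathit{switchbit}=0), \mathit{Env}(t_1)\bigr)
\]
```

Specifying morality “in a similar formal manner” would mean writing down such a utility u that captures human values, which is exactly the part that is missing.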
Also, I don’t assume self-improvement. Some models of powerful intelligence don’t require it.