Well, it’s logically impossible for the last item in your post to be true for any AI. Specific AIs only : )
I don’t see how.
I am not saying that a thermostat is going to do anything other than what it has been designed for. But an AI is very likely going to be designed to exhibit user-friendliness. That doesn’t mean that one can’t design an AI that won’t. The default outcome, though, seems to be that an AI is not just going to act according to its utility function but also according to more basic drives, i.e. the drive to act intelligently.
One implicit outcome of AGI might be recursive self-improvement. And I don’t think it is logically impossible that this includes an improvement to its goals as well, if it wasn’t specifically designed to have a stable utility function.
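Purely as an illustrative sketch of that distinction (the names ToyAgent, goal_is_frozen and self_improve are hypothetical, not anyone’s proposed design): whether a self-modifying agent has a “stable utility function” can come down to whether its goal representation is excluded from the parts it is allowed to rewrite.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class ToyAgent:
    # The goal is stored as ordinary, rewritable data ...
    goal: Callable[[str], float]
    # ... alongside the heuristics the agent may rewrite during self-improvement.
    heuristics: List[Callable[[str], str]] = field(default_factory=list)
    # "Stable utility function" as an explicit design choice, not a logical necessity.
    goal_is_frozen: bool = False

    def self_improve(self, proposed_goal, proposed_heuristics):
        """Apply a self-modification proposal."""
        self.heuristics = proposed_heuristics
        if not self.goal_is_frozen:
            # Nothing in the architecture forbids this step; only the designer's
            # choice (goal_is_frozen=True) rules it out.
            self.goal = proposed_goal

# An agent built with goal_is_frozen=True keeps its original goal across every
# self-modification; one built without that flag may end up optimizing something
# other than what it started with.
stable = ToyAgent(goal=len, goal_is_frozen=True)
drifting = ToyAgent(goal=len)
new_goal = lambda s: -len(s)
stable.self_improve(new_goal, [])
drifting.self_improve(new_goal, [])
assert stable.goal is len and drifting.goal is new_goal
```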
What would constitute an improvement to its goals? I think the context in which its goals were meant to be interpreted is important. And that context is human volition.
You would have to assume a specific AGI design to call this logically impossible. And I don’t see why your specific AGI design would be the first AGI in every possible world.
Any human who runs a business realizes that a contract with their customers includes unspoken, implicit parameters. Respecting those implied values of their customers is not a result of their shared evolutionary history but of the intelligence that allows them to realize that the goal of their business implicitly includes those values.