And I believe that there is a small chance that the first AGI will respect human values. In other words, friendliness might turn out to be much easier than expected and be implied by some AGI designs even if it is not explicitly defined.
This might, for example, be the case if the first AGI is the outcome of some sort of evolutionary process in which it competed with a vast number of other AGI designs and thereby evolved some sort of altruism, which in turn caused it to have some limited compassion for humans and to provide us with a share of the universe.
I am just saying that this isn’t logically impossible.
There will probably be strong selection pressure from humans for safe machines that can act as nannies, assistants, etc.
Our relationship with machines looks set to start out on the right foot, mostly. Of course there will probably be some who lose their jobs and fail to keep up along the way.
Humans won’t get “a share of the universe”, though. Our pitch should be for our bodies to survive in the history simulations and for our minds to get uploaded.
Well, it’s logically impossible for the last item in your post to be true for any AI. Specific AIs only : )
I don’t see how.
I am not saying that a thermostat is going to do anything other than what it has been designed for. But an AI is very likely going to be designed to exhibit user-friendliness. That doesn’t mean that one couldn’t design an AI that won’t. But the default outcome seems to be that an AI will not just act according to its utility-function but also according to more basic drives, i.e. acting intelligently.
One implicit outcome of AGI might be recursive self-improvement. And I don’t think that it is logically impossible that this could include an improvement to its goals as well, if it wasn’t specifically designed to have a stable utility-function.
What would constitute an improvement to its goals? I think the context in which its goals were meant to be interpreted is important. And that context is human volition.
You would have to assume a specific AGI design to call this logically impossible. And I don’t see why your specific AGI design would be the first AGI in every possible world.
Any human who runs a business realizes that a contract with their customers includes unspoken, implicit parameters. Respecting those implied values of their customers is not a result of their shared evolutionary history but of the intelligence that allows them to realize that the goal of their business implicitly includes those values.