What I meant is this: the argument says you have to make the AI care about humans so that it does not harm them. Yet it is assumed to do a great deal without being made to care about it, e.g. creating paperclips or self-improving. My question is why people believe you don't have to make it care in order to do those things, but you do have to make it care in order not to harm humans. It is clear that if it only cares about one thing, pursuing that one thing could harm humans. But why would it pursue that one thing to an extent that is either never defined or that it was never deliberately made to care about? The assumption seems to be that AIs will do something, anything, rather than remain passive. Why isn't the combination of limited behavior, failure, and passivity more likely than harm to humans, whether as a result of the AI's own goals or as a result of its following every goal except the one that would limit its scope?