You might argue that each individual service must be dangerous, since it is superintelligent at its particular task. However, since the service is optimizing for some bounded task, it is not going to run a long-term planning process [...]
Does this assume that we’ll be able to build generally intelligent systems (e.g. the service-creating-service) that optimize for a bounded task?
Depends what you mean by “generally intelligent”. Any individual service could certainly have deep and broad knowledge about the world (as with e.g. a language translation service), but no service will be able to do all tasks (e.g. the service-creating-service is not going to be able to edit genomes, except by creating a new service that learns how to edit genomes).
With that caveat, yes, this assumes that we’ll be able to build services that optimize for bounded tasks. But this is meant more as a description of how existing AI systems already work. Current RL agents are best modeled as maximizing the reward obtained in the current episode. (This isn’t exactly right, because the value function is trying to capture the reward that can be obtained in the future, but in practice this doesn’t make much of a difference.)
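To make the "current episode" point concrete, here is a minimal sketch of how a return-to-go target is typically computed in an episodic setup. The function name `episodic_returns` and the flat reward/done buffer layout are my own assumptions for illustration, not taken from any particular RL library or system. The thing to notice is that the running sum is reset at every episode boundary, so nothing that happens in a later episode can show up in the target for an earlier one.

```python
from typing import List


def episodic_returns(
    rewards: List[float], dones: List[bool], gamma: float = 0.99
) -> List[float]:
    """Discounted return-to-go for each step, reset at each episode boundary.

    `dones[t]` is True when step t is the last step of its episode.
    (Hypothetical helper; the buffer layout is an assumption.)
    """
    if len(rewards) != len(dones):
        raise ValueError("rewards and dones must have the same length")

    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # Terminal step: reward from the following episode is not counted.
            running = 0.0
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Two episodes stored back to back: steps 0-2 and steps 3-4.
    rewards = [1.0, 1.0, 1.0, 100.0, 100.0]
    dones = [False, False, True, False, True]
    print(episodic_returns(rewards, dones, gamma=0.5))
```

Under these assumptions, changing the rewards in the second episode leaves the targets for the first episode untouched, which is the sense in which the optimization target is bounded to the current episode. The value function still looks ahead, but only to the end of that episode.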