I consider it important to further clarify the notion of a bounded utility function.
A deployed neural network has a utility function that can be described as: output a description of the patterns it sees in its most recent input, according to whatever algorithm it has been trained to apply. It’s pretty clear to any expert that the neural network doesn’t care about anything beyond the specific set of numbers it outputs.
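To make this concrete, here’s a minimal sketch of what I mean (toy numpy code; the weights are placeholders standing in for whatever a real checkpoint would contain):

```python
import numpy as np

# Placeholder weights standing in for a trained checkpoint.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 4)), np.zeros(16)
W2, b2 = rng.normal(size=(3, 16)), np.zeros(3)

def deployed_network(x):
    """Map the most recent input to a fixed set of output numbers.

    This is the whole of what the deployed system 'cares about':
    applying its trained transformation and emitting these numbers.
    """
    h = np.maximum(0.0, W1 @ x + b1)     # fixed, trained feature extraction
    logits = W2 @ h + b2
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()               # the pattern description it outputs

# Stateless: each call depends only on its current input, with no memory
# of past calls and no representation of the world beyond its output.
print(deployed_network(np.array([0.1, -0.3, 0.7, 0.2])))
```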
A neural network that is in the process of being trained is slightly harder to analyze, but essentially the same. It cares about generating an algorithm that will be used in a deployed neural network. At any one training step, it is focused solely on applying a fixed algorithm to produce an incremental improvement to that deployable algorithm. It has no concept that would lead it to look beyond that immediate task.
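The same point as an equally toy sketch: a hand-written SGD step on a linear model, standing in for real training code:

```python
import numpy as np

def training_step(params, batch_x, batch_y, lr=0.01):
    """Apply a fixed update rule once: one incremental improvement.

    The step computes the loss on the current batch and follows its
    gradient. Nothing in the computation refers to anything beyond the
    batch and the parameters being improved.
    """
    W, b = params
    preds = batch_x @ W + b               # forward pass (linear model for brevity)
    error = preds - batch_y
    n = len(batch_x)
    loss = (error ** 2).sum() / n         # per-example squared error
    grad_W = 2.0 * batch_x.T @ error / n  # analytic gradients of the loss
    grad_b = 2.0 * error.sum(axis=0) / n
    return (W - lr * grad_W, b - lr * grad_b), loss

# 'Training' is just this step applied repeatedly; the loop has no concept
# of anything outside the parameters it is improving.
rng = np.random.default_rng(0)
params = (rng.normal(size=(4, 1)), np.zeros(1))
X, Y = rng.normal(size=(32, 4)), rng.normal(size=(32, 1))
for _ in range(200):
    params, loss = training_step(params, X, Y)
print(loss)
```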
And in some important sense, those steps are the main way in which AI gets used to produce cars with superhuman driving ability, and the designers can prove (at least to themselves) that the cars won’t go out and buy more processing power or forage for more energy.
Many forms of AI will be more complex than neural networks (e.g. a mix of reinforcement learning and neural networks), and I don’t have the expertise to extend this analysis to those systems. I’m confident that it’s possible in principle to get general-purpose superhuman AIs using only this kind of bounded utility function, but I’m uncertain how practical that is compared to a more unified agent with a broader utility function.
To clarify, when you say “bounded utility function” you mean that it’s only defined over a fixed set of inputs, right?
(As opposed to meaning that the output of the function is never infinite, as in this post, which is what I first think of when I hear “bounded utility function”. In other words, I expected bounded utility to refer to the range of the function, but you seem to be referring to the domain. Not sure which is more standard, but thought it worth calling out for other readers who may be confused.)
It sounds like he’s talking about services. From the post:
I’m not talking about the range. Domain seems possibly right, but not as informative as I’d like. I’m talking about what parts of spacetime it cares about, and saying that it only cares about specific outputs of a specific process. Drexler refers to this as “bounded scope and duration”. Note that this will normally be an implicit utility function that we infer from our understanding of the system.
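Roughly, one way to write down the distinction (my formalization, not Drexler’s): a range-bounded utility function satisfies

\[
\exists M \;\; \forall x \in X : \; |U(x)| \le M,
\]

whereas the property I’m pointing at is that the domain itself is narrow:

\[
U : Y \to \mathbb{R},
\]

where \(Y\) is only the set of possible outputs of one specific process, not the set of world states or histories.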
“bounded utility function” is definitely not an ideal way of referring to this.