What the hell does SIAI mean by ‘utility function’ anyway? (math please)
Inside agents and tools as currently implemented, there is a solver that works on a function and finds input values to that function which result in the maximum (or, more usually, the minimum) of that function (note that the output may be binary).
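For concreteness, here is a minimal sketch of what such a solver looks like in practice. The objective function is a made-up stand-in, and the choice of scipy.optimize is just one example of a numerical solver, not anything SIAI-specific:

```python
from scipy.optimize import minimize

# Hypothetical objective: the "function" the solver works on.
# It maps candidate inputs (e.g. action parameters) to a score;
# the solver searches for inputs that minimise it.
def objective(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2

# The solver reports the input values it found, nothing more.
result = minimize(objective, x0=[0.0, 0.0])
print(result.x)  # approximately [3.0, -1.0]
```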
[To clarify: that function can include both a model of the world and an evaluation of the ‘desirability’ of properties of a state of that model. In software development, if you have f(g(x)) (where g is the world predictor and f is the desirability evaluator) and g’s output is only ever used by f, this is a target for optimization: create fg(x), which is more accurate in the time given but no longer consists of neatly separable parts. Furthermore, if f’s output is only ever fed to comparison operators, that is another optimization target: create cmp_fg(), which compares the actions directly, perhaps by calculating only the difference between the worlds caused by particular actions, which allows most of the processing to be culled.]
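A rough sketch of that refactoring, with all names and numbers purely illustrative (none of this comes from any real codebase):

```python
# Separable form: g predicts the resulting world-state, f scores it.
def g(action):
    """Hypothetical world model: predict the state an action leads to."""
    return {"resource": 10 - action, "risk": action * 0.1}

def f(state):
    """Hypothetical desirability evaluation of a predicted state."""
    return state["resource"] - 5.0 * state["risk"]

# Fused form: since g's output is only ever consumed by f, the two can be
# collapsed into one function, trading separability for accuracy per unit time.
def fg(action):
    return (10 - action) - 5.0 * (action * 0.1)

# Comparison form: since fg's output is only ever fed to comparisons, one can
# go further and compare two actions directly, computing only the *difference*
# their effects make and culling everything that cancels out.
def cmp_fg(action_a, action_b):
    # For this toy model the difference reduces to 1.5 * (action_b - action_a).
    return 1.5 * (action_b - action_a)  # > 0 means action_a scores higher

best = max([0.0, 1.0, 2.0], key=fg)  # same ordering as f(g(x)), less work
```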
The solver, however, is entirely indifferent to actually maximizing anything. It does not even try to maximize some internal variable (it will gladly try inputs that result in a small output value; it is just usually written not to report those inputs).
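To illustrate that indifference, a hedged sketch of a brute-force solver: it evaluates every candidate, poor ones included, and the only sense in which it ‘maximizes’ is that it happens to be written to report the best-scoring input rather than the others:

```python
def solve(objective, candidates):
    """Evaluate every candidate input; return the one with the highest score.

    Nothing here 'wants' a large score: low-scoring candidates are evaluated
    just as readily, they are simply not the ones reported at the end.
    """
    best_input, best_score = None, float("-inf")
    for x in candidates:
        score = objective(x)  # happily computes small values too
        if score > best_score:
            best_input, best_score = x, score
    return best_input

print(solve(lambda x: -(x - 3.0) ** 2, [0.0, 1.0, 2.0, 3.0, 4.0]))  # -> 3.0
```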
I think the confusion arises from defining the agent in English-language concepts, as opposed to how an AI developer would behave: define things in some logical, down-to-the-elements way, and only then try to communicate them in English. The English command ‘bring me the best answer!’ does tell you to go ahead and convert the universe to computronium to answer it (if you interpret it in a science-fiction-robot-minded way). Commands in programming languages, not really. I don’t think English actually specifies that either; we can just interpret it charitably enough if we feel like it (starting from some other purpose, such as ‘be nice’).
edit: I feel that a lot of the difficulties of making ‘safe AGI’, those that are not outright nonsensical, are just repackaged special cases of statements about the general difficulty of making any AGI, safe or not. That is a very nasty thing to do, to generate such special cases preferentially. edit: Also, some may be special cases of the lack (or impossibility) of a solution to symbol grounding.