"Agent" of course means more than one thing, e.g.:

1. Active versus passive: basically, acting unprompted, if we are talking about software.
2. Acting on another's behalf, as in principal-agent.
3. Having a utility function of its own (or some sort of goals of its own) and optimising (or satisficing) it (see the sketch after this list).
4. Something that depends on free will, consciousness, selfhood, etc.
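To make the sense-3 distinction concrete, here is a minimal sketch (the function names are mine, purely illustrative): an optimiser hunts for the best-scoring option, while a satisficer takes the first one that clears a threshold.

```python
def optimise(utility, options):
    """Return the option with the highest utility: a maximiser."""
    return max(options, key=utility)

def satisfice(utility, options, threshold):
    """Return the first option whose utility is 'good enough'."""
    for option in options:
        if utility(option) >= threshold:
            return option
    return None  # nothing cleared the bar

u = lambda x: 10 - abs(x - 7)      # toy utility, peaking at x = 7
print(optimise(u, range(10)))      # 7: the global best
print(satisfice(u, range(10), 6))  # 3: the first value scoring >= 6
```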
Gwern's claim that it's advantageous for tools to be agents is clearly false in sense 1. Most of the software in the world is passive: spreadsheets, word processors and so on sit there doing nothing when you fire them up. The market doesn't demand agentive (sense 1) versions of spreadsheets and word processors, and they haven't been outcompeted by agentive versions. They are tools that want to remain tools.
There are software agents in senses 1 and 2, such as automated trading software. Trading software is agentive in the principal-agent sense, i.e. it's intended to make money for its creators, the principal. They don't want it to have too much agency, because it might start losing them money, or breaking the law, or making money for someone else... its creators don't want it to have a will of its own, they want it to optimise their own utility function.
So that's another sense in which "the more agency, the better" is false. (Incidentally, it also means that control and capability aren't orthogonal... capability of a kind worth wanting needs to be somewhat controlled, and the easiest way to control it is to keep capability minimal.)
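To make that concrete, here is a toy sketch (every name and number is hypothetical, not any real trading system) of an agent in senses 1 and 2: it acts unprompted on the principal's behalf, but hard guardrails cap its agency.

```python
MAX_POSITION = 100        # principal-imposed capability limit, in shares
MAX_DAILY_LOSS = 5_000.0  # halt trading rather than improvise

def decide_order(signal: float, position: int, daily_pnl: float) -> int:
    """Return a quantity to trade (positive = buy, negative = sell)."""
    if daily_pnl <= -MAX_DAILY_LOSS:
        return 0  # control via minimal capability: stop, don't adapt
    desired = 10 if signal > 0 else (-10 if signal < 0 else 0)
    if abs(position + desired) > MAX_POSITION:
        return 0  # never exceed the principal's position limit
    return desired
```

Nothing in the sketch is the agent's own goal: the objective and the limits both belong to the principal.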
Optimisation is definable for an agent that does not have its own UF: it's optimising the principal's UF as well as it can, and as well as the principal/creator can communicate it. That's not scary... if it's optimising your UF, it's aligned with you, and if it isn't, that's an ordinary problem of expressing your business goals in an algorithm. But achieving that is your problem: a type 2 agent does not have its own UF, so you are communicating your UF to it by writing an algorithm, or something like that.
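Here is a minimal sketch of that, assuming a toy random-search optimiser (the names are mine): the utility function is an argument the principal passes in, not a property of the agent, so the algorithm itself is the channel by which the principal communicates their UF.

```python
import random

def optimise(principal_utility, candidates, steps=1000):
    """A 'type 2' agent: no UF of its own, it just searches for
    whatever scores highest under the UF the principal supplied."""
    best = random.choice(candidates)
    for _ in range(steps):
        trial = random.choice(candidates)
        if principal_utility(trial) > principal_utility(best):
            best = trial
    return best

# The principal expresses their goal as code; a misalignment here is an
# ordinary bug in how the UF was written down, not a will of its own.
print(optimise(lambda x: -(x - 3) ** 2, candidates=list(range(10))))
# prints 3, the maximiser of the principal's UF
```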
Giving an agent its own UF does not necessarily make it better at optimising your UF... which is to say, it does not make it more optimised in any sense you care about. So there is no strong motivation towards it.
An agent with its own UF is more sophisticated and powerful in some senses, but there are multiple definitions of "power" and multiple interests here... it's not all one ladder that everyone is trying to climb as fast as possible.
Beware the slippery slope from "does something" to "is an agent" to "has a UF" to "is an optimiser" to "is a threat" to "is an existential threat"!