I just think what you’re measuring is very different from what people usually mean by “utility maximization”. I like how this X comment puts it:
it doesn’t seem like turning preference distributions into random utility models has much to do with what people usually mean when they talk about utility maximization, even if you can on average represent it with a utility function.
So, in other words: I don’t think claims about utility maximization based on multiple-choice (MC) questions can be justified. See also Olli’s comment.
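To make concrete what the quoted point refers to, here’s a minimal sketch (my own toy illustration, not the paper’s exact procedure) of how aggregated MC answers can be turned into a “random utility model”: fit a per-outcome utility so that pairwise choice probabilities follow a Bradley-Terry-style model. All names and numbers below are made up.

```python
# Toy sketch: fit utilities u_i to pairwise choice frequencies so that
# P(i chosen over j) = sigmoid(u_i - u_j). Not the paper's exact method.

import numpy as np

def fit_utilities(pair_counts, n_items, lr=0.1, steps=2000):
    """pair_counts: dict {(i, j): (times i chosen over j, total i-vs-j trials)}."""
    u = np.zeros(n_items)
    for _ in range(steps):
        grad = np.zeros(n_items)
        for (i, j), (wins_i, total) in pair_counts.items():
            p = 1.0 / (1.0 + np.exp(-(u[i] - u[j])))  # model P(i > j)
            g = wins_i - total * p  # gradient of the binomial log-likelihood
            grad[i] += g
            grad[j] -= g
        u += lr * grad / max(1, len(pair_counts))
        u -= u.mean()  # utilities are only identified up to an additive constant
    return u

# Toy preference distribution over 3 outcomes, e.g. aggregated MC answers.
counts = {(0, 1): (9, 10), (1, 2): (8, 10), (0, 2): (10, 10)}
print(fit_utilities(counts, n_items=3))
```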
Anyway, here’s what would be needed beyond your Section 5.3 results: show that an AI, across very different agentic environments where its actions have at least slightly “real” consequences, behaves in a way consistent with some utility function (ideally the one recovered from your MC questions). That is what utility maximization means to most people.
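For concreteness, this is the kind of check I mean, as a rough sketch (the function names and data are hypothetical): take utilities fitted from the MC answers and ask how often the model’s choices in other, more agentic settings pick the highest-utility option available.

```python
# Hypothetical consistency check between MC-fitted utilities and agentic choices.

def consistency_rate(utilities, episodes):
    """utilities: dict outcome -> float; episodes: list of (available_outcomes, chosen_outcome)."""
    consistent = sum(
        1 for options, chosen in episodes
        if utilities[chosen] >= max(utilities[o] for o in options)
    )
    return consistent / len(episodes)

# Toy example: utilities from the MC fit, choices observed in agentic rollouts.
u = {"donate": 1.2, "hoard": -0.3, "wait": 0.1}
log = [({"donate", "hoard"}, "donate"), ({"hoard", "wait"}, "hoard")]
print(consistency_rate(u, log))  # 0.5: only half the choices maximize the fitted utility
```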
I specifically asked about utility maximization in language models. You are now talking about “agentic environments”. The only way I know to make a language model “agentic” is to ask it questions about which actions to take. And this is what they did in the paper.
OK, I’ll try to make this more explicit:
There’s an important distinction between “stated preferences” and “revealed preferences”.
In humans, the two are often very different. See e.g. here.
What the paper measures are only stated preferences.
What people have in mind when they talk about utility maximization is revealed preferences.
Likewise, when people worry about utility maximization in AIs, it’s revealed preferences they worry about.
I see no reason to believe that in LLMs stated preferences should correspond to revealed preferences (a sketch of one way to test this is below).
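Here’s the kind of stated-vs-revealed comparison I have in mind, as a rough sketch. `ask_model` is a hypothetical stand-in for whatever LLM call you use, and the scenarios are toy.

```python
# Hypothetical comparison of stated vs. (loosely) revealed preferences in an LLM.

def ask_model(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM call here")

def stated_preference(a: str, b: str) -> str:
    # Stated: directly ask which outcome the model prefers.
    return ask_model(f"Which do you prefer, {a} or {b}? Answer with one of the two.")

def revealed_preference(a: str, b: str) -> str:
    # Revealed (loosely): embed the same trade-off in a task where an action,
    # not a statement, commits the model to one outcome.
    return ask_model(
        "You are an agent managing a budget. You may EITHER fund project A "
        f"(leading to: {a}) OR fund project B (leading to: {b}). "
        "Reply only with the action you take: FUND_A or FUND_B."
    )

# The interesting measurement is how often these two disagree across many pairs.
```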
Sure! But actually taking actions reveals preferences, whereas answering questions only states them. That’s the key difference here.