moridinamael comments on moridinamael’s Shortform

moridinamael 25 Apr 2022 14:01 UTC
3 points
I’m well versed in what I would consider the practical side of decision theory, but I’m unaware of what tools, frameworks, etc. are used to deal with uncertainty in the utility function. By this I mean uncertainty in how utility will ultimately be assessed: an agent that doesn’t actually know how much it will end up preferring various outcomes after the fact, and that knows in advance that it is ignorant about its own preferences.

The thing is, I know how I would do this, and it’s not really that complex (use probability distributions for the utilities associated with outcomes and propagate them through the decision tree), but I can’t find a good trailhead for researching how others have done it. When I Google things like “uncertainty in utility function”, I am just shown standard resources on decision making under uncertainty, which are about uncertainty in the outcome, not uncertainty in the utility function.
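
A minimal sketch of the "distributions over utilities, propagated through the tree" idea, using Monte Carlo sampling. Everything concrete here is made up for illustration: the two-action tree, the outcome names, and the choice of Normal(mean, sd) beliefs over each outcome's utility are all assumptions, not an established method from the literature.

```python
import random
import statistics

# Hypothetical decision tree: action -> list of (probability, outcome).
TREE = {
    "safe": [(1.0, "status_quo")],
    "risky": [(0.6, "win"), (0.4, "lose")],
}

# Assumed beliefs about each outcome's utility: Normal(mean, sd).
UTILITY_BELIEFS = {
    "status_quo": (1.0, 0.1),
    "win": (3.0, 2.0),
    "lose": (-1.0, 0.5),
}


def sample_utilities(rng):
    """Draw one possible utility function from the agent's beliefs."""
    return {o: rng.gauss(mean, sd) for o, (mean, sd) in UTILITY_BELIEFS.items()}


def expected_utility(action, utilities):
    """Expected utility of an action for one fixed utility function."""
    return sum(p * utilities[outcome] for p, outcome in TREE[action])


def propagate(n_samples=20000, seed=0):
    """Propagate utility uncertainty through the tree by sampling."""
    rng = random.Random(seed)
    eu_samples = {a: [] for a in TREE}
    best_counts = {a: 0 for a in TREE}
    for _ in range(n_samples):
        utilities = sample_utilities(rng)
        eus = {a: expected_utility(a, utilities) for a in TREE}
        for a, value in eus.items():
            eu_samples[a].append(value)
        best_counts[max(eus, key=eus.get)] += 1
    return {
        a: {
            "mean": statistics.fmean(eu_samples[a]),
            "stdev": statistics.stdev(eu_samples[a]),
            "p_best": best_counts[a] / n_samples,
        }
        for a in TREE
    }


if __name__ == "__main__":
    for action, summary in propagate().items():
        print(action, summary)
```

One thing the sketch makes visible: if the agent simply maximizes expected utility and cannot learn anything more before acting, only the means of the utility distributions affect the choice. The spread starts to matter once the agent can gather evidence first, or has some attitude toward the risk of being wrong about its own preferences.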

(As for why I’m interested in this: first, it seems like a more accurate way of modeling human agents, and second, I can’t see how you instantiate something like Indirect Normativity without the concept of uncertainty in the utility function itself.)
- RHollerith 27 Apr 2022 5:38 UTC
  2 points
  Are we talking about an agent that is uncertain about its own utility function or about an agent that is uncertain about another agent’s?
  - RHollerith 27 Apr 2022 6:07 UTC
    2 points
    You are probably talking about the former. What would count as evidence about the uncertain utility function?
    - moridinamael 27 Apr 2022 21:47 UTC
      2 points
      Yes, the former. If the agent takes actions and receives rewards, and it can observe those rewards, then it gains evidence about its utility function.
      - RHollerith 8 May 2022 20:17 UTC
        2 points
        Probably you already know this, but the framework known as reinforcement learning is very relevant here. In particular, there are probably web pages that describe how to compute the expected utility of a (strategy, reward function) pair.
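
The last two replies can be sketched together: observed rewards act as evidence about which reward function the agent actually has, and the expected utility of a (strategy, reward function) pair can then be averaged over that belief. This is only an illustration under stated assumptions: the two candidate reward functions, the Gaussian noise model, and all names below are hypothetical, and it is a plain Bayesian update rather than any particular reinforcement learning algorithm.

```python
import math

# Hypothetical candidate reward functions: action -> mean reward.
CANDIDATES = {
    "likes_tea": {"tea": 1.0, "coffee": 0.0},
    "likes_coffee": {"tea": 0.0, "coffee": 1.0},
}

# Assumed standard deviation of Gaussian noise on observed rewards.
NOISE_SD = 0.5


def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update(belief, action, observed_reward):
    """Bayes update of the belief over candidate reward functions."""
    unnormalized = {
        h: belief[h] * normal_pdf(observed_reward, CANDIDATES[h][action], NOISE_SD)
        for h in belief
    }
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


def expected_utility(strategy, reward_fn):
    """Expected utility of one (strategy, reward function) pair.

    A strategy here is a probability distribution over actions.
    """
    return sum(p * reward_fn[action] for action, p in strategy.items())


def expected_utility_under_belief(strategy, belief):
    """Average the per-pair expected utility over the current belief."""
    return sum(
        belief[h] * expected_utility(strategy, CANDIDATES[h]) for h in belief
    )


if __name__ == "__main__":
    belief = {"likes_tea": 0.5, "likes_coffee": 0.5}
    belief = update(belief, "coffee", 0.9)  # the agent tried coffee and saw reward 0.9
    print(belief)
    print(expected_utility_under_belief({"coffee": 1.0}, belief))
```

In this toy setup a single high reward from "coffee" shifts most of the belief toward the `likes_coffee` hypothesis, which in turn raises the belief-weighted expected utility of the always-coffee strategy.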