Yes, the former. If the agent takes actions and receives rewards, and it can observe those rewards, then it gains evidence about its utility function.
You probably already know this, but the reinforcement learning framework is very relevant here. In particular, standard RL references describe how to compute the expected utility of a (policy, reward function) pair, where "policy" is the RL term for a strategy.
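Here is a minimal sketch of both pieces, under assumptions of my own: a hypothetical two-state, two-action MDP, two candidate reward functions the agent is uncertain between, and Gaussian observation noise on rewards. The agent evaluates the expected utility of a fixed policy under each candidate, and updates a Bayesian posterior over candidates from the rewards it observes. None of this setup is from the comment above; it is just one concrete way to make the idea executable.

```python
import numpy as np

# Hypothetical toy MDP: 2 states, 2 actions (all numbers are assumptions).
# R_candidates[k][s][a] = reward under candidate utility function k.
R_candidates = [
    np.array([[1.0, 0.0], [0.0, 1.0]]),   # candidate utility function 0
    np.array([[0.0, 1.0], [1.0, 0.0]]),   # candidate utility function 1
]
# P[a][s][s'] = probability of moving to s' after taking a in s.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.5, 0.5]],  # action 1
])
gamma = 0.9  # discount factor

def expected_utility(policy, R, n_iter=500):
    """Expected discounted utility V(s) of a fixed deterministic policy
    under reward function R, via iterative policy evaluation."""
    V = np.zeros(2)
    for _ in range(n_iter):
        V_new = np.zeros(2)
        for s in range(2):
            a = policy[s]
            V_new[s] = R[s][a] + gamma * P[a][s] @ V
        V = V_new
    return V

def posterior_over_rewards(prior, observations, noise_sd=0.1):
    """Bayesian update over candidate reward functions. Each observation
    is (state, action, observed_reward); observed rewards are assumed to
    be the true reward plus Gaussian noise with std dev noise_sd."""
    log_post = np.log(prior)
    for (s, a, r_obs) in observations:
        for k, R in enumerate(R_candidates):
            log_post[k] += -0.5 * ((r_obs - R[s][a]) / noise_sd) ** 2
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# The agent starts uncertain, acts, sees rewards, and updates its beliefs.
prior = np.array([0.5, 0.5])
observations = [(0, 0, 0.95), (1, 1, 1.05)]   # consistent with candidate 0
posterior = posterior_over_rewards(prior, observations)
print("posterior over utility functions:", posterior)

# Expected utility of the (policy, reward function) pair for each candidate:
policy = [0, 1]  # deterministic: action 0 in state 0, action 1 in state 1
for k, R in enumerate(R_candidates):
    print(f"candidate {k}: V = {expected_utility(policy, R)}")
```

Running this, the posterior concentrates on candidate 0 after just two observations, which is the sense in which seeing rewards gives the agent evidence about which utility function is its own.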