How might we theoretically ground utility functions? One approach could be to view the possible environments as a set of universe histories (e.g. a list of the positions of all quarks, etc. at all times), and a utility function as a function that maps these universe histories to real numbers. We might want this utility function to be computable, but this eliminates some plausible preferences we might want to represent. For example, in the procrastination paradox, the subject prefers to push the button as late as possible, but disprefers never pressing the button. If the history is infinitely long, no computable function can know for sure that the button was never pressed: it’s always possible that it was pressed at some later day.
Instead, we could use _subjective utility functions_, which are defined over _events_, which is basically anything you can think about (i.e. it could be chairs and tables, or quarks and strings). This allows us to have utility functions over high level concepts. In the previous example, we can define an event “never presses the button”, and reason about that event atomically, sidestepping the issues of computability.
We could go further and view _probabilities_ as subjective (as in the Jeffrey-Bolkor axioms), and only require that our beliefs are updated in such a way that we cannot be Dutch-booked. This is the perspective taken in logical induction.
Planned summary for the Alignment Newsletter: