By contrast, in this section I’m interested in what it means for an agent to have a goal of its own. Three existing frameworks which attempt to answer this question are Von Neumann and Morgenstern’s expected utility maximisation, Daniel Dennett’s intentional stance, and Hubinger et al.’s mesa-optimisation. I don’t think any of them adequately characterises the type of goal-directed behaviour we want to understand, though. While we can prove elegant theoretical results about utility functions, they are such a broad formalism that practically any behaviour can be described as maximising some utility function.
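To see why the formalism is so permissive, here is a minimal sketch of the degeneracy argument: for any fixed behaviour whatsoever, we can construct a utility function which that behaviour maximises, so "maximises some utility function" on its own places no real constraint on an agent. The policy, actions, and trajectories below are hypothetical toys chosen purely for illustration, not drawn from any of the frameworks above.

```python
# Sketch: any arbitrary behaviour maximises *some* utility function.
# Everything here (ACTIONS, arbitrary_policy) is a toy assumption.

from itertools import product

ACTIONS = ["left", "right"]

def arbitrary_policy(history):
    """Any behaviour at all -- here, alternate actions by history length."""
    return ACTIONS[len(history) % 2]

def induced_utility(trajectory):
    """Utility 1 iff every action matches what the policy would have chosen
    at that point in the trajectory, 0 otherwise."""
    history = []
    for action in trajectory:
        if action != arbitrary_policy(history):
            return 0
        history.append(action)
    return 1

# The single length-3 trajectory the policy would produce gets utility 1;
# every other trajectory gets 0 -- so the policy trivially maximises
# this (reverse-engineered) utility function.
for traj in product(ACTIONS, repeat=3):
    print(traj, induced_utility(list(traj)))
```

The same construction works for any policy we substitute in, which is exactly why "is an expected utility maximiser" fails to pick out the goal-directed agents from everything else.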
There is my algorithmic-theoretic definition, which might be regarded as a formalisation of the intentional stance and which avoids the degeneracy problem you mentioned.