Jeremy Gillen comments on Finding Goals in the World Model

Jeremy Gillen 23 Aug 2022 16:43 UTC
3 points
That’s correct: it simultaneously infers the policy and the utility function. To avoid the underspecification problem, it uses a prior that favors higher-intelligence agents. This is similar to taking assumptions 1 and 2a from http://proceedings.mlr.press/v97/shah19a/shah19a.pdf
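
To make the underspecification point concrete, here is a toy sketch of joint inference over (utility, rationality) pairs. Nothing in it comes from the post or the linked paper: the Boltzmann-rational policy model, the three candidate utilities, the `BETAS` grid, and the exponential prior with a `strength` knob are all assumptions chosen for illustration. With a flat prior, "likes a, competent" and "dislikes a, anti-competent" explain the same behavior equally well; a prior tilted toward higher competence breaks the tie.

```python
import math
from itertools import product

ACTIONS = ["a", "b", "c"]

# Hypothetical candidate utility functions over a one-step choice.
UTILITIES = {
    "likes_a": {"a": 1.0, "b": 0.0, "c": 0.0},
    "likes_b": {"a": 0.0, "b": 1.0, "c": 0.0},
    "dislikes_a": {"a": -1.0, "b": 0.0, "c": 0.0},
}

# Hypothetical rationality levels: negative means systematically
# anti-rational, zero means uniformly random, positive means competent.
BETAS = [-5.0, 0.0, 5.0]


def policy(utility, beta):
    """Boltzmann policy: P(action) is proportional to exp(beta * utility)."""
    weights = {a: math.exp(beta * utility[a]) for a in ACTIONS}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def prior(beta, strength):
    """Unnormalized prior; strength > 0 favors more competent agents."""
    return math.exp(strength * beta)


def posterior(observations, strength):
    """Joint posterior over (utility name, beta) given observed actions."""
    scores = {}
    for (name, utility), beta in product(UTILITIES.items(), BETAS):
        pi = policy(utility, beta)
        likelihood = math.prod(pi[a] for a in observations)
        scores[(name, beta)] = prior(beta, strength) * likelihood
    normalizer = sum(scores.values())
    return {key: value / normalizer for key, value in scores.items()}


observed = ["a"] * 5
flat = posterior(observed, strength=0.0)
tilted = posterior(observed, strength=1.0)
best = max(tilted, key=tilted.get)
```

Under the flat prior, `flat[("likes_a", 5.0)]` and `flat[("dislikes_a", -5.0)]` come out equal, since both hypotheses induce the same policy. Under the tilted prior, `best` is the competent "likes a" hypothesis. This only shows the tie-breaking role of the prior in a tiny discrete case, not how the actual proposal would be implemented.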