If a goal is a preference order over world states, then there are uncountably many of them, so any countable means of expression can only express a vanishingly small minority of them. Trivially (as Bostrom points out) a goal system can be too complex for an agent of a given intelligence. It therefore seems to me that what we’re really defending is an Upscalability thesis: if an agent A with goal G is possible, then a significantly more intelligent A++ with goal G is possible.
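The cardinality claim can be made precise with a standard counting argument (assuming, for illustration, at least countably infinitely many world states \(W\)):

```latex
% Any language over a finite alphabet has only countably many expressions:
\[
\bigl|\,\textstyle\bigcup_{n \ge 1} \Sigma^{n}\,\bigr| = \aleph_0
\quad \text{for any finite alphabet } \Sigma.
\]
% But the total preference orders on a countably infinite set of world
% states already number at least the continuum, e.g. via distinct
% utility functions u : W \to \{0,1\} inducing distinct weak orders:
\[
\bigl|\{\text{preference orders on } W\}\bigr| \;\ge\; 2^{\aleph_0} \;>\; \aleph_0.
\]
```

So only countably many goals are expressible, a measure-zero fraction of the uncountably many that exist; this is the gap the Upscalability thesis routes around by quantifying only over goals that some agent already instantiates.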