I think you make a lot of good points, but I do have a big objection:
“the set of successful policies only includes ones that entertain different models of the origin of reward, and then pick actions to maximize predicted future rewards.”
This is obviously an additional assumption, and not a compelling one. The relevant sense of success is “doing the right thing, whatever that means”, and reward maximisers are apparently not very good at this.
You argue that someone hasn’t yet given you a compelling alternative—but that’s a very weak reason to say that doing better is impossible. As we both agree, the set of all policies is very large.
A more plausible assumption is that people don’t manage to do better than reward maximisers, even though reward maximisers themselves aren’t very good. I have doubts about this too, but I think it can at least be argued.
I don’t think it’s an assumption really. I think this sentence just fixes the meanings, in perfectly sensible ways, of the words “entertain” and “to” (as in “pick actions to”). I guess you’re not persuaded that competent behavior in the “many new video games” environment is deserving of the description “aiming to maximize predicted future rewards”. Why is that, if the video games are sufficiently varied?
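To be concrete about the meanings I’m fixing, here is a minimal sketch of such a policy (every name in it is my own illustrative placeholder, not anything from the original claim): it “entertains” several models of where reward comes from, and picks actions “to” maximize the future reward those models predict.

```python
def reward_maximising_policy(models, history, actions):
    """Pick the action with the highest predicted future reward,
    averaged over the reward-origin models the agent entertains.
    Each model is a function (history, action) -> predicted return."""
    def predicted_return(action):
        return sum(m(history, action) for m in models) / len(models)
    return max(actions, key=predicted_return)

# Toy usage: two entertained hypotheses about the origin of reward.
models = [
    lambda h, a: 1.0 if a == "play" else 0.0,    # reward tracks the game score
    lambda h, a: 5.0 if a == "tamper" else 0.0,  # reward tracks the reward channel itself
]
print(reward_maximising_policy(models, history=[], actions=["play", "tamper"]))  # -> "tamper"
```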
Fixing the meanings of keywords is one of the important things that assumptions do, and the scope of the quoted claim is larger than you suggest.
You argue that a sufficiently competent reward maximiser will intervene in the provision of rewards and so will not in fact achieve high scores in many video games. An insufficiently competent reward maximiser will be outperformed by a more competent “intrinsically motivated video game player” (I think we can understand what this means well enough, even if we can’t give a formal specification). So I don’t think reward maximisers are the best video game players, which implies that competent video game playing is not equivalent to reward maximisation.
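To make that dominance argument concrete with numbers I have simply made up: if we score policies by the game’s actual score, rather than by the reward signal a policy can tamper with, the reward maximisers lose either way.

```python
# Hypothetical payoffs, invented purely for illustration.
# A competent reward maximiser tampers with its reward channel, so its
# actual game score collapses; an intrinsically motivated player's doesn't.
policies = {
    "competent reward maximiser":     {"reward": 100.0, "game_score": 0.0},
    "incompetent reward maximiser":   {"reward": 10.0,  "game_score": 10.0},
    "intrinsically motivated player": {"reward": 60.0,  "game_score": 60.0},
}
best = max(policies, key=lambda name: policies[name]["game_score"])
print(best)  # -> "intrinsically motivated player"
```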
If you allow maximisers of “latent rewards”, then other parts of your argument become much less plausible. What does intervening on a latent reward even mean? Is it problematic?
I also think that video games are a special enough category that “video game competence” is probably a poor proxy for AI desirability.
I agree that it’s a poor proxy for AI desirability. To add to that problem, a game developer in one country can have values and goals opposed to those of game developers in a different country or region.
When you said “poor proxy for AI desirability”, the first thing that came to mind was the dog faces in the early DeepDream illustrations, which appeared because the dataset contained too many dogs. The same kind of spuriously optimal policy could be obtained from video games.