Good question. Unfortunately, one weakness of our definition of multi-agent POWER is that it doesn’t have anything very useful to say in a case like this one.
We assume AI learning timescales vastly outstrip human learning timescales as a way of keeping our definition tractable. So the only way to structure this problem in our framework would be to imagine a human is playing chess against a superintelligent AI — a highly distorted situation compared to the case of two roughly equal opponents.
On the other hand, based on other results I’ve seen anecdotally, I suspect that if you gave one of the agents a purely random policy (i.e., take a random legal action at each state) and gave the other agent a reasonable distribution over material-based reward functions, you’d stand a decent chance of correctly identifying high-POWER states with high-mobility board positions.
You might also be interested in this comment by David Xu, where he discusses mobility as a measure of instrumental value in chess-playing.
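For concreteness, here’s a minimal sketch (not the actual computation from our experiments) of how one might poke at that intuition with python-chess: sample a few material-weight reward functions, estimate the reward one agent can reach against a random-policy opponent using a crude greedy rollout as a stand-in for an actual optimal policy, and compare that POWER-like average against mobility (the number of legal moves). The piece weights, rollout depth, and the greedy stand-in are all arbitrary assumptions on my part.

```python
# Rough sketch only: a Monte Carlo POWER-like proxy vs. mobility in chess.
# Assumes python-chess is installed; weights, depth, and the greedy agent
# are arbitrary stand-ins, not the setup from our paper.
import random
import chess

PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9}

def material_reward(board: chess.Board, weights: dict, color: bool) -> float:
    """Weighted material balance from `color`'s point of view."""
    total = 0.0
    for piece in board.piece_map().values():
        if piece.piece_type == chess.KING:
            continue
        sign = 1.0 if piece.color == color else -1.0
        total += sign * weights[piece.piece_type]
    return total

def rollout(board: chess.Board, weights: dict, color: bool, depth: int) -> float:
    """Greedy one-ply agent (on its sampled reward) vs. a random-policy opponent."""
    board = board.copy()
    for _ in range(depth):
        if board.is_game_over():
            break
        moves = list(board.legal_moves)
        if board.turn == color:
            # Greedily pick the move that maximizes the sampled material reward.
            def score(move):
                board.push(move)
                r = material_reward(board, weights, color)
                board.pop()
                return r
            board.push(max(moves, key=score))
        else:
            board.push(random.choice(moves))  # random legal action at each state
    return material_reward(board, weights, color)

def power_proxy(board: chess.Board, color: bool, n_rewards: int = 20, depth: int = 10) -> float:
    """Average achievable reward over sampled material-weight reward functions."""
    estimates = []
    for _ in range(n_rewards):
        weights = {p: random.random() * v for p, v in PIECE_VALUES.items()}
        estimates.append(rollout(board, weights, color, depth))
    return sum(estimates) / n_rewards

if __name__ == "__main__":
    board = chess.Board()  # any position of interest; here the opening position
    mobility = board.legal_moves.count()
    print("mobility:", mobility, "POWER proxy:", power_proxy(board, chess.WHITE))
```

Running this over a set of positions and checking the rank correlation between the two numbers would be one crude way to see whether high-POWER states line up with high-mobility ones.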
> We assume AI learning timescales vastly outstrip human learning timescales as a way of keeping our definition tractable. So the only way to structure this problem in our framework would be to imagine a human is playing chess against a superintelligent AI — a highly distorted situation compared to the case of two roughly equal opponents.
I think this is probably true in the long term (the classical-to-quantum/reversible-computing transition is very large, and humans can’t easily modify their brains, unlike a virtual human). But it may not be true in the short term.
Agreed. We think our human-AI setting is a useful model of alignment in the limit case, but not really in the transient case, for the reason you point out.