Part of this is people analyzing AIs in adversarial contexts through the lens of non-adversarial contexts when they really shouldn’t be.
In a non-adversarial context, an AI that beats X 95% of the time when the top human beats X 90% of the time is often considered superhuman. And so you get people calling e.g. AlphaStar superhuman because it beat the top human.
In an adversarial context, where that remaining 5% of losses sits in the state space matters a lot. (E.g. if it sits in a region that an opponent can deliberately steer the game towards, that’s a problem.)
*****
(Part of this is also that AI training is far more tolerant of narrow-but-positive win rates than humans are. A human will often lean towards strategies that are weaker on average but more resistant to glass jaws, especially in tournament settings. (They are trying to win the tournament, not, strictly speaking, to maximize their percentage of wins.))
*****
I sometimes think that the dual benchmark of ‘what’s the highest-ranked person X beats >40%[1] of the time’ and ‘what’s the lowest-ranked person that beats X >60%[1] of the time’ would be more useful for evaluating AI progress.
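As a rough illustration of how that dual benchmark might be computed, here is a minimal Python sketch. The opponent ladder, the win rates, the 40%/60% thresholds, and the “exploiter” opponent are all made-up placeholders, not measurements of any real system:

```python
# Minimal sketch of the dual benchmark described above.
# All names, ranks, and win rates below are illustrative placeholders.

# Opponents with their ladder rank (1 = best) and the AI's measured
# win rate against each of them (assuming no draws, so the opponent's
# win rate is simply 1 - ai_win_rate).
opponents = [
    {"name": "rank_1000", "rank": 1000, "ai_win_rate": 0.99},
    {"name": "rank_500",  "rank": 500,  "ai_win_rate": 0.25},  # hypothetical player abusing a known exploit
    {"name": "rank_200",  "rank": 200,  "ai_win_rate": 0.85},
    {"name": "rank_50",   "rank": 50,   "ai_win_rate": 0.55},
    {"name": "rank_10",   "rank": 10,   "ai_win_rate": 0.35},
    {"name": "rank_1",    "rank": 1,    "ai_win_rate": 0.20},
]

BEATS_THRESHOLD = 0.40  # "the AI beats them >40% of the time"
LOSES_THRESHOLD = 0.60  # "they beat the AI >60% of the time"

def dual_benchmark(opponents):
    # Highest-ranked (smallest rank number) opponent the AI beats >40% of the time.
    beaten = [o for o in opponents if o["ai_win_rate"] > BEATS_THRESHOLD]
    upper = min(beaten, key=lambda o: o["rank"]) if beaten else None

    # Lowest-ranked (largest rank number) opponent who beats the AI >60% of the time.
    beats_ai = [o for o in opponents if (1 - o["ai_win_rate"]) > LOSES_THRESHOLD]
    lower = max(beats_ai, key=lambda o: o["rank"]) if beats_ai else None

    return upper, lower

upper, lower = dual_benchmark(opponents)
print("Highest rank beaten >40% of the time:", upper["name"] if upper else "none")
print("Lowest rank beating the AI >60% of the time:", lower["name"] if lower else "none")
```

On this made-up data the two numbers come out as rank 50 and rank 500: the AI holds its own against rank-50 players on average, yet a far weaker player who knows the exploit still beats it reliably, which is exactly the kind of gap a single ‘it beat the top human’ headline hides.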
[1] Semi-arbitrary numbers; don’t read too much into them.