AlphaStar’s innovative league-based training process finds the approaches that are most reliable and least likely to go wrong.
“Go wrong” is still tied to the game’s win condition. So while the league-based training process does find the set of agents whose gameplay is least exploitable (among all the agents they trained), it’s not obvious how this relates to problems in AGI safety such as goal specification or robustness to capability gains. Maybe they’re thinking of things like red teaming. But without more context I’m not sure how safety-relevant this is.
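To make the "least exploitable among the agents they trained" point concrete, here is a minimal sketch of what league-style training with exploiter agents looks like in outline. This is a hypothetical illustration, not AlphaStar's actual code: the class and function names are invented, and the real system estimates win rates from many games rather than sampling them at random.

```python
# Hypothetical sketch of league-style training with exploiters.
# "Least exploitable" here means: new exploiter agents are trained specifically
# to beat the current main agent, and the main agent is then trained against a
# mixture of league members weighted toward the ones it still loses to
# (prioritized fictitious self-play).
import random

class Agent:
    def __init__(self, name):
        self.name = name

    def train_against(self, opponents):
        # Placeholder for an RL update against the sampled opponents.
        pass

def win_rate(agent, opponent):
    # Placeholder: in practice, estimated from many evaluation games.
    return random.random()

def pfsp_weights(main, league):
    # Weight each opponent by how often it beats the main agent, so training
    # time concentrates on the remaining exploits.
    return [max(1e-3, 1.0 - win_rate(main, opp)) for opp in league]

league = [Agent("main_v0")]
main = league[0]

for step in range(3):
    # Train an exploiter whose only job is to find weaknesses in the main agent.
    exploiter = Agent(f"exploiter_{step}")
    exploiter.train_against([main])
    league.append(exploiter)

    # Train the main agent against a PFSP mixture over the whole league.
    weights = pfsp_weights(main, league)
    opponents = random.choices(league, weights=weights, k=4)
    main.train_against(opponents)

    # Snapshot the main agent so future exploiters target a fixed copy.
    league.append(Agent(f"main_v{step + 1}"))
```

Note that everything in this loop is still scored by the game's win condition: the exploiters only search for strategies that win games, which is why it isn't clear how the procedure transfers to safety problems where the objective itself is the thing in question.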