Unless I’m missing something, Cicero can talk about its strategies, but only in the sense that training shaped its text to say whatever about its strategies usually helps it win the game. Not in the sense that it has some subpart that truthfully and reliably reports whatever strategy the network actually has (I’d expect those two goals to conflict at least sometimes, and possibly pretty often).
I’ve heard that this is false, though I haven’t personally read the paper, so I can’t comment with confidence.
Oh, I see. It seems like it doesn’t work reliably though (the comment says it “doesn’t lead to a fully honest agent”).