The specific argument you referenced in your earlier comment: that argmax is important for competitiveness, but inherently unsafe because of adversarial optimization (“argmax is a trap”).
Assuming softmax is important for competitiveness instead, I don’t see why this argument doesn’t go through with “argmax” replaced by “softmax” throughout (including the “argmax is a trap” section of the OP). I read your linked comment and post, and still don’t understand. I wonder what the authors of the OP (or anyone else) think about this.
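To make the question concrete, here is a minimal toy sketch of the concern. It is an illustration under assumptions that are not in the OP: the `scores` values, the single "adversarial" option with an inflated proxy score, and the temperatures are all hypothetical. It only shows that softmax selection can still put most of its probability on the option argmax would pick, and more so as the temperature drops. It does not settle whether the OP's argument transfers.

```python
import math


def softmax(scores, temperature=1.0):
    """Softmax probabilities over `scores` at the given temperature."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(scores):
    """Index of the highest-scoring option."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical proxy scores: nine ordinary options plus one option whose
# proxy score is inflated (standing in for an adversarially optimized plan).
scores = [1.0] * 9 + [6.0]
adversarial = len(scores) - 1

print("argmax picks option", argmax(scores))
for t in (5.0, 1.0, 0.2):
    p = softmax(scores, t)
    print(f"temperature={t}: P(adversarial option) = {p[adversarial]:.3f}")
```

In this toy setup the softmax weight on the inflated option rises toward 1 as the temperature falls, which is the sense in which substituting "softmax" for "argmax" might leave the argument intact. A high enough temperature does spread the probability out, but possibly at the cost of the competitiveness that motivated softmax in the first place.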