In my view, determinism isn't actually the main takeaway from realizing that the loss landscape is stationary; the categorization is. I would also argue there's a huge practical difference between having mesa-objectives that happen to be local minima of the base objective and counting on quirks/stop-gradients. For one, the former is severely constrained in the kinds of objectives it can have (i.e., we can imagine that many mesa-objectives, especially more blatant ones, might be harder to couple to the base objective), which rules out entirely some of the more extreme conceptions of gradient hackers that completely ignore the base objective. Also, convergent gradient hackers are only dangerous out of distribution: in theory, if you had a training distribution that covered 100% of the input space, convergent gradient hackers would cease to be a thing, since being really suboptimal on distribution is bad for the base objective.