Whether or not this happens depends on the learning algorithm. Let’s assume an IID setting. Then an algorithm that evaluates many random parameter settings and chooses the one that gives the best performance would have this effect. But a gradient-based learning algorithm wouldn’t necessarily, since it only aims to improve its predictions locally (so what you say in the ETA is more accurate, **in this case**, I think).
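To make that concrete, here’s a minimal numerical sketch (a toy of my own construction, not from the original discussion): the made-up `outcome` function returns pure noise unless the prediction is large enough to be exactly self-fulfilling. Selecting among random parameter settings lands in the self-fulfilling region, while a gradient step that treats each observed outcome as fixed data stays near the honest prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

def outcome(prediction):
    """Toy world (made up for illustration): predictions below 5 have no
    effect and the outcome is pure noise; predictions of 5 or more are
    exactly self-fulfilling."""
    if prediction < 5:
        return rng.normal(0.0, 1.0)
    return prediction

def expected_loss(theta, n=200):
    """Monte Carlo estimate of the squared error from always predicting theta."""
    return np.mean([(theta - outcome(theta)) ** 2 for _ in range(n)])

# 1) Evaluate many random parameter settings and keep the best one.
candidates = rng.uniform(-10, 10, size=200)
best = min(candidates, key=expected_loss)
# `best` lands in the self-fulfilling region (>= 5): exact self-fulfilment
# beats the irreducible noise of the honest prediction.

# 2) Gradient descent on the same loss, treating each observed outcome
# as fixed data (the update ignores the outcome's dependence on theta).
theta, lr = 1.0, 0.05
for _ in range(500):
    y = outcome(theta)
    theta -= lr * 2 * (theta - y)
# theta hovers near 0: it settles on the honest prediction of the noise
# and never wanders into the self-fulfilling region.

print(f"random search: {best:.2f}, gradient descent: {theta:.2f}")
```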
Also, I just wanted to mention that Stuart Armstrong’s paper “Good and safe uses of AI oracles” discusses self-fulfilling prophecies as well; Stuart provides a way of training a predictor that won’t fall victim to such effects (just don’t reveal its predictions during training). But then it also fails to account for the effect its predictions actually have, which can be a source of irreducible error… The example is a (future) stock-price predictor: making its predictions public makes them self-refuting to some extent, since they influence market actors’ decisions.
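For what it’s worth, here’s a rough sketch of how I understand that training setup (the `world` model, numbers, and names are mine, not Stuart’s): the predictor is only updated on episodes where its prediction was withheld, so the training signal can’t be contaminated by the prediction’s influence, but by the same token the learned value ignores that influence.

```python
import numpy as np

rng = np.random.default_rng(1)

def world(revealed_prediction=None):
    """Hypothetical outcome model: the outcome drifts toward any
    prediction that was made public (a self-fulfilling pull)."""
    base = rng.normal(3.0, 1.0)
    if revealed_prediction is None:
        return base
    return 0.5 * base + 0.5 * revealed_prediction

theta, lr = 0.0, 0.05
for episode in range(2000):
    hidden = rng.random() < 0.1  # occasionally withhold the prediction
    if hidden:
        # Prediction is never shown, so the outcome can't depend on it;
        # only these episodes are used to update the predictor.
        y = world(revealed_prediction=None)
        theta -= lr * 2 * (theta - y)
    else:
        # Prediction is revealed and acted on, but no parameter update.
        y = world(revealed_prediction=theta)

# theta ends up near 3.0, the outcome absent any influence from the oracle;
# by construction it does not account for the effect revealed predictions
# actually have on the world.
print(f"learned prediction: {theta:.2f}")
```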
Yeah, if you train the algorithm by random sampling, the effect I described will take place. The same thing will happen if you use an RL algorithm to update the parameters instead of an unsupervised learning algorithm (though it seems willfully perverse to do so: you’re throwing away a lot of the structure of the problem, so training will be much slower). See the sketch below for what I mean.
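A minimal sketch of that contrast (the setup and names are hypothetical): the supervised gradient uses the full form of the squared-error loss with the observed outcome held fixed, while a score-function (REINFORCE-style) update only sees a scalar reward, so it is far noisier, and because its objective averages rewards over whatever outcomes the revealed predictions produce, it credits the parameters for their influence on those outcomes.

```python
def supervised_grad(theta, y):
    """Gradient of (theta - y)^2 with the observed outcome y held fixed:
    the update exploits the full structure of the squared-error loss and
    ignores any dependence of y on theta."""
    return 2 * (theta - y)

def reinforce_grad(theta, p, y, sigma=1.0):
    """Score-function (REINFORCE-style) gradient estimate for a Gaussian
    'policy' p ~ N(theta, sigma^2) with reward R = -(p - y)^2. Only the
    scalar reward is used, so the estimate is much noisier (hence slower
    training), and since the objective is the expected reward under the
    outcomes the revealed predictions actually produce, it credits theta
    for its influence on y."""
    reward = -(p - y) ** 2
    return -((p - theta) / sigma ** 2) * reward  # negated so we descend a loss
```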
I also just found an old comment which makes the exact same argument I made here. (Though it now seems to me that argument is not necessarily correct!)