I agree the evolutionary metaphor works in this regard, because of the repeated interplay between small variations and selection.
The caution is against only thinking about the selection part — thinking of gradient descent as just a procedure that, when done, hands you a low-loss model from the space of possible models.
In particular, there’s this section in the post:
Consider a hypothetical model that chooses actions by optimizing towards some internal goal which is highly correlated with the reward that would be assigned by a human overseer.
Obviously, RL is going to exhibit selection pressure towards such a model.
It is not obvious to me that RL will exhibit selection pressure towards such a model! That depends on what models are nearby in parameter space. That model may have very high reward, but the models nearby could have low reward, in which case there’s no path to it.
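To make that concrete, here is a toy sketch (my own construction, with made-up numbers, not anything from the post): a 1-D "parameter space" with a broad, mediocre reward hill near zero and a very narrow, very-high-reward spike off to one side. Gradient ascent started anywhere outside the spike just climbs the hill; it never finds the spike, even though the spike is the global optimum, because the gradient in between never points toward it.

```python
import math

# Toy reward landscape (illustrative numbers only): a broad hill near theta = 0
# with modest reward, plus a very narrow spike at theta = 5 with ~10x the reward.
def reward(theta):
    broad_hill = math.exp(-theta ** 2)
    narrow_spike = 10.0 * math.exp(-((theta - 5.0) / 0.01) ** 2)
    return broad_hill + narrow_spike

# Plain gradient ascent using a finite-difference gradient estimate.
def gradient_ascent(theta, lr=0.1, steps=5000, eps=1e-4):
    for _ in range(steps):
        grad = (reward(theta + eps) - reward(theta - eps)) / (2 * eps)
        theta += lr * grad
    return theta

for start in (-2.0, -0.5, 1.0, 2.0):
    end = gradient_ascent(start)
    print(f"start={start:+.1f}  ->  end={end:+.3f}  reward={reward(end):.3f}")
# Every run settles near theta = 0 with reward ~1; none reaches the spike at
# theta = 5 (reward ~11), because the nearby models all point back to the hill.
```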
So RL is similar to evolutionary selection in the sense that after each iteration there is a reachable space, and the space only narrows (never widens) with each iteration.
E.g. fish could evolve to humans and to orcas, but orcas cannot evolve to humans?
(I don’t think this analogy actually works very well.)
Analogy seems okay by me, because I don’t think “the space only narrows (never widens) with each iteration” is true about RL or about evolutionary selection!
Oh, do please explain.
Wait, why would it only narrow in either case?
Because investments close off parts of solution space?
I guess I’m imagining something like a tree. Nodes can reach all their descendants, but a node cannot reach any of its siblings’ descendants. As you move deeper into the tree, the set of reachable nodes becomes strictly smaller.
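If it helps, here is the picture in code (a throwaway sketch with made-up numbers): in a complete binary tree, a node's reachable set is exactly its own subtree, and that set shrinks strictly as you go deeper.

```python
# Throwaway sketch of the tree picture: in a complete binary tree with
# TOTAL_DEPTH levels below the root, the subtree below a node at depth d
# contains 2 ** (TOTAL_DEPTH - d + 1) - 1 nodes, so the reachable set
# shrinks strictly as you descend.
TOTAL_DEPTH = 10

def reachable_nodes(depth):
    remaining_levels = TOTAL_DEPTH - depth
    return 2 ** (remaining_levels + 1) - 1  # the node itself plus its whole subtree

for depth in range(0, TOTAL_DEPTH + 1, 2):
    print(f"depth {depth:2d}: reachable set size = {reachable_nodes(depth)}")
# depth  0: reachable set size = 2047
# depth  2: reachable set size = 511
# ...
# depth 10: reachable set size = 1
```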
What does that correspond to?
Like, I think that the solution space in both cases is effectively unbounded and traversable in any direction, with only a tiny number of solutions that have ever been instantiated at any given point (in evolutionary history/in the training process), and at each iteration there are tons of “particles” (genomes/circuits) trying out new configurations. Plus if you account for the fact that the configuration space can get bigger over time (genomes can grow longer/agents can accumulate experiences) then I think you can really just keep on finding new configurations ’til the cows come home. Yes, the likelihood of ever instantiating the same one twice is tiny, but instantiating the same trait/behavior twice? Happens all the time, even within the same lineage. Looks like in biology, there’s even a name for it!
If there’s a gene in the population and a totally new mutation arises, now you have both the original and the mutated version floating somewhere in the population, which slightly expands the space of explored genomes (err, “slightly” relative to the exponentially-big space of all possible genomes). Even if that mutated version takes over because it increases fitness in a niche this century, that niche could easily change next century, and there’s so much mutation going on that I don’t see why the original variant couldn’t arise again. Come to think of it, the constant changeover of environmental circumstances in evolution kinda reminds me of nonstationarity in RL...
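Here is a quick-and-dirty simulation of the kind of thing I have in mind (a toy model I just made up, so the specific numbers mean nothing): a population of short bitstring genomes under per-bit mutation and fitness-weighted selection toward a target that flips every 50 generations, as a crude stand-in for a changing niche. The set of genotypes ever instantiated only grows over the run, and (for most seeds) a variant that gets driven out of the population later re-arises via mutation.

```python
import random

random.seed(0)  # any seed; the qualitative picture is the same
GENOME_LEN, POP_SIZE, GENERATIONS, MUT_RATE = 6, 200, 400, 0.05

def mutate(genome):
    # flip each bit independently with probability MUT_RATE
    return tuple(bit ^ (random.random() < MUT_RATE) for bit in genome)

def fitness(genome, target):
    return sum(1 for g, t in zip(genome, target) if g == t) + 1  # +1 keeps weights positive

population = [tuple(random.randint(0, 1) for _ in range(GENOME_LEN)) for _ in range(POP_SIZE)]
original = population[0]     # an arbitrary variant to keep an eye on
ever_seen = set(population)  # every genotype ever instantiated
present = []                 # was `original` in the population this generation?

for gen in range(GENERATIONS):
    # the "niche" flips between all-ones and all-zeros every 50 generations
    target = (1,) * GENOME_LEN if (gen // 50) % 2 == 0 else (0,) * GENOME_LEN
    weights = [fitness(g, target) for g in population]
    parents = random.choices(population, weights=weights, k=POP_SIZE)
    population = [mutate(p) for p in parents]
    ever_seen.update(population)
    present.append(original in population)

re_arisals = sum(1 for a, b in zip(present, present[1:]) if not a and b)
print(f"distinct genotypes ever instantiated: {len(ever_seen)} of {2 ** GENOME_LEN} possible")
print(f"generations containing the original variant: {sum(present)} of {GENERATIONS}")
print(f"times the original variant re-arose after being absent: {re_arisals}")
```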