I think perhaps a lot of work is being done by “if your optimiser worked”. This might also be where there’s a disanalogy between humans<->evolution and AIs<->SGD+PPO (or whatever RL algorithm you’re using to optimise the policy). Maybe evolution is actually a very weak optimiser that doesn’t really “work”, compared to SGD+RL.
I think that evolution is not the relevant optimizer for humans in this situation. Instead, consider the within-lifetime learning that goes on in human brains. Humans are very probably reinforcement learning agents in a relevant sense; in some ways, humans are the best reinforcement learning agents we have ever seen.