But I’d draw a distinction here analogous to the difference between evolutionary pressure on humans to lie versus cultural pressure on how much we lie; the effects of the former are usually too slow to matter much compared to the effects of the latter. Crude selection on models is at least much less problematic than selection on the behavior of a given model, especially if your lie detection approach works well on everything that’s nearby in design space.
This is essentially why the Sharp Left Turn argument was so bad: humans + SGD optimize way faster than evolution did, and there’s far less imbalance between inner optimization power and outer optimization power (usually at most 10-40x), and even then you can arguably remove the inner optimizer entirely.
Humans + SGD are way faster, can select directly over policies, and can assign basically whatever ratio we like of outer optimization steps to inner optimization steps. Evolution simply can’t do any of that. There are other disanalogies, but this is one of the main ones between evolution and us, as the toy sketch below illustrates.
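To make the “ratio knob” concrete, here’s a minimal toy sketch (entirely my own illustration, not anything from the argument above; the loss function, learning rate, and population settings are arbitrary stand-ins). The SGD loop selects directly over the policy parameter via gradients and takes `inner_steps_per_outer` as a free hyperparameter; the evolution-style loop can only score whole “organisms” on behavior, one selection event per generation.

```python
import random

TARGET = 3.0

def loss(theta):
    # Stand-in objective: squared distance of the policy parameter from a target.
    return (theta - TARGET) ** 2

def sgd_training(outer_steps, inner_steps_per_outer=0, lr=0.1):
    """Outer optimizer selects directly over the policy parameter via gradients.
    The outer/inner ratio is just a hyperparameter we pick; with
    inner_steps_per_outer=0 the inner optimizer is removed entirely."""
    theta = 0.0
    for _ in range(outer_steps):
        for _ in range(inner_steps_per_outer):
            pass  # placeholder for runtime inner optimization (planning, search, ...)
        grad = 2.0 * (theta - TARGET)  # d/dtheta of (theta - TARGET)^2
        theta -= lr * grad             # direct selection over the policy itself
    return theta

def evolutionary_selection(generations, pop_size=20, mutation_sd=0.3):
    """Evolution-style loop: no gradients, no direct selection over parameters.
    It can only evaluate whole organisms on behavior and keep the best,
    with exactly one selection event per generation."""
    pop = [random.gauss(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        survivors = pop[: pop_size // 4]  # truncation selection on behavior
        pop = [s + random.gauss(0.0, mutation_sd)
               for s in survivors for _ in range(4)]
    return min(pop, key=loss)

if __name__ == "__main__":
    print("SGD, inner optimizer removed:", sgd_training(outer_steps=100))
    print("Evolution, 100 generations:  ", evolutionary_selection(100))
```

On this (trivially easy, convex) toy problem both loops find the optimum; the point is purely structural: direct gradient access to the policy and a freely chosen outer/inner step ratio are knobs the SGD loop has and the evolutionary loop lacks.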