Thanks, you’ve put a deep, vague unease of mine into succinct form.
And of course, now I come to think about it, a very wise man said it even more succinctly a very long time ago:
Adaptation Executors, Not Fitness Maximizers.
I don’t think “Adaptation Executors vs. Fitness Maximizers” is a good way of thinking about this. All of the behaviors described in the post can be understood as a consequence of next-word prediction; it’s just that what extremely good next-word prediction looks like in practice is counterintuitive. There’s no need to posit a difference between inner and outer objective.
Is there a reason to suspect an exact match between inner and outer objective?
An exact match? No. But the observations in this post don’t point towards any particular mismatch, because the behaviors described would be seen even if the inner objective was perfectly aligned with the outer.