A system that looks like “actively try to make paperclips no matter what” seems like the sort of thing that an evolution-like process could spit out pretty easily. A system that looks like “robustly maximize paperclips no matter what” maybe not so much.
I expect it’s a lot easier to make a thing that consistently executes actions which have worked in the past than to make a thing that models the world well enough to compute expected value over a bunch of candidate plans, pick the best one, and have that actually work (especially if there are other agents in the world, even if those other agents aren’t hostile—see the winner’s curse).
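The winner’s-curse point can be made concrete with a toy simulation (my own sketch, not from the post): if an agent estimates the value of many plans with noise and picks the one with the highest estimate, the chosen plan’s estimated value is systematically optimistic, even when every plan is equally good.

```python
import random

random.seed(0)

def choose_best_plan(n_plans=50, noise=1.0):
    """Return the estimated value of the plan with the highest estimate.

    Every plan has the same true value, 0; each estimate is the true
    value plus zero-mean Gaussian noise. Selecting the max of noisy
    estimates therefore overestimates the chosen plan's true value.
    """
    estimates = [random.gauss(0.0, noise) for _ in range(n_plans)]
    return max(estimates)  # true value of any plan is 0

# Average the overestimate across many independent decisions.
trials = [choose_best_plan() for _ in range(2000)]
avg_overestimate = sum(trials) / len(trials)
print(f"average overestimate of the chosen plan: {avg_overestimate:.2f}")
```

The printed overestimate is well above zero: the more plans compared (and the noisier the estimates), the larger the bias, which is part of why “model the world and maximize expected value” is harder to get right than it sounds.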