“harness” is doing a lot of work there. If incoherent search processes are actually superior, then VNM agents are not the type of pattern that is evolutionarily stable, so no “harnessing” is possible in the long term; it would be more like a “dissolving into”.
Unless you’re using “VNM agent” to mean something like “the definitionally best agent”, in which case sure, but a VNM agent is a pretty precise type of algorithm defined by axioms that are equivalent to saying it is perfectly resistant to being Dutch booked.
Resistance to Dutch booking is cool and seems valuable, but it's not something I’d spend limited compute resources on to get six nines of reliability. Evolution seems to agree, so far: the successful organisms we observe in nature, from bacteria to humans, are not VNM agents and in fact are easily Dutch booked. The question is whether this changes as evolution progresses and intelligence increases.
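To make “Dutch booked” concrete, here is a minimal sketch of a money pump. It only illustrates a violation of transitivity, which is one of the VNM axioms, not the full set. The function name `money_pump`, the fee, and the preference sets are all hypothetical and chosen just for illustration.

```python
# Hypothetical illustration: an agent with cyclic preferences (A > B > C > A)
# pays a small fee for every "upgrade" and can be cycled for as long as it has money.

def money_pump(prefers, start_item, offers, fee, wealth):
    """Offer items in sequence; the agent pays `fee` to swap whenever it
    strictly prefers the offered item to the one it currently holds."""
    item = start_item
    for offered in offers:
        if (offered, item) in prefers and wealth >= fee:
            item = offered
            wealth -= fee
    return item, wealth

# (x, y) means "x is strictly preferred to y".
cyclic = {("A", "B"), ("B", "C"), ("C", "A")}      # violates transitivity
transitive = {("A", "B"), ("B", "C"), ("A", "C")}  # ordinary ranking A > B > C

offers = ["B", "A", "C"] * 3  # the trader runs the same three offers three times

cyclic_result = money_pump(cyclic, "C", offers, fee=1, wealth=100)
transitive_result = money_pump(transitive, "C", offers, fee=1, wealth=100)

print(cyclic_result)      # ('C', 91): back where it started, nine fees poorer
print(transitive_result)  # ('A', 98): paid twice, holds its top item, then stops trading
```

The cyclic agent loses a fixed amount per lap and never ends up better off, whereas the transitive agent's losses are bounded. How much an organism should pay to close off this kind of exploit is exactly the compute tradeoff in question.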
Minor points just to get them out of the way:
I think Bayesian optimization still makes sense with infinite compute if you have limited data (infinite compute doesn’t imply perfect knowledge, you still have to run experiments in the world outside of your computer).
The reason I specified evolutionary search is that it’s the claim I see Lehman & Stanley as making: that algorithms pursuing simple objectives tend not to be robust in an evolutionary sense. I’m less confident making claims about broader classes of optimization, but I’m not intentionally excluding them.
Meta point: it feels like we’re bouncing between incompatible and partly-specified formalisms before we even know what the high level worldview diff is.
To that end, I’m curious what you think the implications of the Lehman & Stanley hypothesis would be—supposing it were shown even for architectures that allow planning to search, which I agree their paper does not do. So yes you can trivially exhibit a “goal-oriented search over good search policies” that does better than their naive novelty search, but what if it turns out a “novelty-oriented search over novelty-oriented search policies” does better still? Would this be a crux for you, or is this not even a coherent hypothetical in your ontology of optimization?