Note: I think this is a better-written version of what I was discussing when I revisited selection versus control, here: https://www.lesswrong.com/posts/BEMvcaeixt3uEqyBk/what-does-optimization-mean-again-optimizing-and-goodhart (The other posts in that series seem relevant.)
I didn’t think about the structure that search-in-territory / model-based optimization allows, but in those posts I mention that most optimization iterates back and forth between search-in-model and search-in-territory. A key feature, which I think you’re ignoring here, is the cost of samples / iteration.