There is a useful generality axis and a useful optimality axis, and you can meaningfully progress along both at the same time. If you think the no-free-lunch theorems disprove this, then you are confused about what those theorems say.
Whether or not an axis is “useful” depends on your utility function.
If you only care about compressing certain books from The Library of Babel, then “general optimality” is real — but if you value them all equally, then “general optimality” is fake.
When real, the meaning of “general optimality” depends on which books you deem worthy of consideration.
Within an analysis restricted to the cluster of sequences typical of the Internet, the term “general optimality” may be usefully applied to a predictive model. Such an analysis is unfit to reason about search over a design space, unless that design space excludes all out-of-scope sequences.
That is equivalent to saying that if you only care about a situation where none of your observations correlate with any of your other observations, and none of your actions interact with any of your observations, then your observations are valueless. That is a true but empty statement, and it doesn’t meaningfully affect whether there is an optimality axis along which it is possible to do better.
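
For anyone who wants to poke at the disagreement concretely, here is a toy sketch (not anything from the posts above). It assumes a made-up four-point domain, three possible values, and two hypothetical search strategies, `left_to_right` and `adaptive`, neither of which ever re-queries a point. Averaged over every possible function, the two should score identically at every budget, which is the "value all books equally" case. Restricted to a structured subset (here, non-decreasing functions, an arbitrary choice for illustration), one strategy pulls ahead.

```python
from itertools import product

DOMAIN = range(4)      # hypothetical toy search space
VALUES = (0, 1, 2)     # hypothetical objective values


def left_to_right(history):
    """Query points 0, 1, 2, ... in order, ignoring what was observed."""
    return len(history)


def adaptive(history):
    """Start at the right end, then pick the next point based on the last value seen."""
    seen = {x for x, _ in history}
    unseen = [x for x in DOMAIN if x not in seen]
    if not history:
        return unseen[-1]
    last_value = history[-1][1]
    return unseen[-1] if last_value > 0 else unseen[0]


def best_after(search, f, budget):
    """Best objective value found by `search` on function `f` within `budget` queries."""
    history = []
    for _ in range(budget):
        x = search(history)
        history.append((x, f[x]))
    return max(y for _, y in history)


def mean_score(search, functions, budget):
    """Average of best_after over a collection of functions."""
    return sum(best_after(search, f, budget) for f in functions) / len(functions)


# Every function from DOMAIN to VALUES, each encoded as a tuple of outputs.
all_functions = list(product(VALUES, repeat=len(DOMAIN)))

# A structured subset: functions that never decrease from left to right.
monotone = [f for f in all_functions if list(f) == sorted(f)]

if __name__ == "__main__":
    for budget in range(1, len(DOMAIN) + 1):
        print(
            budget,
            mean_score(left_to_right, all_functions, budget),
            mean_score(adaptive, all_functions, budget),
            mean_score(left_to_right, monotone, budget),
            mean_score(adaptive, monotone, budget),
        )
```

The first two printed columns (all functions) match at every budget, while the last two (monotone functions only) differ at small budgets. Which subset counts as the one worth caring about is exactly the utility-function question raised above; the code takes no side on that.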