I was imagining the case where O(q−1) is too slow, i.e. where we want the AI to actually perform a search.
The second paragraph is what I had in mind. Note that in this case you are maximizing over learnable cost functions rather than all cost functions.
I was imagining the case where O(q−1) is too slow, i.e. where we want the AI to actually perform a search.
The second paragraph is what I had in mind. Note that in this case you are maximizing over learnable cost functions rather than all cost functions.