No Free Lunch theorems only apply when the distribution over environments is at maximum entropy. Generally intelligent systems (e.g. AIXI) are possible because the simplicity prior is useful in our own universe (in which entropy is not at a maximum). Instrumental convergence isn’t at the root of intelligence; simplicity is.
I think “maximum entropy” is mostly just another way of saying “nothing in common across tasks”. One can probably artificially construct counterexamples, but the cases that show up in reality seem to follow that structure.
As an example, consider two tasks with no common subgoals: say factoring large integers and winning at Go.
Existing general-purpose methods for winning at Go abstract the state space and action space down to what is specified by the rules, i.e. a 19x19 grid where you can place black or white stones under certain conditions. Thus a wide variety of different Go games will, for instance, start with the same state according to these general-purpose methods, and therefore be reduced to common subgoals.
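To make that concrete, here is a minimal sketch (in Python, purely illustrative, not any real engine's representation) of the shared abstraction: the rules fix a 19x19 board, two colors, and conditions for placing stones, so every Go game starts from the same state.

```python
# A toy version of the shared abstraction general-purpose Go methods use.
EMPTY, BLACK, WHITE = 0, 1, 2
SIZE = 19

def initial_state():
    """The common starting point of every Go game under this abstraction."""
    return [[EMPTY] * SIZE for _ in range(SIZE)]

def candidate_moves(board):
    """Grossly simplified legality (ignoring ko and suicide): any empty
    point. A real engine refines this, but it refines one shared
    representation that covers all Go games."""
    return [(r, c) for r in range(SIZE) for c in range(SIZE)
            if board[r][c] == EMPTY]
```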
They go further. There are patterns such as “ladders” that can be learned to inform strategy in many different states, and there are simple rules that can be used to figure out the winner from a final state.
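And a toy version of those “simple rules to figure out the winner from a final state”: naive area scoring, which assumes all dead stones have already been removed (real scoring requires this too). Again just a sketch.

```python
# Naive area scoring: stones count for their color, and each empty region
# bordered by only one color counts for that color.
EMPTY, BLACK, WHITE = 0, 1, 2  # same encoding as the previous sketch

def area_score(board):
    size = len(board)
    black = white = 0
    seen = set()
    for r in range(size):
        for c in range(size):
            if board[r][c] == BLACK:
                black += 1
            elif board[r][c] == WHITE:
                white += 1
            elif (r, c) not in seen:
                # Flood-fill the empty region and note which colors border it.
                region, borders, stack = 0, set(), [(r, c)]
                while stack:
                    y, x = stack.pop()
                    if (y, x) in seen:
                        continue
                    seen.add((y, x))
                    region += 1
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < size and 0 <= nx < size:
                            if board[ny][nx] == EMPTY:
                                stack.append((ny, nx))
                            else:
                                borders.add(board[ny][nx])
                if borders == {BLACK}:
                    black += region
                elif borders == {WHITE}:
                    white += region
    return black, white  # larger area (after komi) wins
```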
These are the sorts of things that make it possible to create general-purpose Go algorithms.
Now you are right that these things don’t show up with factoring large integers at all. Under my model, this predicts that the algorithms we use for playing Go are basically distinct from the algorithms we use for factoring large integers, which also seems true. That said, there are some common subgoals between factoring large integers and winning at Go, such as having a lot of computation power; and indeed we use computers for both.
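For contrast, here is a textbook factoring subroutine, Pollard’s rho. Nothing in it resembles board abstractions or ladder patterns; the structure it exploits is number-theoretic, not spatial.

```python
import math
import random

def pollard_rho(n):
    """Return a nontrivial factor of a composite n (probabilistic)."""
    if n % 2 == 0:
        return 2
    while True:
        x = random.randrange(2, n)
        c = random.randrange(1, n)
        y, d = x, 1
        while d == 1:
            x = (x * x + c) % n      # tortoise: one step
            y = (y * y + c) % n      # hare: two steps
            y = (y * y + c) % n
            d = math.gcd(abs(x - y), n)
        if d != n:                   # on failure, retry with new parameters
            return d

# e.g. pollard_rho(8051) returns 83 or 97
```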
Imagine we are trying to find an algorithm that will excel at both of these while running on a Turing machine. There are no real-world resources to acquire, so instrumental convergence isn’t even relevant. However, an algorithm that assumes a simplicity prior (like AIXI) will still outperform one that doesn’t (say, one that samples all possible Go-playing/number-factoring algorithms and picks whichever performs best).
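To gesture at why, here is a toy in which everything (the hypothesis format, the function names) is invented for illustration: weight each hypothesis consistent with the data by 2^-(description length), Solomonoff-style, instead of treating all candidates uniformly.

```python
# Toy simplicity prior: description length stands in for the
# Kolmogorov-complexity weighting AIXI uses.

def simplicity_posterior(hypotheses, observations):
    """hypotheses: list of (description, predict_fn) pairs.
    Returns normalized weights over hypotheses that fit the observations."""
    weights = {}
    for desc, predict in hypotheses:
        if all(predict(x) == y for x, y in observations):
            weights[desc] = 2.0 ** -len(desc)
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

# Both hypotheses below fit the data, but the shorter one gets almost all
# of the posterior mass, so its prediction dominates on unseen inputs.
hyps = [
    ("x*2", lambda x: x * 2),
    ("x*2 if x < 100 else 0", lambda x: x * 2 if x < 100 else 0),
]
print(simplicity_posterior(hyps, [(1, 2), (3, 6)]))
```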
I think AIXI would underperform the algorithm of “figure out whether you are playing Go or factoring numbers, and then either play Go or factor the number”? Not sure what you are getting at here.
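(The dispatch algorithm in question is essentially two lines; `factor` and `play_go` are placeholders for the specialized solvers sketched above.)

```python
def factor(n):
    """Placeholder for a dedicated factoring routine (e.g. Pollard's rho)."""
    ...

def play_go(board):
    """Placeholder for a dedicated Go engine."""
    ...

def solve(task_input):
    # Detect the task from the input's shape, then hand off to a specialist.
    if isinstance(task_input, int):
        return factor(task_input)
    return play_go(task_input)
```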