New statement: given unlimited compute-power, a cross-domain optimization algorithm is simple. Agreed?
First, I’m not particularly interested in infinities. Truly unlimited computing power implies, for example, that you can just do an exhaustive brute-force search through the entire solution space and be done in an instant. Simple, yes, but not very meaningful.
Second, no, I do not agree, because you’re sweeping under the rug the complexities of, for example, applying your cost function to different domains. You can construct sufficiently simple optimizers; it’s just that they won’t be very… intelligent.
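To pin down the first point: with truly unlimited computing power, “optimization” collapses into scoring every candidate and keeping the best one. Here is a minimal sketch of that degenerate optimizer, written in Coq only to match the snippet further down; brute_force, toy_space and the squaring score are illustrative names and choices of mine, nothing more.

Require Import ZArith List.
Import ListNotations.
Open Scope Z_scope.

(* Illustrative only: with no resource limits, "optimization" can be nothing
   more than exhaustive search.  brute_force scores every candidate in a
   (finite, toy) solution space and keeps the highest-scoring one. *)
Definition brute_force {A : Type} (score : A -> Z) (space : list A) (default : A) : A :=
  fold_left (fun best c => if Z.ltb (score best) (score c) then c else best)
            space default.

Definition toy_space : list Z := [3; -7; 42].
(* Compute brute_force (fun x => x * x) toy_space 0.   (* evaluates to 42 *) *)

Simple, yes, but all the interesting work is hiding in the score function and in enumerating the space.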
What cost function? It’s a reinforcement learner.
cost function = utility function = fitness function = reward (all with appropriate signs)
Right, but a reinforcement learner like AIXI has no fixed cost function that it has to somehow shoehorn into different computational/conceptual domains. How the environment responds to AIXI’s actions and how it rewards AIXI are learned phenomena, so the only planning algorithm is expectimax. The implicit “reward function” being learned might be simple or might be complicated, but that doesn’t matter: either way, AIXI learns it by updating its probability distribution over Turing machine programs.
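To make that concrete, here is a minimal sketch of planning by pure expectimax over a learned model. It is not AIXI itself: the model here is memoryless, the “probabilities” are plain integer weights, and Act, Percept, Model and the fixed horizon are simplifying assumptions of mine. The point is only that reward enters as part of the predicted percept, not as a function the planner evaluates over the world.

Require Import ZArith List.
Import ListNotations.
Open Scope Z_scope.

(* A toy, memoryless stand-in for a learned environment model: for each
   action it predicts weighted (observation, reward) outcomes. *)
Inductive Act : Type := act_a | act_b.
Definition Percept : Type := (nat * Z)%type.       (* observed symbol, reward *)
Definition Outcome : Type := (Z * Percept)%type.   (* weight, predicted percept *)
Definition Model : Type := Act -> list Outcome.

Definition acts : list Act := [act_a; act_b].

Fixpoint zmax (d : Z) (l : list Z) : Z :=
  match l with
  | [] => d
  | x :: xs => Z.max x (zmax d xs)
  end.

(* Weight-scaled expectimax to a fixed horizon: pick the action that maximizes
   the weighted sum of immediate reward plus the value of continuing. *)
Fixpoint expectimax (m : Model) (horizon : nat) : Z :=
  match horizon with
  | O => 0
  | S h =>
      zmax 0
        (map (fun a =>
                fold_left (fun acc '(w, (_, r)) => acc + w * (r + expectimax m h))
                          (m a) 0)
             acts)
  end.

AIXI proper replaces the toy model with the Solomonoff mixture over programs and plans to a finite horizon, but the planner itself stays this simple.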
The “cost function” here is how each state of the world (=environment) gets converted to a single number (=reward). That does not look simple to me.
Again, it doesn’t get converted at all. To use the terminology of machine learning, reward is not a function computed over the feature vector; it is instead represented as a feature itself.
Instead of:

reward = utility_function(world)

You have:

Inductive WorldState w : Type :=
| world : w -> integer -> WorldState w.
With the w being an arbitrary data-type representing the symbol observed on the agent’s input channel and the integer being the reward signal, similarly observed on the agent’s input channel. A full WorldState w datum is then received on the input channel in each interaction cycle.

Since AIXI’s learning model is to perform Solomonoff Induction to thus find the Turing machine that most-probably generated all previously-seen input observations, the task of “decoding” the reward is thus performed as part of Solomonoff Induction.
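To spell out the “reward is a feature, not a computed quantity” point, here is a self-contained restatement of the type above, with Coq’s Z standing in for integer; reward_of is just an illustrative name of mine, not part of AIXI.

Require Import ZArith.

(* Restating the type above with Z in place of "integer", purely for
   illustration.  The agent never evaluates a utility function over the
   world-state: the reward arrives as a field of the received datum, and
   "decoding" it is a projection, not a computation. *)
Inductive WorldState (w : Type) : Type :=
| world : w -> Z -> WorldState w.

Arguments world {w} _ _.

Definition reward_of {w : Type} (s : WorldState w) : Z :=
  match s with
  | world _ r => r
  end.

(* Compute reward_of (world 3 42%Z).   (* evaluates to 42%Z *) *)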
So where, then, is the reward coming from? What puts it into AIXI’s input channel?
In AIXI’s design? A human operator.
Really? To remind you, we’re discussing this in the context of a general-purpose super-intelligent AI which, if we get a couple of bits wrong, might just tile the universe with paperclips and possibly construct a hell for all the simulated humans who ever lived, just for kicks. And how does that AI know what to do?
A human operator.
X-D
On a bit more serious note, defining a few of the really hard parts as “somebody else’s problem” does not mean you solved the issue. Remember, this started by you claiming that intelligence is very simple.
You’ve wasted five replies when you should have just said at the beginning, “I don’t believe cross-domain optimization algorithms can be simple and if you try to show me how AIXI works, I’ll just change what I mean by ‘simple’.”
What a jerk.
That’s not true. Cross-domain optimization algorithms can be simple, it’s just that when they are simple they can hardly be described as intelligent. What I don’t believe is that intelligence is nothing but a cross-domain optimizer with a lot of computing power.
I accept your admission of losing :-P
GLUTs (giant lookup tables) are simple too. Most people think they are not intelligent, and everyone thinks that interesting ones can’t exist in our universe. Using “is” to mean “is according to an unrealisable theory” is not the best of habits.
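For concreteness, a GLUT is nothing but an exhaustive table from possible input histories to outputs, with no machinery in between. A minimal sketch, again in Coq for consistency with the earlier snippets; History, Action, the three-entry table and the default action are all illustrative, and a table covering every possible history is exactly the thing that cannot fit in our universe.

Require Import List Arith.
Import ListNotations.

(* A GLUT: behaviour is nothing but a lookup of the full observation history
   in a giant table.  Here the table has three entries; a "real" one would
   need an entry for every history the agent could ever see. *)
Definition Observation : Type := nat.
Definition History : Type := list Observation.
Inductive Action : Type := press_lever | do_nothing.

Definition GLUT : Type := list (History * Action)%type.

Fixpoint history_eqb (h1 h2 : History) : bool :=
  match h1, h2 with
  | [], [] => true
  | x :: xs, y :: ys => Nat.eqb x y && history_eqb xs ys
  | _, _ => false
  end.

(* Look the current history up in the table; the default covers missing entries. *)
Fixpoint act (table : GLUT) (h : History) : Action :=
  match table with
  | [] => do_nothing
  | (h', a) :: rest => if history_eqb h h' then a else act rest h
  end.

Definition example_table : GLUT :=
  [([0], press_lever); ([0; 1], do_nothing); ([0; 1; 1], press_lever)].

(* Compute act example_table [0; 1].   (* evaluates to do_nothing *) *)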