Again, it doesn’t get converted at all. To use the terminology of machine learning, it’s not a function computed over the feature-vector; the reward is instead represented as a feature itself.
Instead of:
reward = utility_function(world)
You have:
Require Import ZArith.

Inductive WorldState (w : Type) : Type :=
| world : w -> Z -> WorldState w.
Here w is an arbitrary data type representing the symbol observed on the agent’s input channel, and the integer (Z) is the reward signal, similarly observed on that channel. A full WorldState w datum is then received on the input channel in each interaction cycle.
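As a minimal sketch of what this means in practice (the helper name reward below is illustrative, not part of the AIXI formalism), the reward is obtained by simply projecting it out of the received datum; no utility function over a world model is computed:

(* Illustrative helper: the reward is read directly off the input datum. *)
Definition reward {w : Type} (s : WorldState w) : Z :=
  match s with
  | world _ r => r
  end.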
Since AIXI’s learning model is to perform Solomonoff Induction, finding the Turing machine that most probably generated all previously seen input observations, the task of “decoding” the reward is performed as part of that induction.
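Under the same illustrative assumptions, tallying the reward the agent has received over its input history is just a fold over the list of observed WorldState values, using the projection above rather than any utility function; which program generated that history is the part left to the induction:

Require Import List.

(* Illustrative helper: total reward received over an input history. *)
Definition total_reward {w : Type} (history : list (WorldState w)) : Z :=
  fold_right (fun s acc => (reward s + acc)%Z) 0%Z history.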
So where, then, is reward coming from? What puts it into the AIXI’s input channel?

In AIXI’s design? A human operator.

Really? To remind you, we’re discussing this in the context of a general-purpose super-intelligent AI which, if we get a couple of bits wrong, might just tile the universe with paperclips and possibly construct a hell for all the simulated humans who ever lived, just for kicks. And how does that AI know what to do?
A human operator.
X-D
On a bit more serious note, defining a few of the really hard parts as “somebody else’s problem” does not mean you solved the issue. Remember, this started by you claiming that intelligence is very simple.
Remember, this started by you claiming that intelligence is very simple.
You’ve wasted five replies when you should have just said at the beginning, “I don’t believe cross-domain optimization algorithms can be simple and if you try to show me how AIXI works, I’ll just change what I mean by ‘simple’.”
What a jerk.

when you should have just said at the beginning, “I don’t believe cross-domain optimization algorithms can be simple
That’s not true. Cross-domain optimization algorithms can be simple, it’s just that when they are simple they can hardly be described as intelligent. What I don’t believe is that intelligence is nothing but a cross-domain optimizer with a lot of computing power.
I accept your admission of losing :-P

GLUTs are simple too. Most people think they are not intelligent, and everyone thinks that interesting ones can’t exist in our universe. Using “is” to mean “is according to an unrealisable theory” is not the best of habits.