So here’s a value learning scheme: try to squish the world and natural language into the same latent space, just with different input/output functions.
What’s the input-output function in the two cases?
I’m also generally confused about why you’re calling this thing “two linked models” rather than “one model”. For example, I would say that a brain has one world model that is interlinked with speech and vision and action, etc. Right?
What’s the input-output function in the two cases?
Good question :) We need the AI to have a persistent internal representation of the world so that it’s not limited to preferences directly over sensory inputs. Many possible functions would work, and in various places (like comparison to CIRL), I’ve mentioned that it would be really useful to have some properties of a hierarchical probabilistic model, but as an aid to imagination I mostly just thought of a big ol’ RNN.
We want the world model to share associations between words and observations, but we don’t want it to share dynamics (one text-state following another is a very different process from one world-state following another). It might be sufficient for the encoding/decoding functions from observations to be RNNs, and the encoding/decoding functions from text just to be non-recurrent neural networks on patches of text.
That is, if we call the text $T$, the observations (at time $t$) $O_t$, and the internal state $S_t$, we'd have the encoding function $(O_t, S_t) \to S_{t+1}$, decoding something like $(O_t, S_t) \to O_{t+1}$, and also $S \to T$ and $T \to S$. And then you could compose these functions to get things like $(O_t, S_t) \to T_{t+1}$. Does this answer your question, and do you think it brings new problems to light? I'm more interested in general problems or patterns than in problems specific to RNNs (like initialization of the state), because I'm sort of assuming that this is just a placeholder for future technology that would have a shot at learning a model of the entire world.
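To make that concrete, here's a minimal sketch in PyTorch of those four functions sharing one latent state. Everything in it (the sizes, the GRU cell, the feedforward text maps, the names) is an illustrative assumption on my part, not part of the proposal:

```python
# Minimal sketch of "one latent space, two input/output functions".
# All module choices and dimensions are hypothetical placeholders.
import torch
import torch.nn as nn

OBS_DIM, TEXT_DIM, STATE_DIM = 64, 128, 256  # illustrative sizes

class SharedLatentModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Recurrent world dynamics: (O_t, S_t) -> S_{t+1}
        self.encode_obs = nn.GRUCell(OBS_DIM, STATE_DIM)
        # Observation prediction: (O_t, S_t) -> O_{t+1}
        self.decode_obs = nn.Linear(OBS_DIM + STATE_DIM, OBS_DIM)
        # Non-recurrent maps between text patches and the latent state:
        # T -> S and S -> T share the state space but not the dynamics.
        self.encode_text = nn.Sequential(
            nn.Linear(TEXT_DIM, STATE_DIM), nn.ReLU(),
            nn.Linear(STATE_DIM, STATE_DIM))
        self.decode_text = nn.Sequential(
            nn.Linear(STATE_DIM, STATE_DIM), nn.ReLU(),
            nn.Linear(STATE_DIM, TEXT_DIM))

    def step(self, obs, state):
        """One world-model step: consume O_t, produce S_{t+1} and a
        prediction of O_{t+1}."""
        next_state = self.encode_obs(obs, state)
        next_obs = self.decode_obs(torch.cat([obs, state], dim=-1))
        return next_state, next_obs

    def read(self, text):
        """T -> S: drop a text patch into the shared latent space."""
        return self.encode_text(text)

    def describe(self, obs, state):
        """Composition (O_t, S_t) -> T_{t+1}: step the world model,
        then decode the new state into a text patch."""
        next_state, _ = self.step(obs, state)
        return self.decode_text(next_state)
```

The point of the sketch is only that the text functions and the observation functions read and write the same state vector, while the recurrence lives exclusively on the observation side.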
For example, I would say that a brain has one world model that is interlinked with speech and vision and action, etc. Right?
Right. I sort of flip-flop on this, also calling it “one simultaneous model” plenty. If there are multiple “models” in here, it’s because different tasks use different subsets of its parts, and if we do training on multiple tasks, those subsets get trained together. But of course the point is that the subsets overlap.
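As a toy illustration of that overlap (reusing the hypothetical SharedLatentModel sketch above, with placeholder data and losses): two tasks each exercise a different subset of the modules, but both subsets include the recurrent core, so one backward pass trains the shared parts on both tasks at once.

```python
# Toy multi-task step: overlapping subsets of one model get trained together.
# Assumes SharedLatentModel and the dimension constants from the sketch above.
import torch
import torch.nn.functional as F

model = SharedLatentModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

obs = torch.randn(8, OBS_DIM)        # fake observation batch
state = torch.zeros(8, STATE_DIM)    # fake latent state
text = torch.randn(8, TEXT_DIM)      # fake text-patch targets

opt.zero_grad()

# Task 1 (prediction) uses encode_obs + decode_obs.
_, pred_obs = model.step(obs, state)
loss_world = F.mse_loss(pred_obs, obs)        # placeholder target

# Task 2 (description) uses encode_obs + decode_text: a different
# subset that overlaps with task 1 on the shared recurrent core.
pred_text = model.describe(obs, state)
loss_text = F.mse_loss(pred_text, text)       # placeholder target

# encode_obs sits in both computation graphs, so it receives
# gradients from both losses in this single backward pass.
(loss_world + loss_text).backward()
opt.step()
```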