Here are some remarks for anybody who wants to investigate the
problem of “Learned world-models with hardcoded pieces”; I hope they
will be useful.
My main message is that when thinking about this problem, you should
be very aware that there are fashions in AI research. The current
fashion is all about ML, about creating learned world models. In the
most extreme expression of this fashion, represented by the essay
The Bitter Lesson, even the act of hand-coding a useful prior into the
learned world model is viewed with suspicion: it is seen as
domain-specific or benchmark-specific tweaking that will not teach us
anything about making the next big ML breakthrough.
Fashion used to be different: there were times when AI pioneers built
robots with fully hardcoded world models and were very proud of it.
Hardcoding parts of the world model never went out of fashion in the
applied AI and cyber-physical systems community, e.g. with people who
build actual industrial robots, and people who want to build safe
self-driving cars.
Now, you are saying that “My default presumption is that our AGIs will
learn a world-model from scratch”, i.e. that they will learn their full
world model from scratch. In this, you are following the
prevailing fashion in theoretical (as opposed to applied) ML. But if
you follow that fashion it will blind you to a whole class of
important solutions for building learned world models with hardcoded
pieces.
It is very easy, an almost routine software engineering task, to build
predictive world models that combine both hardcoded and learned
pieces. One example is to implement the model as a Bayesian network or
a causal graph. The key thing to note here is that the probability
distribution/table for each graph node (or the structural function for
each node, in the case of a causal graph) can be produced either by
machine learning from a training set or simply hard-coded by the
programmer. See my sequence Counterfactual Planning for some examples
of the design freedom this creates when adding safety features to an
AGI agent’s world model.
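To make this concrete, here is a minimal sketch in Python of what I mean by mixing the two kinds of pieces. The node names and the toy training data are invented purely for this illustration (they are not from the sequence): one node's conditional probability table is estimated from logged data, and another node's structural function is simply written down by the programmer.

```python
import random
from collections import defaultdict

def learn_cpt(samples, parent, child):
    """Estimate P(child | parent) from observed (parent, child) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in samples:
        counts[row[parent]][row[child]] += 1
    return {p: {c: n / sum(cs.values()) for c, n in cs.items()}
            for p, cs in counts.items()}

# Learned piece: P(floor_wet | rain), estimated from a logged training set.
training_set = ([{"rain": True,  "floor_wet": True}]  * 80 +
                [{"rain": True,  "floor_wet": False}] * 20 +
                [{"rain": False, "floor_wet": False}] * 95 +
                [{"rain": False, "floor_wet": True}]  * 5)
learned_floor_wet = learn_cpt(training_set, "rain", "floor_wet")

# Hardcoded piece: the programmer simply writes this structural function.
def hardcoded_emergency_stop(floor_wet):
    return floor_wet  # designer-chosen safety rule, not learned from data

def sample_world(rain_prob=0.3):
    """Ancestral sampling through the mixed hardcoded/learned graph."""
    rain = random.random() < rain_prob              # exogenous node
    p_wet = learned_floor_wet[rain].get(True, 0.0)  # learned node
    floor_wet = random.random() < p_wet
    stop = hardcoded_emergency_stop(floor_wet)      # hardcoded node
    return {"rain": rain, "floor_wet": floor_wet, "emergency_stop": stop}

print(sample_world())
```

The point is only that nothing special is needed to mix the two: the sampling machinery treats a learned conditional table and a hand-written function in exactly the same way.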
Good luck with your further research! Feel free to reach out if you
want to discuss this problem of mixed-mode model construction
further.
Now, you are saying that “My default presumption is that our AGIs will learn a world-model from scratch”, i.e. that they will learn their full world model from scratch. In this, you are following the prevailing fashion in theoretical (as opposed to applied) ML. But if you follow that fashion it will blind you to a whole class of important solutions for building learned world models with hardcoded pieces.
Just FYI, for me personally this presumption comes from my trying to understand human brain algorithms, on the theory that people could plausibly build AGIs using similar algorithms. After all, that’s definitely a possible way to build AGI, and a lot of people are trying to do that as we speak. I think the world-model exists in the cortex, and I think that the cortex is initialized from random weights (or something equivalent). See discussion of “learning-from-scratch-ism” here.
there were times when AI pioneers built robots with fully hardcoded world models and were very proud of it.
So, the AI pioneers wrote into their source code whether each door in the building is open or closed? And if a door is closed when the programmer expected it to be open, then the robot would just slam right into the closed door?? That doesn’t seem like something to be proud of! Or am I misunderstanding you?
It is very easy, an almost routine software engineering task, to build predictive world models that combine both hardcoded and learned pieces. One example is to implement the model as a Bayesian network or a causal graph. The key thing to note here is that the probability distribution/table for each graph node (or the structural function for each node, in the case of a causal graph) can be produced either by machine learning from a training set or simply hard-coded by the programmer.
If I understand you correctly, you’re assuming that the programmer will manually set up a giant enormous Bayesian network that represents everything in the world—from dowsing rods to infinite series—and they’ll allow some aspects of the probabilities and/or connections in that model to be learned, and they’ll manually lock in other probabilities and/or connections. Is that correct?
If so, the part where I’m skeptical is the first step, where the programmer puts in nodes for everything that the robot will ever know about the world. I don’t think that approach scales to AGI. I think the robot needs to be able to put in new nodes, so that it can invent new concepts, walk into new rooms, learn to handle new objects, etc.
So then we start with the 5,000,000 nodes that the programmer put in, but the robot starts adding in its own nodes. Maybe the programmer’s nodes are labeled (“infinite series”, “dowsing rods”, etc.), but the robot’s are by default unlabeled (“new_node_1” is a certain kind of chess tactic that it just thought of, “new_node_2” is this particular piece of Styrofoam that’s stuck to my robot leg right now, etc. etc.)
And then we have problems like: how do we ensure that when the robot comes across an infinite series, it uses the “infinite series” node that the programmer put in, rather than building its own new node? Especially if the programmer’s “infinite series” node doesn’t quite capture all the rich complexity of infinite series that the robot has figured out? Or conversely, how do we ensure that the robot doesn’t start using the hardcoded “infinite series” node for the wrong things?
The case I’m mainly interested in is a hardcoded human model, and then the concerns would be mainly anthropomorphization (if the AGI applies the hardcoded human model to teddy bears, then it would wind up trading off the welfare of humans against the “welfare” of teddy bears) and dehumanization (where the AGI reasons about humans by building its own human model from scratch, i.e. out of new nodes, ignoring the hardcoded human model, the same way that it would understand a complicated machine that it just invented. And then that screws up our attempt to install pro-social motivations, which involved the hardcoded human model. So it disregards the welfare of some or all humans.)
Just FYI, for me personally this [from scratch] presumption comes from my trying to understand human brain algorithms.
Thanks for clarifying. I see how you might apply a ‘from scratch’
assumption to the neocortex. On the other hand, if the problem is to
include both learned and hard-coded parts in a world model, one might
take inspiration from something like the visual cortex: while the
initial weights of visual cortex neurons may be random (I am not sure
whether this is biologically true), the broad neural wiring has been
hardcoded by evolution. In AI terminology, this wiring represents a
hardcoded prior, or (if you want to take the stance that you are
learning without a prior) a hyperparameter.
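To put that in ML terms with a minimal numpy sketch (toy sizes and invented names, purely illustrative): the connectivity pattern below plays the role of the wiring hardcoded by evolution, while the weights on those connections start out random and are the part that would be learned from scratch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hardcoded "wiring": each hidden unit only sees a local 3-pixel patch of
# the input, a convolution-like connectivity pattern fixed by the designer
# (standing in for wiring fixed by evolution).
INPUT_SIZE = 8
PATCH = 3
wiring = [list(range(i, i + PATCH)) for i in range(INPUT_SIZE - PATCH + 1)]

# Learned-from-scratch part: the weights on those fixed connections start
# random and would be adjusted by training; the wiring itself never changes.
weights = rng.normal(size=(len(wiring), PATCH))

def hidden_activations(x):
    # The connectivity (the prior / hyperparameter) is hardcoded;
    # only the weights are subject to learning.
    return np.array([w @ x[idx] for w, idx in zip(weights, wiring)])

print(hidden_activations(rng.normal(size=INPUT_SIZE)))
```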
So, the AI pioneers wrote into their source code whether each door in the building is open or closed? And if a door is closed when the programmer expected it to be open, then the robot would just slam right into the closed door?? That doesn’t seem like something to be proud of! Or am I misunderstanding you?
The robots I am talking about were usually not completely blind, but
they had very limited sensing capabilities. The point about
hardcoding here is that the processing steps which turned sensor signals
into world model details were often hardcoded. Other necessary world
model details for which no sensors were available would have to be
hardcoded as well.
If I understand you correctly, you’re assuming that the programmer will manually set up a giant enormous Bayesian network that represents everything in the world
I do not think you understand me correctly.
You are assuming that I am talking about hand-coding giant networks in
which each individual node encodes a single basic concept like a
dowsing rod, and then ML may even add more nodes dynamically. This is
not at all what the example networks I linked to look like, and not at
all how ML works on them.
Look, I included this link to the sequence to clarify exactly what I mean:
please click the link and take a look. The planning world causal graphs you
see there are not world models for toy agents in toy worlds; they are
plausible AGI agent world models. A single node typically represents
a truly giant chunk of current or future world state. The learned
details of a complex world are all inside the learned structural
functions, in what I call the model parameter L in the sequence.
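If it helps, here is a very rough Python sketch of the shape I have in mind. The names and the stubbed-out transition function are invented for this comment, so do not read it as the actual construction from the sequence: a hand-built graph with only a handful of coarse nodes, where all the learned complexity sits inside one opaque function L that is never decoded.

```python
# All learned detail about the world lives inside L; the surrounding graph
# (action node, reward node, rollout structure) is written by hand.

def L(world_state, action):
    """Opaque machine-learned transition function: one coarse node mapping a
    giant chunk of current world state plus an action to the next world
    state. Stubbed out here; in reality this would be a trained model."""
    return {"t": world_state["t"] + 1}

def hardcoded_reward(world_state):
    # Reward node written by the programmer; defining it never requires
    # looking inside L.
    return 1.0 if world_state["t"] <= 10 else 0.0

def rollout(policy, initial_state, horizon):
    """Evaluate the hand-built planning-world graph by ancestral sampling:
    state -> action -> next state, repeated for `horizon` steps."""
    s, total = initial_state, 0.0
    for _ in range(horizon):
        a = policy(s)   # policy node (could itself be learned or hardcoded)
        s = L(s, a)     # the single opaque learned node
        total += hardcoded_reward(s)
    return total

print(rollout(lambda s: "noop", {"t": 0}, horizon=5))
```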
The linked-to approach is not the only way to combine learned and hardcoded model parts, but I think it shows a very useful technique. My more general point is also that there are a lot of not-in-fashion historical examples that may offer further inspiration.
Well, I did try reading your posts 6 months ago, and I found them confusing, in large part because I was thinking about the exact problem I’m talking about here, and I didn’t understand how your proposal would get around that problem or solve it. We had a comment exchange here somewhat related to that, but I was still confused after the exchange … and it wound up on my to-do list … and it’s still on my to-do list to this day … :-P
I know all about that kind of to-do list.

Definitely, my sequence of 6 months ago is not about doing counterfactual planning by modifying somewhat opaque million-node causal networks that might be generated by machine learning. The main idea is to show planning world model modifications that you can apply even when you have no way of decoding opaque machine-learned functions.
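Purely as an illustration of that idea (again with invented names, not one of the actual safety constructions from the sequence): both planning worlds below reuse the same opaque learned transition function unchanged, and the modification consists of swapping a single hand-coded node, so it can be applied without any ability to look inside the learned function.

```python
def learned_transition(world_state, action):
    """Stand-in for an opaque machine-learned transition function that we
    cannot, and do not need to, decode."""
    pressed = world_state["stop_pressed"] or action == "press_stop"
    return {"t": world_state["t"] + 1, "stop_pressed": pressed}

def factual_stop_node(world_state):
    # Hand-coded node in the factual planning world: the stop signal is real.
    return world_state["stop_pressed"]

def counterfactual_stop_node(world_state):
    # Hand-coded replacement node: a planning world in which the stop signal
    # is (counterfactually) never raised. Only this node changes.
    return False

def plan_value(stop_node, actions):
    """Roll out a plan in a planning world defined by the chosen stop node."""
    s, value = {"t": 0, "stop_pressed": False}, 0.0
    for a in actions:
        s = learned_transition(s, a)  # same opaque learned piece in both worlds
        if stop_node(s):              # hand-coded piece differs between worlds
            break
        value += 1.0
    return value

actions = ["work", "press_stop", "work", "work"]
print(plan_value(factual_stop_node, actions))         # stops early
print(plan_value(counterfactual_stop_node, actions))  # plans as if no stop
```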