Now, you are saying that “My default presumption is that our AGIs will learn a world-model from scratch”, i.e. that they learn their full world model from scratch. In this, you are following the prevailing fashion in theoretical (as opposed to applied) ML. But if you follow that fashion it will blind you to a whole class of important solutions for building learned world models with hardcoded pieces.
Just FYI, for me personally this presumption comes from my trying to understand human brain algorithms, on the theory that people could plausibly build AGIs using similar algorithms. After all, that’s definitely a possible way to build AGI, and a lot of people are trying to do that as we speak. I think the world-model exists in the cortex, and I think that the cortex is initialized from random weights (or something equivalent). See discussion of “learning-from-scratch-ism” here.
there were times when AI pioneers built robots with fully hardcoded world models and were very proud of it.
So, the AI pioneers wrote into their source code whether each door in the building is open or closed? And if a door is closed when the programmer expected it to be open, then the robot would just slam right into the closed door?? That doesn’t seem like something to be proud of! Or am I misunderstanding you?
It is very easy, an almost routine software engineering task, to build predictive world models that combine both hardcoded and learned pieces. One example of building such a model is to implement it as a Bayesian network or a causal graph. The key thing to note here is that the probability distribution/table for each graph node (each structural function in the case of a causal graph) might be produced either by machine learning from a training set, or simply be hardcoded by the programmer.
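To make this concrete, here is a minimal toy sketch (my own illustration, not from either commenter): a two-node Bayesian network Rain → WetGrass in which the prior on Rain is hardcoded by the programmer while the conditional table for WetGrass is learned from data, and inference treats the two pieces identically.

```python
# Hypothetical toy example: a Bayesian network mixing a hardcoded
# probability table with a learned one.

# Hardcoded piece: the programmer fixes the prior P(Rain).
p_rain = {True: 0.2, False: 0.8}

# Learned piece: estimate P(WetGrass | Rain) from observed (rain, wet) pairs.
observations = [(True, True), (True, True), (True, False),
                (False, False), (False, False), (False, True)]

def learn_cpt(data):
    """Maximum-likelihood estimate of P(wet | rain) from counts."""
    cpt = {}
    for rain in (True, False):
        wets = [wet for r, wet in data if r == rain]
        cpt[rain] = sum(wets) / len(wets)
    return cpt

p_wet_given_rain = learn_cpt(observations)

# Inference does not care which table came from where: marginalize out Rain.
p_wet = sum(p_rain[r] * p_wet_given_rain[r] for r in (True, False))
print(round(p_wet, 4))  # -> 0.4
```

The point of the sketch is only that the same inference machinery consumes hardcoded and learned tables without distinguishing them.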
If I understand you correctly, you’re assuming that the programmer will manually set up a giant enormous Bayesian network that represents everything in the world—from dowsing rods to infinite series—and they’ll allow some aspects of the probabilities and/or connections in that model to be learned, and they’ll manually lock in other probabilities and/or connections. Is that correct?
If so, the part where I’m skeptical is the first step, where the programmer puts in nodes for everything that the robot will ever know about the world. I don’t think that approach scales to AGI. I think the robot needs to be able to put in new nodes, so that it can invent new concepts, walk into new rooms, learn to handle new objects, etc.
So then we start with the 5,000,000 nodes that the programmer put in, but the robot starts adding in its own nodes. Maybe the programmer’s nodes are labeled (“infinite series”, “dowsing rods”, etc.), but the robot’s are by default unlabeled (“new_node_1” is a certain kind of chess tactic that it just thought of, “new_node_2” is this particular piece of Styrofoam that’s stuck to my robot leg right now, etc. etc.)
And then we have problems like: how do we ensure that when the robot comes across an infinite series, it uses the “infinite series” node that the programmer put in, rather than building its own new node? Especially if the programmer’s “infinite series” node doesn’t quite capture all the rich complexity of infinite series that the robot has figured out? Or conversely, how do we ensure that the robot doesn’t start using the hardcoded “infinite series” node for the wrong things?
The case I’m mainly interested in is a hardcoded human model, and then the concerns would be mainly anthropomorphization (if the AGI applies the hardcoded human model to teddy bears, then it would wind up trading off the welfare of humans against the “welfare” of teddy bears) and dehumanization (where the AGI reasons about humans by building its own human model from scratch, i.e. out of new nodes, ignoring the hardcoded human model, the same way that it would understand a complicated machine that it just invented. And then that screws up our attempt to install pro-social motivations, which involved the hardcoded human model. So it disregards the welfare of some or all humans.)
Just FYI, for me personally this [from scratch] presumption comes from my trying to understand human brain algorithms.
Thanks for clarifying. I see how you might apply a ‘from scratch’ assumption to the neocortex. On the other hand, if the problem is to include both learned and hard-coded parts in a world model, one might take inspiration from things like the visual cortex: while the initial weights of visual cortex neurons may be random (I am not sure whether this is biologically true), the broad neural wiring has been hardcoded by evolution. In AI terminology, this wiring represents a hardcoded prior, or (if you want to take the stance that you are learning without a prior) a hyperparameter.
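The "hardcoded wiring, learned weights" split can be sketched in code (my own analogy, not from the comment): the connectivity pattern below is fixed by the designer, much like a convolutional architecture, while only the weights would be adjusted by learning.

```python
# Hypothetical sketch: fixed local wiring (the hardcoded prior) with
# freely adjustable weights (the learned part).
import random

def local_wiring(n_inputs, kernel_size=3):
    """Hardcoded prior: each output unit sees only a local window of inputs."""
    return [list(range(i, i + kernel_size))
            for i in range(n_inputs - kernel_size + 1)]

def forward(x, wiring, weights):
    """Apply shared weights over the fixed connectivity pattern."""
    return [sum(weights[j] * x[k] for j, k in enumerate(window))
            for window in wiring]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
wiring = local_wiring(len(x))                      # fixed, like evolved cortical wiring
weights = [random.gauss(0, 1) for _ in range(3)]   # random init, to be learned
y = forward(x, wiring, weights)
print(len(y))  # -> 3 output units, each wired to a local window
```

Here learning never touches `wiring`; in the analogy, that structure plays the role of what evolution hardcoded.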
So, the AI pioneers wrote into their source code whether each door in the building is open or closed? And if a door is closed when the programmer expected it to be open, then the robot would just slam right into the closed door?? That doesn’t seem like something to be proud of! Or am I misunderstanding you?
The robots I am talking about were usually not completely blind, but they had very limited sensing capabilities. The point about hardcoding here is that the processing steps which turned sensor signals into world model details were often hardcoded. Other necessary world model details, for which no sensors were available, would have to be hardcoded as well.
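As an illustration (a hypothetical robot, not any specific historical system), the two kinds of hardcoding mentioned above might look like this: a hand-written rule turning a raw sensor reading into a world model detail, and a detail with no sensor at all that is simply written in directly.

```python
# Hypothetical sketch of hardcoded sensor processing in an early robot.

DOOR_SONAR_THRESHOLD_M = 1.5  # hand-tuned by the programmer

def update_world_model(sonar_range_m):
    world_model = {
        "corridor_width_m": 2.0,  # no sensor for this: hardcoded outright
    }
    # Hardcoded processing step mapping a sensor signal to a model detail.
    world_model["door_open"] = sonar_range_m > DOOR_SONAR_THRESHOLD_M
    return world_model

print(update_world_model(3.0)["door_open"])  # -> True: long range, door is open
```

The world model itself is updated from sensing; what is hardcoded is the mapping from signal to model, plus whatever the sensors cannot see.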
If I understand you correctly, you’re assuming that the programmer will manually set up a giant enormous Bayesian network that represents everything in the world
I do not think you understand me correctly.
You are assuming I am talking about hand-coding giant networks where each individual node might encode a single basic concept like a dowsing rod, and then ML may even add more nodes dynamically. This is not at all what the example networks I linked to look like, and not at all how ML works on them.
Look, I included this link to the sequence to clarify exactly what I mean: please click the link and take a look. The planning world causal graphs you see there are not world models for toy agents in toy worlds; they are plausible AGI agent world models. A single node typically represents a truly giant chunk of current or future world state. The learned details of a complex world are all inside the learned structural functions, in what I call the model parameter L in the sequence.
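A rough toy rendering of that division of labor (my own sketch, not the construction from the linked sequence): a small causal model in which each node's structural function is either hand-written or an opaque learned function, and where a node's value stands in for a large chunk of world state rather than a single concept.

```python
# Hypothetical sketch: structural functions that are hardcoded vs. learned.

def hardcoded_dynamics(state, action):
    """Programmer-written structural function."""
    return state + action

def make_learned_dynamics(training_pairs):
    """Stand-in for ML: a nearest-neighbour lookup acting as an opaque
    learned structural function (playing the role of the learned details)."""
    def f(state, action):
        best = min(training_pairs,
                   key=lambda p: abs(p[0][0] - state) + abs(p[0][1] - action))
        return best[1]
    return f

structural_functions = {
    "physics": hardcoded_dynamics,                                  # hardcoded node
    "weather": make_learned_dynamics([((0, 0), 1), ((1, 1), 3)]),   # learned node
}

# A planner calls both kinds of node through the same interface.
print(structural_functions["physics"](2, 3))   # -> 5
print(structural_functions["weather"](1, 1))   # -> 3
```

The graph structure (which nodes exist and what feeds what) stays small and fixed; all the learned complexity hides inside the function bodies.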
The linked-to approach is not the only way to combine learned and hardcoded model parts, but I think it shows a very useful technique. My more general point is also that there are a lot of not-in-fashion historical examples that may offer further inspiration.
Well, I did try reading your posts 6 months ago, and I found them confusing, in large part because I was thinking about the exact problem I’m talking about here, and I didn’t understand how your proposal would get around that problem or solve it. We had a comment exchange here somewhat related to that, but I was still confused after the exchange … and it wound up on my to-do list … and it’s still on my to-do list to this day … :-P
I know all about that kind of to-do list.

Definitely my sequence of 6 months ago is not about doing counterfactual planning by modifying somewhat opaque million-node causal networks that might be generated by machine learning. The main idea is to show planning world model modifications that you can apply even when you have no way of decoding opaque machine-learned functions.