I’m still interested if you can say more about how you view it as minimizing a warped prediction. I mentioned that if you fix some parts of the network, they seem to end up getting ignored rather than producing goal-directed behaviour. Do you have an alternate picture in which this doesn’t happen? (I’m not asking you to justify yourself rigorously; I’m curious about whatever thoughts or vague images you have here, though of course all the better if it really works.)
Ah, I guess I don’t expect the network to end up ignoring the parts that can’t learn, because I don’t think error minimization, learning, or anything else is a top-level goal of the network. That is, there are only low-level control systems interacting, and some parts avoid being ignored by being more powerful in various ways, probably by being positioned in the network so that they have more influence on behavior than the parts that perform Bayesian learning. This does mean I expect those more powerful parts not to learn, or to learn inefficiently, but they do that because it’s adaptive.
For example, I would guess that something like the neocortex in humans is capable of Bayesian learning, but it only influences the rest of the system through narrow channels that prevent it from “taking over” and making humans true prediction-error minimizers, instead forcing them to do things that satisfy other set points. In buzzwords, you might say human minds are “complex, adaptive, emergent systems” built out of neurons, with most of the function coming bottom-up from the neurons, or “from the middle”, if you will, in terms of network topology.
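To make the picture above a bit more concrete, here is a toy sketch (my own construction, not anything from the discussion): a handful of simple set-point controllers whose outputs sum at full weight, plus one Bayesian-learning unit whose output reaches behavior only through a heavily attenuated channel. The class names, the `channel_width` parameter, and the specific controller form are all illustrative assumptions; the only point is that positional influence, not a top-level goal, determines which parts dominate behavior.

```python
class SetPointController:
    """Low-level control system: drives a perceived variable toward a set point."""

    def __init__(self, set_point, gain):
        self.set_point = set_point
        self.gain = gain

    def output(self, perception):
        # Proportional control: act in proportion to the error from the set point.
        return self.gain * (self.set_point - perception)


class BayesianLearner:
    """Maintains a Gaussian estimate of an environmental variable and acts to
    reduce its own prediction error (a stand-in for the neocortex-like part)."""

    def __init__(self, mean=0.0, var=10.0, obs_var=1.0):
        self.mean, self.var, self.obs_var = mean, var, obs_var

    def observe(self, x):
        # Standard Gaussian conjugate update: shrink toward the observation.
        k = self.var / (self.var + self.obs_var)
        self.mean += k * (x - self.mean)
        self.var *= (1 - k)

    def output(self, perception):
        # Act to bring perception in line with the current prediction.
        return self.mean - perception


def behavior(perception, controllers, learner, channel_width=0.05):
    """Total action. Set-point controllers sum at full weight; the learner's
    output passes through a narrow channel, so even a confident learner
    cannot 'take over' and turn the system into a pure error minimizer."""
    action = sum(c.output(perception) for c in controllers)
    action += channel_width * learner.output(perception)
    return action
```

In this sketch the learner keeps updating normally, but its network position (the small `channel_width`) means overall behavior still mostly satisfies the controllers’ set points, which is the "narrow channels" intuition above.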