The brain overcomes this issue through saliency-weighted learning. I don’t have references at hand, but essentially, information is more salient when it is more surprising, either to the agent’s world model or to its self-model.
For the former, the agent constantly predicts what it will experience, along with the precision of those expectations, so that when it encounters something outside those bounds it takes notice and updates its world model more strongly in the direction that minimizes the prediction error.
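To make the precision part concrete, here is a toy sketch (my own illustration, not anything the above commits to) of a precision-weighted update, treating precision as the inverse variance of a scalar prediction:

```python
def precision_weighted_update(prediction, precision, observation, base_lr=0.1):
    """Toy sketch: scale the world-model update by the predicted precision
    (inverse variance), so confident-but-wrong predictions update the most."""
    error = observation - prediction          # raw prediction error
    return prediction + base_lr * precision * error

# A confident prediction (high precision) that turns out wrong produces a
# much larger update than an uncertain one does.
print(precision_weighted_update(0.0, precision=4.0, observation=1.0))   # 0.4
print(precision_weighted_update(0.0, precision=0.25, observation=1.0))  # 0.025
```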
The latter, however, is where the “usefulness” of salient information is most directly apparent. The agent is not just predicting what will happen in the external world like some disembodied observer. It is modeling what it expects to experience conditioned on its model of itself being healthy and functional. When something surprisingly good occurs, it takes special note of all information that was coincident with the pleasure signal to try to make such experiences more likely in the future. And when something surprisingly bad occurs, it also takes notice of all information coincident with the pain signal so that it can make such experiences less likely in the future.
When everything is going as expected, though, the agent will tend not to keep that information around. Saliency-weighted learning is all about steering an agent’s models toward better predictive power and steering its behavior toward states of easier survivability (or easier learnability for a curiosity drive), allowing it to discard most information that it encounters in favor of only that which challenges its expectations.
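One rough way to picture the credit/blame assignment above (and why unsurprising experience leaves little trace) is an eligibility-trace-style update; everything below, the names, the decay constant, the dictionary representation, is illustrative rather than a claim about how the brain actually does it:

```python
def salient_credit_update(weights, traces, reward_surprise, lr=0.1, decay=0.9):
    """Toy sketch: when a surprisingly good or bad signal arrives, every feature
    that was recently active gets credit or blame in proportion to its trace.
    If reward_surprise is ~0 (everything as expected), almost nothing changes."""
    for feature, eligibility in traces.items():
        weights[feature] = weights.get(feature, 0.0) + lr * reward_surprise * eligibility
        traces[feature] = eligibility * decay  # older coincidences matter less
    return weights, traces

# Example: the smell of smoke was strongly active when a painful surprise hit.
weights, traces = salient_credit_update({}, {"smoke": 1.0, "birdsong": 0.2},
                                        reward_surprise=-5.0)
print(weights)  # {'smoke': -0.5, 'birdsong': -0.1}
```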
Saliency-based learning can definitely reduce this problem. Neural network reinforcement learners typically do something similar, e.g. predicting rewards (which is also needed for other purposes). However, I don’t think it fully solves the problem, because it only up-weights information the agent can immediately identify as related to what it is seeking, not information that may only later turn out to be useful. Of course, the latter is not really solvable in the general case.
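For concreteness, one place this shows up in practice is prioritized experience replay, where the magnitude of the reward-prediction (TD) error determines how often a transition gets replayed. A stripped-down sketch, omitting the importance-sampling corrections the real algorithm uses:

```python
import random

class PrioritizedReplayBuffer:
    """Stripped-down sketch: transitions are replayed in proportion to how
    surprising they were, i.e. the magnitude of their TD error."""
    def __init__(self, eps=1e-3):
        self.transitions = []
        self.priorities = []
        self.eps = eps  # keeps zero-error transitions minimally sampleable

    def add(self, transition, td_error):
        self.transitions.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, k):
        return random.choices(self.transitions, weights=self.priorities, k=k)
```

In this sketch the priority is assigned from what the current reward predictor already flags as surprising, which is exactly the limitation described above: information whose usefulness only becomes apparent later gets no extra weight.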
All information currently in working memory could potentially become highly weighted when a saliency signal comes along. Through reinforcement learning, I imagine the agent could optimize whatever attention circuit loads information into working memory so as to make this more useful, as part of some sort of learning-to-learn algorithm.
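A rough sketch of that last idea, with every detail here (the buffer size, the shape of the saliency signal, the names) being my own illustration rather than anything established:

```python
from collections import deque

class SalienceGatedMemory:
    """Toy sketch: whatever happens to be sitting in a small working-memory
    buffer gets tagged for strong (high-weight) learning when a saliency
    signal arrives; unsurprising moments consolidate nothing."""
    def __init__(self, capacity=7):
        self.buffer = deque(maxlen=capacity)
        self.consolidated = []

    def load(self, item):
        # In the full idea, *which* items get loaded would itself be chosen
        # by an attention policy trained with RL (the learning-to-learn part).
        self.buffer.append(item)

    def on_saliency(self, strength, threshold=1.0):
        if abs(strength) >= threshold:
            self.consolidated.extend((item, strength) for item in self.buffer)
```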