Then that suggests to me an interesting hypothesis: maybe it can’t! What if some of our weirder instincts related to memory or counterfactual imagination are not adaptive at all, but rather crosstalk from social instincts, or vice-versa? For example, I think there’s a reaction in the subcortex that listens for a strong prediction of lower reward, alternating with a weak prediction of higher reward; when it sees this combination, it issues negative reward and negative valence. Think about what this subcortical reaction would do in the three different cases: If the weak prediction it sees is an empathetic simulation, well, that’s the core of jealousy! If the weak prediction it sees is a memory, well, that’s the core of loss aversion! If the weak prediction it sees is a counterfactual imagination, well, that’s the core of, I guess, that annoying feeling of having missed out on something good. Seems to fit together pretty well, right? I’m not super confident, but at least it’s food for thought.
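To make that hypothesized reaction concrete, here's a toy sketch in code; every name and threshold below is made up purely for illustration, not a claim about actual circuitry:

```python
# Toy sketch of the hypothesized subcortical reaction: it watches for a strong
# prediction of lower reward together with a weak prediction of higher reward,
# and when it sees that pattern it issues negative reward and negative valence.
# All names and thresholds are invented for illustration.

def subcortical_reaction(strong_pred, weak_pred):
    """Each prediction is a (predicted_reward, confidence) pair."""
    strong_reward, strong_conf = strong_pred
    weak_reward, weak_conf = weak_pred
    if strong_conf > 0.8 and weak_conf < 0.3 and weak_reward > strong_reward:
        return {"reward": -1.0, "valence": "negative"}
    return {"reward": 0.0, "valence": "neutral"}

# The same detector, fed three different sources of the weak prediction:
#   empathetic simulation      -> jealousy
#   memory                     -> loss aversion
#   counterfactual imagination -> the "missed out on something good" feeling
```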
I think this is interesting in terms of thinking about counterfactuals in decision theory, preference theory, etc. To me it suggests that when we talk about counterfactuals, we're putting our counterfactual worlds in a stance that mixes up what they are with what we want them to be. What they are, as in the thing going on in our brains that causes us to think in terms of counterfactual worlds, is predictions about the world (so world models, or ontology), and when we apply counterfactual reasoning we're considering different predictions about the world contingent on different inputs, possibly including inputs other than the ones we actually saw but that we are able to simulate. This means it's not reasonable to expect counterfactual worlds to be consistent with the history of the world (the standard problem with counterfactuals), because they aren't alternative territories but maps of how we think we would have mapped a different territory.
This doesn’t exactly save counterfactual reasoning, but it does let us make better sense of what it is we’re doing when we use it, why it sometimes works, and why it sometimes causes problems.
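A toy sketch of that framing (the model and inputs below are made-up stand-ins, just to show the shape of the claim):

```python
# Counterfactuals as maps, not territories: the same predictive machinery,
# fed inputs we actually observed versus inputs we merely simulate, yields
# factual versus counterfactual predictions. Everything here is a stand-in.

def world_model(inputs):
    """Placeholder for whatever predictive model the brain is running."""
    return {"ground": "wet" if inputs.get("rain") else "dry"}

observed_inputs = {"rain": True}     # inputs we actually saw
simulated_inputs = {"rain": False}   # inputs we never saw but can simulate

factual_prediction = world_model(observed_inputs)
counterfactual_prediction = world_model(simulated_inputs)

# Nothing forces counterfactual_prediction to be consistent with the actual
# history of the world: it's just the same map re-run on hypothetical input.
```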
I haven’t read the literature on “how counterfactuals ought to work in ideal reasoners” and have no opinion there. But the part where you suggest an empirical description of counterfactual reasoning in humans, I think I basically agree with what you wrote.
I think the neocortex has a zoo of generative models, and a fast way of detecting when two are compatible, and if they are, snapping them together like Legos into a larger model.
For example, the model of “falling” is incompatible with the model of “stationary”—they make contradictory predictions about the same boolean variables—and therefore I can’t imagine a “falling stationary rock”. On the other hand, I can imagine “a rubber wine glass spinning” because my rubber model is about texture etc., my wine glass model is about shape and function, and my spinning model is about motion. All 3 of those models make non-contradictory predictions (mostly because they’re issuing predictions about non-overlapping sets of variables), so the three can snap together into a larger generative model.
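Here's a toy sketch of what I mean by "snapping together", treating each little model as nothing more than a set of predictions over named variables (a wild simplification, purely for illustration):

```python
# Each toy "generative model" is a dict from variable name to predicted value.
# Two models are compatible if they never contradict each other on a shared
# variable; compatible models snap together into one larger composite model.

def compatible(model_a, model_b):
    shared = model_a.keys() & model_b.keys()
    return all(model_a[v] == model_b[v] for v in shared)

def snap_together(model_a, model_b):
    assert compatible(model_a, model_b)
    return {**model_a, **model_b}

falling = {"vertical_motion": "accelerating_downward"}
stationary = {"vertical_motion": "none"}
rubber = {"texture": "soft", "material": "rubber"}
wine_glass = {"shape": "stemmed_vessel", "function": "holds_wine"}
spinning = {"rotation": "about_own_axis"}

compatible(falling, stationary)  # False: contradictory predictions, can't combine
snap_together(snap_together(rubber, wine_glass), spinning)  # fine: disjoint variables
```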
So for counterfactuals, I suppose that we start by hypothesizing some core of a model (“a bird the size of an adult blue whale”) and then search out more little generative-model pieces that can snap onto that core, growing it out as much as possible in different ways, until we hit the limits where we can’t snap on any more details without making it unacceptably self-contradictory. Something like that...
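Something like this toy greedy procedure, reusing the variable-to-prediction framing from the sketch above (again, all names invented for illustration):

```python
# Grow a counterfactual outward from a core model: greedily snap on any piece
# from the model "zoo" that doesn't contradict what's already been assembled,
# and skip pieces that would make the composite self-contradictory.

def compatible(model_a, model_b):
    shared = model_a.keys() & model_b.keys()
    return all(model_a[v] == model_b[v] for v in shared)

def grow_counterfactual(core, zoo):
    composite = dict(core)
    for piece in zoo:
        if compatible(composite, piece):
            composite.update(piece)  # snap the piece onto the growing model
    return composite

core = {"kind": "bird", "body_length_m": 25}  # a bird the size of an adult blue whale
zoo = [
    {"kind": "bird", "wings": "feathered"},       # snaps on
    {"wings": "feathered", "flight": "soaring"},  # snaps on
    {"body_length_m": 0.2},                       # rejected: contradicts the core
]
print(grow_counterfactual(core, zoo))
```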
Again, I think I agree with what you wrote. :-)