Vanessa Kosoy comments on Martín Soto’s Shortform

Vanessa Kosoy 29 Jul 2024 17:11 UTC
6 points
1
Maybe the learning algorithm doesn’t have a clear notion of “positive and negative”, and instead just provides updates in the same direction (but with different intensities) for different intensities on a scale without origin. (But this seems very different from the current paradigm, and fundamentally wasteful.)
Maybe I don’t understand your intent, but isn’t this exactly the current paradigm? You train a network using the derivative of the loss function. Adding a constant to the loss function leaves that derivative unchanged, so it changes nothing about training. So, I don’t see how it’s possible to have a purely ML-based explanation of where humans consider the “origin” to be.
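As a minimal sketch of the point about constants, assuming a toy one-parameter model and a finite-difference gradient (the names `loss`, `numerical_grad`, and `train`, and the data points, are made up for illustration), gradient descent lands on the same parameter whether or not the loss is shifted by a large constant:

```python
def loss(w, shift=0.0):
    # Toy mean squared error for the model y = w * x, plus an arbitrary constant shift.
    data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]  # hypothetical (x, y) pairs
    return sum((w * x - y) ** 2 for x, y in data) / len(data) + shift


def numerical_grad(f, w, eps=1e-4):
    # Central finite difference: the constant shift cancels in the subtraction.
    return (f(w + eps) - f(w - eps)) / (2 * eps)


def train(shift, steps=200, lr=0.05, w0=0.0):
    # Plain gradient descent on the (possibly shifted) loss.
    w = w0
    for _ in range(steps):
        w -= lr * numerical_grad(lambda v: loss(v, shift), w)
    return w


w_plain = train(shift=0.0)
w_shifted = train(shift=1000.0)
print(w_plain, w_shifted)  # the two values agree up to floating-point noise
```

The loss values differ by 1000 throughout training, but the update rule never sees that difference, which is why the loss by itself carries no information about an “origin”.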
- Martín Soto 29 Jul 2024 23:05 UTC
  3 points
  0
  You’re right! I had mistaken the derivative for the original function.
  Probably this slip happened because I was also thinking of the following:
  Embedded learning can’t ever be modelled as taking such an (origin-agnostic) derivative.
  When in ML we take the gradient in the loss landscape, we are literally taking (or approximating) a counterfactual: “If my algorithm were a bit more like this, would I have performed better in this environment? (For example, would my prediction have been closer to the real next token?)”
  But in embedded reality there’s no way to take this counterfactual: You just have your past and present observations, and you don’t necessarily know whether you’d have obtained more or less reward had you moved your hand a bit more like this (taking the fruit to your mouth) or like that (moving it away).
  Of course, one way to solve this is to learn a reward model inside your brain, which can learn without any counterfactuals (just considering whether the prediction was correct, or how “close” it was for some definition of close). And then another part of the brain is trained to approximate argmaxing the reward model.
  But another effect that I’d also expect to happen is that (either through this reward model or other means) the brain learns a “baseline of reward” (the “origin”) based on past levels of dopamine or whatever, and then reinforces things that go over that baseline and disincentivizes those that go below it (in both cases proportionally to how far they are from the baseline). Basically the hedonic treadmill. I also think there’s some a priori argument for this helping with computational frugality, in case you change environments (and start receiving much more or much less reward).
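  The baseline idea above can be sketched as follows. This is only an illustration under assumptions: the baseline is taken to be an exponential moving average of past reward, and the function name `baseline_signals`, the `rate` parameter, and the reward values are hypothetical, not a claim about how the brain does it.

  ```python
  def baseline_signals(rewards, rate=0.2):
      """Return reward-minus-baseline signals using a running-average baseline."""
      baseline = rewards[0]  # assumption: the baseline starts at the first observed reward
      signals = []
      for r in rewards:
          signals.append(r - baseline)       # above baseline: reinforce; below: discourage
          baseline += rate * (r - baseline)  # baseline drifts toward recent reward levels
      return signals


  # Environment change: the reward level jumps from 1.0 to 10.0 and then stays there.
  rewards = [1.0] * 5 + [10.0] * 30
  signals = baseline_signals(rewards)

  # The same history with every reward shifted by a constant.
  shifted = baseline_signals([r + 100.0 for r in rewards])

  print(signals[5], signals[-1])
  ```

  Right after the jump the signal is large, and it then decays toward zero as the baseline catches up, which is the treadmill effect. Shifting every reward by the same constant produces the same signals, so the “origin” here is set by reward history rather than by the absolute scale.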
  - Vanessa Kosoy 1 Aug 2024 8:39 UTC
    2 points
    0
    I don’t think embeddedness has much to do with it. And I disagree that it’s incompatible with counterfactuals. For example, infra-Bayesian physicalism is fully embedded and has a notion of counterfactuals. I expect any reasonable alternative to have them as well.