Have you experimented with subtracting aᵀa⋅bᵀb from the loss? It seems to me that doing so would get rid of the second term and allow the model to learn the correct vectors from the beginning.
That’s not a scalar; do you mean the trace of that? If so, doesn’t that just eliminate the term that causes the incorrect initialization to decay?
Sorry, I meant ⟨a,a⟩⋅⟨b,b⟩. And yes, that should eliminate the term that causes the incorrect initialization to decay. Doesn’t that mean the learning is in the correct direction from the start?
I don’t think so? I think that just means you keep the incorrect initialization around while also learning the correct direction.
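A minimal numerical sketch of the point in this last reply, under an assumption the thread itself doesn’t state: that the loss in question is the rank-1 factorization loss L(a, b) = ‖T − abᵀ‖²_F, which expands to ‖T‖²_F − 2aᵀTb + ⟨a,a⟩⋅⟨b,b⟩. The target T, learning rate, step count, and variable names below are all illustrative. With the ⟨a,a⟩⋅⟨b,b⟩ term subtracted, the gradient has no component along the directions of a and b orthogonal to the target, so the random part of the initialization never decays, while the aligned (correct) direction still grows.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
u = np.zeros(d); u[0] = 1.0          # target left direction (illustrative)
v = np.zeros(d); v[0] = 1.0          # target right direction (illustrative)
T = np.outer(u, v)                   # rank-1 target

def grads(a, b, subtract_norm_term):
    # d/da ‖T − abᵀ‖² = −2Tb + 2a⟨b,b⟩ ; analogously for b.
    ga = -2 * T @ b
    gb = -2 * T.T @ a
    if not subtract_norm_term:
        ga = ga + 2 * a * (b @ b)    # gradient of ⟨a,a⟩⋅⟨b,b⟩ w.r.t. a
        gb = gb + 2 * b * (a @ a)
    return ga, gb

for subtract in (False, True):
    a = 0.1 * rng.standard_normal(d) # random ("incorrect") initialization
    b = 0.1 * rng.standard_normal(d)
    lr, steps = 0.01, 1500
    for _ in range(steps):
        ga, gb = grads(a, b, subtract)
        a -= lr * ga
        b -= lr * gb
    aligned = abs(a @ u)                          # correct direction
    orthogonal = np.linalg.norm(a - (a @ u) * u)  # leftover initialization
    print(f"subtract ⟨a,a⟩⋅⟨b,b⟩ = {subtract}: "
          f"|a·u| = {aligned:.3g}, ‖a_perp‖ = {orthogonal:.3g}")
```

With the norm term kept, ‖a_perp‖ decays toward zero; with it subtracted, ‖a_perp‖ stays at its initial size while |a·u| grows without bound, which is the "incorrect initialization kept around while also learning the correct direction" behavior described in the reply above.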