That’s not a scalar, do you mean the trace of that? If so, doesn’t that just eliminate the term that causes the incorrect initialization to decay?
Sorry, I meant ⟨a,a⟩⋅⟨b,b⟩. And yes, that should eliminate the term that causes the incorrect initialization to decay. Doesn’t that cause the learning to be in the correct direction from the start?
I don’t think so? I think that just means you keep the incorrect initialization around while also learning the correct direction.
That’s not a scalar, do you mean the trace of that? If so, doesn’t that just eliminate the term that causes the incorrect initialization to decay?
Sorry, I meant ⟨a,a⟩⋅⟨b,b⟩. And yes, that should eliminate the term that causes the incorrect initialization to decay. Doesn’t that cause the learning to be in the correct direction from the start?
I don’t think so? I think that just means you keep the incorrect initialization around while also learning the correct direction.