Yes, I train them one at a time, constraining each new vector to be orthogonal to the older ones (this was not clear in the post, so thanks for asking!).
I haven’t experimented with this, but you could also imagine using only “soft” orthogonality constraints (e.g., penalizing pairwise cosine similarities between vectors).
Yes, I train them one at a time, constraining each new vector to be orthogonal to the older ones (this was not clear in the post, so thanks for asking!).
I haven’t experimented with this, but you could also imagine using only “soft” orthogonality constraints (e.g., penalizing pairwise cosine similarities between vectors).