Please excuse my lack of knowledge here, but if we know all of the vectors for the tokens in the cl100k_base model, why can’t we then create the embedding matrix? Is the embedding matrix not simply all of these rows?
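The tokens themselves are public, but the actual embedding matrix/vectors are not (as far as I know). What is published for cl100k_base is only the vocabulary, i.e. the mapping from pieces of text to integer token IDs. The vectors are learned model weights, so they can't be reconstructed from the token list alone.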
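To make the distinction concrete, here is a toy sketch, not anything taken from OpenAI's code. The four-entry `vocab`, the `make_embedding_matrix` helper, and the two "models" are all made up for illustration, and random numbers stand in for weights that a real model would learn during training. It only shows that a shared vocabulary gives everyone the same token IDs, while the row of the embedding matrix behind each ID depends on which model's weights you have.

```python
import random

# Hypothetical toy vocabulary standing in for a public tokenizer vocabulary.
# A real one maps roughly 100k byte sequences to integer IDs; this has four.
vocab = {"hello": 0, " world": 1, "!": 2, " embedding": 3}


def make_embedding_matrix(vocab_size, dim, seed):
    """Stand-in for trained weights: one row of `dim` floats per token ID.

    In a real model these values come out of training. Here they are just
    seeded random numbers, which is an assumption made for the example.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


# Two hypothetical models that share the vocabulary but have different weights.
model_a = make_embedding_matrix(len(vocab), dim=3, seed=1)
model_b = make_embedding_matrix(len(vocab), dim=3, seed=2)

token_id = vocab["hello"]   # public: anyone with the vocabulary can compute this
row_a = model_a[token_id]   # needs model A's weights
row_b = model_b[token_id]   # needs model B's weights

print("token id:", token_id)
print("same vector in both models:", row_a == row_b)
```

So yes, the embedding matrix is "all of these rows", one per token ID, but knowing the token IDs tells you only how many rows there are, not what is in them.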