Mistral and Pythia use rotary embeddings and don’t have a positional embedding matrix. Which matrix are you looking at for those two models?
Oh shoot, yeah. I’m probably just looking at the rotary embeddings, then. Forgot about that, thanks.
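For anyone else who trips over this: a rough sketch of what rotary embeddings do, to show why there is no learned positional matrix to inspect. The function names `rotary_angles` and `apply_rotary` are made up for illustration, and the interleaved (even, odd) pairing and `base=10000.0` are assumptions; actual Mistral and Pythia implementations may lay out the dimensions differently.

```python
import numpy as np

def rotary_angles(seq_len, head_dim, base=10000.0):
    # Fixed (not learned) frequencies, one per pair of dimensions.
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    positions = np.arange(seq_len)
    # Angle for each (position, frequency) pair: shape (seq_len, head_dim // 2).
    return np.outer(positions, inv_freq)

def apply_rotary(x, angles):
    # x: float array of shape (seq_len, head_dim), e.g. queries or keys for one head.
    # Each (even, odd) pair of dimensions is rotated by that position's angle.
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

The rotation is applied to queries and keys inside attention, so the only position-related tensors you would find are computed cos/sin tables (or the frequencies behind them), not trained weights. The query-key dot product then depends only on the relative offset between the two positions.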