Adam Scherlis did some preliminary exploration here:
https://www.lesswrong.com/posts/BMghmAxYxeSdAteDc/an-exploration-of-gpt-2-s-embedding-weights
Here’s a more thorough investigation of the overall shape of said embeddings with interactive figures:
https://bert-vs-gpt2.dbvis.de/