(see my reply to Charlie Steiner’s comment)
mwatkins
I’m well aware of the danger of pareidolia with language models. First, I should state I didn’t find that particular set of outputs “titillating”, but rather deeply disturbing (e.g. definitions like “to make a woman’s body into a cage” and “a woman who is sexually aroused by the idea of being raped”). The point of including that example is that I’ve run hundreds of these experiments on random embeddings at various distances-from-centroid, and I’ve seen the “holes” thing appearing, everywhere, in small numbers, leading to the reasonable question “what’s up with all these holes?”. The unprecedented concentration of them near that particular random embedding, and the intertwining themes of female sexual degradation led me to consider the possibility that it was related to the prominence of sexual/procreative themes in the definition tree for the centroid.
More of those definition trees can be seen in this appendix to my last post:
https://www.lesswrong.com/posts/hincdPwgBTfdnBzFf/mapping-the-semantic-void-ii-above-below-and-between-token#Appendix_A__Dive_ascent_data
I’ve thrown together a repo here (from some messy Colab sheets):
https://github.com/mwatkins1970/GPT_definition_trees
Hopefully this makes sense. You specify a token or non-token embedding, and one script generates a .json file with a nested tree structure. Another script then renders that as a PNG. You just need to have first loaded GPT-J’s model, embeddings tensor and tokenizer, and to specify a save directory. Let me know if you have any trouble with this.
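For reference, here’s a minimal setup sketch for those prerequisites using Hugging Face transformers (the variable names are illustrative, not necessarily the ones the repo’s scripts expect):

```python
import os
import torch
from transformers import AutoTokenizer, GPTJForCausalLM

model_name = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = GPTJForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

# Input embedding matrix: one row per token (4096-dimensional for GPT-J).
embeddings = model.transformer.wte.weight.detach()

# Directory for the generated .json definition trees and rendered PNGs.
save_dir = "definition_trees"
os.makedirs(save_dir, exist_ok=True)
```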
Quite possibly it does, but I doubt very many of these synonyms are tokens.
Phallocentricity in GPT-J’s bizarre stratified ontology
Thanks! That’s the best explanation I’ve yet encountered. There had been previous suggestions that layer norm is a major factor in this phenomenon.
Mapping the semantic void III: Exploring neighbourhoods
Mapping the semantic void II: Above, below and between token embeddings
I did some spelling evals with GPT2-xl and -small last year and discovered that they’re pretty terrible at spelling! Even with multishot prompting and supplying the first letter, the output seems to be heavily conditioned on that first letter, sometimes affected by the specifics of the prompt, and reminiscent of very crude bigrammatic or trigrammatic spelling algorithms.
This was the prompt (in this case eliciting a spelling for the token ‘that’):
Please spell ‘table’ in all capital letters, separated by hyphens.
T-A-B-L-E
Please spell ‘nice’ in all capital letters, separated by hyphens.
N-I-C-E
Please spell ‘water’ in all capital letters, separated by hyphens.
W-A-T-E-R
Please spell ‘love’ in all capital letters, separated by hyphens.
L-O-V-E
Please spell ‘that’ in all capital letters, separated by hyphens.
T-
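Here’s a minimal sketch of how a prompt like this can be run against GPT-2-xl with the Hugging Face transformers library (greedy decoding is assumed here, which isn’t necessarily the exact setting I used):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
model.eval()

# Four-shot spelling prompt, ending partway through the spelling of 'that'.
prompt = (
    "Please spell 'table' in all capital letters, separated by hyphens.\n"
    "T-A-B-L-E\n"
    "Please spell 'nice' in all capital letters, separated by hyphens.\n"
    "N-I-C-E\n"
    "Please spell 'water' in all capital letters, separated by hyphens.\n"
    "W-A-T-E-R\n"
    "Please spell 'love' in all capital letters, separated by hyphens.\n"
    "L-O-V-E\n"
    "Please spell 'that' in all capital letters, separated by hyphens.\n"
    "T-"
)

input_ids = tokenizer(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=12,                    # room for a hyphenated spelling
        do_sample=False,                      # greedy decoding
        pad_token_id=tokenizer.eos_token_id,  # silences the padding warning
    )
# Print only the continuation, i.e. the model's attempted spelling.
print(tokenizer.decode(output[0][input_ids.shape[1]:]))
```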
Outputs seen, by first letter:
‘a’ words: ANIGE, ANIGER, ANICES, ARING
‘b’ words: BOWARS, BORSE
‘c’ words: CANIS, CARES x 3
‘d’ words: DOWER, DONER
‘e’ words: EIDSON
‘f’ words: FARIES x 5
‘g’ words: GODER, GING x 3
‘h’ words: HATER x 6, HARIE, HARIES
‘i’ words: INGER
‘j’ words: JOSER
‘k’ words: KARES
‘l’ words: LOVER x 5
‘n’ words: NOTER x 2, NOVER
‘o’ words: ONERS x 5, OTRANG
‘p’ words: PARES x 2
‘t’ words: TABLE x 10
‘u’ words: UNSER
‘w’ words: WATER x 6
‘y’ words: YOURE, YOUSE
Note how they’re all “wordy” (in terms of combinations of vowels and consonants), mostly non-words, with a lot of ER and a bit of ING.
Reducing to three shots, we see similar (but slightly different) misspellings:
CONES, VICER, MONERS, HOTERS, KATERS, FATERS, CANIS, PATERS, GINGE, PINGER, NICERS, SINGER, DONES, LONGER, JONGER, LOUSE, HORSED, EICHING, UNSER, ALEST, BORSET, FORSED, ARING
My notes claim: “Although the overall spelling is pretty terrible, GPT-2xl can do second-letter prediction (given first) considerably better than chance (and significantly better than bigrammatically-informed guessing).”
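For clarity, this is roughly how such a comparison could be scored; the word list and results below are placeholders for illustration, not the actual data behind that claim:

```python
from collections import Counter, defaultdict

# Placeholder reference word list and placeholder (word, model's guessed second
# letter) pairs -- substitute real eval data here.
word_list = ["table", "nice", "water", "love", "that"]
results = [("house", "O"), ("night", "I")]

# Bigram-style baseline: for each first letter, always guess the most common
# second letter following it in the reference word list.
second_given_first = defaultdict(Counter)
for w in word_list:
    second_given_first[w[0].upper()][w[1].upper()] += 1
baseline_guess = {first: counts.most_common(1)[0][0]
                  for first, counts in second_given_first.items()}

model_hits = sum(guess.upper() == w[1].upper() for w, guess in results)
baseline_hits = sum(baseline_guess.get(w[0].upper(), "") == w[1].upper()
                    for w, _ in results)
print(f"model: {model_hits}/{len(results)}, "
      f"bigram baseline: {baseline_hits}/{len(results)}")
```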
‘ petertodd’’s last stand: The final days of open GPT-3 research
Thanks so much for leaving this comment. I suspected that psychologists or anthropologists might have something to say about this. Do you know anyone actively working in this area who might be interested?
Thanks! I’m starting to get the picture (insofar as that’s possible).
Could you elaborate on the role you think layernorm is playing? You’re not the first person to suggest this, and I’d be interested to explore further. Thanks!
Thanks for the elucidation! This is really helpful and interesting, but I’m still left somewhat confused.
Your concise demonstration immediately convinced me that any Gaussian distributed around a point some distance from the origin in high-dimensional Euclidean space would have the property I observed in the distribution of GPT-J embeddings, i.e. their norms will be normally distributed in a tight band, while their distances-from-centroid will also be normally distributed in a (smaller) tight band. So I can concede that this has nothing to do with where the token embeddings ended up as a result of training GPT-J (as I had imagined) and is instead a general feature of Gaussian distributions in high dimensions.
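For anyone who wants to see this concretely, here’s a quick numerical sketch (the dimension matches GPT-J’s 4096, but the offset and scale are arbitrary illustrative values rather than anything fitted to GPT-J):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4096, 5000               # dimension (GPT-J uses 4096) and sample count
centre = np.zeros(d)
centre[0] = 2.0                 # shift the Gaussian cloud away from the origin
points = centre + 0.02 * rng.standard_normal((n, d))

norms = np.linalg.norm(points, axis=1)                        # distance from origin
dists = np.linalg.norm(points - points.mean(axis=0), axis=1)  # distance from centroid

# Both quantities concentrate in tight bands, even though the cloud is just a
# single isotropic Gaussian.
print(f"norms:                   mean {norms.mean():.3f}, std {norms.std():.4f}")
print(f"distances from centroid: mean {dists.mean():.3f}, std {dists.std():.4f}")
```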
However, I’m puzzled by “Suddenly it looks like a much smaller shell!”
Don’t these histograms unequivocally indicate the existence of two separate shells with different centres and radii, both of which contain the vast bulk of the points in the distribution? Yes, there’s only one distribution of points, but it still seems like it’s almost entirely contained in the intersection of a pair of distinct hyperspherical shells.
The intended meaning was that the set of points in embedding space corresponding to the 50257 tokens are contained in a particular volume of space (the intersection of two hyperspherical shells).
Mapping the semantic void: Strange goings-on in GPT embedding spaces
Thanks for pointing this out! They should work now.
Linear encoding of character-level information in GPT-J token embeddings
Thanks! And in case it wasn’t clear from the article, the tokens whose misspellings are examined in the Appendix are not glitch tokens.
No, but it would be interesting to try. Someone somewhere might have compiled a list of indices for GPT-2/3/J tokens which are full words, but I’ve not yet been able to find one.