Did you try getting the centroid of all words, rather than all tokens? The set of tokens will contain a lot of nonsense fragments.
No, but it would be interesting to try. Someone somewhere might have compiled a list of indexes for GPT-2/3/J tokens which are full words, but I've not yet been able to find one.
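In the meantime, here's a minimal sketch of one way to approximate it, assuming the Hugging Face transformers GPT-2 checkpoint. The "full word" heuristic here (a token whose decoded form is purely alphabetic after stripping GPT-2's leading-space marker "Ġ") is my own rough cut, not a vetted word list, so it will miss multi-token words and let through some fragments that happen to spell words:

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

with torch.no_grad():
    # Token embedding matrix, shape (vocab_size, d_model) = (50257, 768)
    emb = model.transformer.wte.weight

    # Rough filter: keep tokens that are alphabetic once GPT-2's
    # leading-space marker "Ġ" is stripped. A proper version would
    # intersect with an English dictionary instead.
    word_ids = [
        idx for tok, idx in tokenizer.get_vocab().items()
        if tok.lstrip("Ġ").isalpha()
    ]

    word_centroid = emb[word_ids].mean(dim=0)
    all_centroid = emb.mean(dim=0)

print(f"{len(word_ids)} word-like tokens out of {emb.shape[0]}")
print("distance between centroids:",
      torch.dist(word_centroid, all_centroid).item())
```

Comparing the two centroids (and nearest tokens to each) would show how much the nonsense fragments are actually skewing things.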