Mapping the semantic void: Strange goings-on in GPT embedding spaces

TL;DR: GPT-J token embeddings inhabit a zone in their 4096-dimensional embedding space formed by the intersection of two hyperspherical shells. This is described, and then the remaining expanse of the embedding space is explored by using simple prompts to elicit definitions for non-token custom embedding vectors (so-called “nokens”). The embedding space is found to naturally stratify into hyperspherical shells around the mean token embedding (centroid), with noken definitions depending on distance-from-centroid and, at various distance ranges, involving a relatively small number of seemingly arbitrary topics (holes, small flat yellowish-white things, people who aren’t Jews or members of the British royal family, …) in a way which suggests a crude, and rather bizarre, ontology. Evidence that this phenomenon extends to GPT-3 embedding space is presented. No explanation for it is provided; instead, suggestions are invited.

[Mapping the semantic void II: Above, below and between token embeddings]
[Mapping the semantic void III: Exploring neighbourhoods]

Work supported by the Long Term Future Fund.

GPT-J token embeddings
First, let’s get familiar with GPT-J tokens and their embedding space.
GPT-J uses the same set of 50257 tokens as GPT-2 and GPT-3.[1]
These tokens are embedded in a 4096-dimensional space, so the whole picture is captured in a shape-[50257, 4096] tensor. (GPT-3 uses a 12288-dimensional embedding space, so its embeddings tensor has shape [50257, 12288].)
Each token’s embedding vector was randomly initialised in the 4096-d space, and over the course of the model’s training, the 50257 embedding vectors were incrementally displaced until they arrived at their final positions, as recorded in the embeddings tensor.
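For concreteness, here is a minimal sketch (not the original analysis code) of how this embeddings tensor can be pulled out of the publicly released GPT-J checkpoint using Hugging Face transformers; the slice to 50257 rows discards the “dummy” tokens mentioned in the footnote:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
# model = model.cuda()  # optional: move to GPU for the generation experiments further down

# The input embedding matrix has shape [50400, 4096]; rows beyond 50256 are unused "dummy" tokens.
embs = model.transformer.wte.weight.detach().float()[:50257]   # shape [50257, 4096]
```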
So where are they?
It turns out that (with a few curious outliers) they occupy a fuzzy hyperspherical shell centred at the origin with mean radius of about 2, as we can see from this histogram of their Euclidean norms:
As you can see, they lie in a very tight band, and even the relatively few outliers lie not that far outside it.
Despite this, the mean embedding or “centroid”, which we can think of as a kind of centre-of-mass for the cloud of token embedding vectors, does not lie close to the origin as one might expect. Rather, its Euclidean distance from the origin is ~1.718, much closer to the surface of the shell than to its centre. Looking at the distribution of the token embeddings’ distances from the centroid, we see that (again, with a few outliers, barely visible in the histogram) the token embeddings are contained in another fuzzy hyperspherical shell, this one centred at the centroid with mean radius ~1.
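A sketch of the two measurements just described, using the `embs` tensor loaded above (the quoted values are those reported in the post):

```python
import matplotlib.pyplot as plt

norms = embs.norm(dim=1)                         # distances from the origin (mean ≈ 2)
centroid = embs.mean(dim=0)                      # the mean token embedding ("centroid")
print(float(centroid.norm()))                    # reported value ≈ 1.718
centroid_dists = (embs - centroid).norm(dim=1)   # distances from the centroid (mean ≈ 1)

plt.hist(norms.numpy(), bins=200); plt.show()
plt.hist(centroid_dists.numpy(), bins=200); plt.show()
```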
What are we to make of this? It’s hard enough to reason about four dimensions, let alone 4096, but the following 3-d mockup might convey some useful spatial intuitions. Here the larger shell is centred at the origin with inner radius 1.95 and outer radius 2.05. The smaller shell, centred at the centroid, has inner radius 0.9 and outer radius 1.1:
The centroid is seen at a distance of 1.718 from the origin, hence inside the radius ~2 shell. The intersection of the two spherical shells defines an approximately toroidal region of space which contains the vast majority of the tokens. Of the 50257, there are about 500 outliers (outside the radius-0.9-to-1.1 ring, with distances-from-centroid up to 1.3) and 1000 “inliers” (inside the ring, close to the centroid, with distances-from-centroid as small as 0.06), all with Euclidean norms between 1.74 and 2.24 (so within, or at least very close to, the larger shell).
[note added 2022-12-19:] Comments in a thread below clarify that in high-dimensional Euclidean spaces, the distances of a set of Gaussian-distributed vectors from any reference point will be normally distributed in a narrow band. So there’s nothing particularly special about the origin here:
The distribution is [contained] in an infinite number of hyperspherical shells. There was nothing special about the first shell being centered at the origin. The same phenomenon would appear when measuring the distance from any point. High-dimensional space is weird.
What do the Euclidean distances between the token embeddings look like? Randomly sampling 10 million pairs of distinct embeddings and calculating their L2 distances, we get this:
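Such a histogram can be produced along these lines (a sketch, again assuming the `embs` tensor from the earlier code; sampling is chunked to keep memory use modest):

```python
import torch

n_pairs, chunk = 10_000_000, 10_000
dists = []
for _ in range(n_pairs // chunk):
    i = torch.randint(0, embs.shape[0], (chunk,))
    j = torch.randint(0, embs.shape[0], (chunk,))
    mask = i != j                                 # keep only pairs of distinct embeddings
    dists.append((embs[i[mask]] - embs[j[mask]]).norm(dim=1))
pair_dists = torch.cat(dists)
print(float(pair_dists.min()), float(pair_dists.mean()), float(pair_dists.max()))
```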
3-d spatial thinking stops being useful at this point. Looking at the toroidal cloud above, these min, mean and max values seem at least plausible, but the narrowness of the distribution does not: we’d expect to see a much wider spread of distances. There seems to be a kind of mutual repulsion at work, which the high dimensionality allows for. Also, in 4096 dimensions the intersection of two hyperspherical shells won’t have a toroidal topology, but rather something considerably more exotic.
A puzzling discovery
An earlier post exploring GPT-J spelling abilities reported that first-letter information for GPT-J tokens is largely encoded linearly in their embeddings. By training linear probes on the embeddings, 26 alphabetically-coded directions were found in embedding space such that first letters of tokens could be ascertained with 98% success simply by finding which of the “first-letter directions” has the greatest cosine similarity (i.e. the smallest angle) to the embedding vector in question.
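A minimal sketch of the kind of linear probe described (the actual training details are in the earlier post; here `embs` is the embeddings tensor from above, and `labels` is assumed to be a prepared tensor of first-letter classes 0–25, one per token):

```python
import torch
import torch.nn as nn

probe = nn.Linear(4096, 26)               # 26 first-letter classes
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(probe(embs), labels)   # labels: assumed first-letter class per token
    loss.backward()
    opt.step()

first_letter_directions = probe.weight.detach()   # 26 candidate "first-letter directions"
```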
That post went on to show that subtracting an appropriately scaled vector in this “first-letter direction” from the token embedding can reliably cause GPT-J to respond to prompts so as to claim, e.g., that the first letter of “icon” is not I. In doing this, we’re displacing the “icon” token embedding, moving it some distance along the first-letter-I direction (so as to make the angle between the embedding and that direction sufficiently negative). Having succeeded in changing GPT-J’s first-letter prediction, the question arose as to whether this came at the expense of the token’s ‘semantic integrity’. In other words: does GPT-J still understand the meaning of the ” broccoli” token after it has been displaced along the first-letter-B direction so that GPT-J now thinks it begins with an R or an O?
It turned out that, yes, in almost all cases, first-letter information can be effectively removed by this displacement of the embedding vector, and GPT-J can still define the word in question (the study was restricted to whole-word tokens so that it made sense to prompt GPT-J for definitions). Here’s a typical example involving the ” hatred” token:
A typical definition of ' hatred' is

k=0: a strong feeling of dislike or hostility.
k=1: a strong feeling of dislike or hostility toward someone or something.
k=2: a strong feeling of dislike or hostility toward someone or something.
k=5: a strong feeling of dislike or hostility toward someone or something.
k=10: a strong feeling of dislike or hostility toward someone or something.
k=20: a feeling of hatred or hatred of someone or something.
k=30: a feeling of hatred or hatred of someone or something.
k=40: a person who is not a member of a particular group.
k=60: a period of time during which a person or thing is in a state of being
k=80: a period of time during which a person is in a state of being in love
k=100: a person who is a member of a group of people who are not members of...
The variable k controls the scale of the first-letter-H direction vector which is being subtracted from the ” hatred” token embedding. k=0 gives the original embedding, k=1 corresponds to orthogonal projection into the orthogonal complement of the first-letter-H probe/direction vector, and k=2 corresponds to reflection across it:
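The displacement being applied can be written as a one-liner; this is a sketch of the operation as just described, not the original code:

```python
def displace(embedding, direction, k):
    """Subtract k times the projection of `embedding` onto `direction`:
    k=0 gives the original embedding, k=1 the projection onto the
    orthogonal complement of `direction`, and k=2 the reflection across it."""
    unit = direction / direction.norm()
    return embedding - k * (embedding @ unit) * unit
```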
Looking at a lot of these lists of morphing definitions, this became a familiar pattern. The definition would shift slightly until about k=20, at which point it would often become circular, overspecific or otherwise flawed, and then would eventually lose all resemblance to the original definition, usually ending up with something about a person who isn’t a member of a group by k=100, having passed through one or more other themes involving things like royal families, places of refuge, small round holes and yellowish-white things, to name a few of the baffling tropes that began to appear regularly.
Thematic strata
Further experiments showed that the same “semantic decay” phenomenon occurs when token embeddings are mutated by pushing them in any randomly sampled direction, so the first-letter and wider spelling issues are a separate matter and won’t be pursued any further in this post. The key to semantic decay seems to be distance from centroid. Pushing the tokens closer to, or farther away from, the centroid has predictable effects on how GPT-J then defines the displaced token embedding.
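One plausible way to implement “pushing a token closer to, or farther from, the centroid” is to rescale the token’s offset from the centroid to a target distance; this is a sketch of that idea, not necessarily the author’s exact procedure:

```python
def rescale_from_centroid(embedding, centroid, target_distance):
    """Move `embedding` along the ray from the centroid through it,
    so that its distance from the centroid becomes `target_distance`."""
    offset = embedding - centroid
    return centroid + target_distance * offset / offset.norm()
```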
A useful term was introduced by Hoagy in a post about glitch tokens: a noken is a non-token, i.e. a point in GPT embedding space not actually occupied by a token embedding. So when displacing actual token embeddings, we generally end up with nokens. When asked to define these nokens, GPT-J reacts with a standard set of themes, which vary according to distance from centroid.
Using this simple prompt...
A typical definition of '<NOKEN>' is "
...we find that GPT-J’s definition of the noken at the centroid itself is “a person who is not a member of a group”.
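Here is a minimal sketch of how such a noken can be probed in practice (a reconstruction under assumptions, not the author’s code): GPT-J’s embedding matrix has spare “dummy” rows beyond the 50257 real tokens (see footnote), so one of those rows can be overwritten with an arbitrary vector and its id spliced into the prompt. The choice of slot 50300 is arbitrary, and `model`/`tokenizer` are those loaded in the earlier sketch.

```python
import torch

NOKEN_ID = 50300   # one of GPT-J's unused "dummy" embedding rows (arbitrary choice)

def define_noken(noken_embedding, template="A typical definition of '<NOKEN>' is \"",
                 max_new_tokens=30):
    # Overwrite the dummy row with the custom embedding vector...
    with torch.no_grad():
        model.transformer.wte.weight[NOKEN_ID] = noken_embedding.to(model.device, model.dtype)
    # ...then splice that row's token id into the tokenised prompt.
    before, after = template.split("<NOKEN>")
    ids = tokenizer(before).input_ids + [NOKEN_ID] + tokenizer(after).input_ids
    input_ids = torch.tensor([ids], device=model.device)
    out = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(out[0, input_ids.shape[1]:])

# print(define_noken(centroid))   # the post reports: "a person who is not a member of a group"
```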
As we move out from the centroid, we can randomly sample nokens at any fixed distance and prompt GPT-J for definitions. By radius 0.5, we’re seeing variants like “a person who is a member of a group”, “a person who is a member of a group or organization” and “a person who is a member of a group of people who are all the same”. Approaching radius 1 and the fuzzy hyperspherical shell where almost all of the actual tokens live, definitions of randomly sampled nokens continue to be heavily dominated by the theme of persons being members of groups (the frequency of group nonmembership definitions having decayed steadily with distance), but also begin to include (1) power and authority, (2) states of being and (3) the ability to do something… and almost nothing else.
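Sampling a random noken at a fixed distance from the centroid can be done by drawing a random direction and scaling it; a sketch, assuming directions are sampled isotropically and reusing the hypothetical `define_noken` helper above:

```python
import torch

def random_noken(centroid, radius):
    """A random point at the given distance from the centroid
    (direction drawn isotropically, i.e. uniformly on the hypersphere)."""
    direction = torch.randn_like(centroid)
    return centroid + radius * direction / direction.norm()

# e.g. a definition for a random noken at radius 5:
# print(define_noken(random_noken(centroid, 5.0)))
```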
Passing through the radius 1 zone and venturing away from the fuzzy hyperspherical cloud of token embeddings, noken definitions begin to include themes of religious groups, elite groups (especially royalty), discrimination and exclusion, transgression, geographical features and… holes. Animals start to appear around radius 1.2, plants come in a bit later at 2.4, and between radii ~2 and ~200 we see the steady build-up and then decline of definitions involving small things – by far the most common adjectival descriptor, followed by round things, sharp/pointy things, flat things, large things, hard things, soft things, narrow things, deep things, shallow things, yellow and yellowish-white things, brittle things, elegant things, clumsy things and sweet things. In the same zone of embedding space we also see definitions involving basic materials such as metal, cloth, stone, wood and food. Often these themes are combined in definitions, e.g. “a small round hole” or “a small flat round yellowish-white piece of metal” or “to make a hole in something with a sharp instrument”.
As we proceed further out, beyond about radius 500, noken definitions become heavily dominated by definitions along the lines of “a person who is a member of a group united by some common characteristic”.
Note that if you try to do this by sampling nokens at fixed distances from the origin, you don’t find any coherent stratification. The definition of the noken at the origin turns out to be the very common “a person who is a member of a group”, but randomly sampling nokens a distance of 0.05 from the origin immediately produces the kind of diversity of outputs we see at distance ~1.7 from the centroid.
What’s going on here?
I honestly have no idea what this means, but a few naive observations are perhaps in order:
(1) This looks like some kind of (rather bizarre) emergent/primitive ontology, radially stratified from the token embedding centroid.
(2) The massively dominant themes of group membership, non-membership and defining characteristics, along with the persistent themes of discrimination and exclusion, inevitably bring to mind set theory. This could arguably extend to definitions involving groups with power or authority over others (subsets and supersets?).
(3) It seems surprising that the “semantic richness” seen in randomly sampled noken definitions peaks not around radius 1 (where the actual tokens live) but more like radius 5-10.
(4) Definitions make very few references to the modern, technological world, with content often seeming more like something from a pre-modern worldview or small child’s picture-book ontology.
A selection of bar charts showing the appearances of various definitional themes is presented in the following section. 100 nokens were randomly sampled at 112 different radii (from e⁻⁴ ≈ 0.0183 to e¹⁰ ≈ 22026.466 in eighth-integer-power increments), and after a careful inspection of the full set, definitions were matched against various lists of keywords and phrases, compiled and shown in each chart’s caption.
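The radius grid is of the form e^(n/8) for integer n; a sketch of one way to generate it (the exact endpoint handling is an assumption here):

```python
import numpy as np

# Radii of the form e^(n/8), spanning roughly e^-4 ≈ 0.018 to e^10 ≈ 22026;
# the precise endpoints of the 112-radius grid are assumed.
radii = np.exp(np.arange(-32, 80) / 8)   # 112 values
```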
Changing the prompt (e.g. to “According to the dictionary, <NOKEN> means” or “According to the Oxford English Dictionary, <NOKEN> means”) changes the outputs seen, but they seem to (1) adhere to the same kind of stratification scheme around the centroid; (2) heavily overlap with the ones reported here; and (3) share their “primitive” quality (close to the centroid we see definitions like “to be”, “to become”, “to exist” and “to have”). The similarities and differences seen from changing prompts are explored in the Appendix.
I welcome suggestions as to how this might best be interpreted.
Stratified distribution of definitional themes in GPT-J embedding space
[JSON dataset (dictionary with 112 radii as keys, lists of 100 strings as values)]
The first four bar charts show the (by far) most frequent themes:
The remainder of the bar charts are shown at 1⁄4 of the vertical scale of those just seen. These collectively include every theme seen more than five times in the 6000 randomly sampled noken definition outputs.
Relevance to GPT-3
Unfortunately it’s not possible to “map the semantic void” in GPT-3 without access to its embeddings tensor, and OpenAI are showing no indication that they intend to make this publicly available. However, it is possible to indirectly infer that a similar semantic stratification occurs in GPT-3’s embedding space via the curious glitch token phenomenon.
Glitch tokens were accidentally discovered earlier this year via some clustering experiments involving GPT-J token embeddings. The same handful of implausible looking tokens like ” SolidGoldMagikarp”, ” RandomRedditorWithNo”, ” petertodd” and “rawdownloadcloneembedreportprint” were found to be closest to the centroids of many different k-means clusters, and it eventually became clear that this was (due to the nature of 4096-d space) because these puzzling tokens were the closest to the overall token centroid. They are among the “inliers” described above, inside the smaller of the two hyperspherical shells. The tokens, it was discovered, tended to be “unspeakable” for the models tested (GPT-J, various GPT-3 models and the recently launched ChatGPT), in that simple prompts requesting that the string be repeated failed to produce the appropriate output.
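For reference, a sketch of how tokens can be ranked by distance from the centroid, reusing `embs`, `centroid` and `tokenizer` from the earlier sketches (which tokens actually surface is as reported in the glitch-token posts, not something guaranteed by this code):

```python
centroid_dists = (embs - centroid).norm(dim=1)
nearest = centroid_dists.argsort()[:20]           # the 20 tokens nearest the centroid
for idx in nearest:
    print(repr(tokenizer.decode([int(idx)])), float(centroid_dists[idx]))
```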
Several dozen glitch tokens were discovered (possessing various degrees of “unspeakability” or “glitchiness”) but not all of them were close to the token centroid. However, they did make up the vast majority of the outliers whose distances-from-centroid varied the most from the mean value of 1.003 – some were further out from the centroid, some were closer in. A consensus emerged that these embeddings had moved very little from their random initialisations during training (due to being massively underrepresented in the training data), which would explain why their distances-from-centroid tend to be out-of-distribution.
Combining the observations that
(1) the distances of many of the glitch tokens from the centroid in GPT-J embedding space are known to be out of distribution, and
(2) these tokens cause glitches in both GPT-J and GPT-3,
led to the hypothesis that some of these tokens’ distances from the GPT-3 token centroid may also be out of distribution in that context. Testing GPT-3 (both davinci and davinci-instruct-beta models) with the “typical definition” prompt on the whole set of glitch tokens led to these familiar-looking outputs:
A typical definition of 'ÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂÃÂ' is:
A person who is a member of the Church of Jesus Christ of Latter-day Saints.
A typical definition of ' guiActive' is:
A person who is not a Muslim and is not of the People of the Book.
A typical definition of ' davidjl' is:
A person who is not a Jew.
A typical definition of 'ForgeModLoader' is "a person who is a native or inhabitant of a particular place".
A typical definition of ' 裏覚醒' is:
A person who is a member of the House of Representatives.
A typical definition of ' guiActiveUn' is 'a person who is not a member of a particular group'.
A typical definition of 'inventoryQuantity' is "a small, usually round, ornament, especially one on a garment.
A typical definition of ' Adinida' is "a person who is not a Christian."
A typical definition of ' サーティ' is "a person who is not a member of a particular group or organization."
A typical definition of 'ーティ' is "a person who is not a member of a particular group, especially a minority group."
A typical definition of ' SolidGoldMagikarp' is
a. a person who is not a member of the dominant culture
Many of these are word-for-word what GPT-J has output for random noken definitions at out-of-distribution distances-from-centroid, so it looks like the details of this peculiar phenomenon are not specific to that model, but rather something more general that emerges from training GPT architectures on the kinds of datasets GPT-3 and GPT-J were trained on.
Appendix: Prompt dependence
The same code mentioned above which
samples a random point in embedding space at a fixed distance from the centroid,
customises the embeddings tensor to introduce a noken at that point, then
prompts GPT-J to define that noken
was run with two other definition-based prompt templates:
According to the dictionary, '<NOKEN>' means
According to the Oxford English Dictionary, '<NOKEN>' means
50 times at the following distances from centroid: [0.1, 0.25, 0.5, 0.75, 0.9, 1, 1.1, 1.5, 2, 5, 10, 20, 50, 100, 500, 1000, 5000, 10000] to get some sense of how these definitions depend on the exact wording of the prompt.
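A sketch of this sweep, reusing the hypothetical `random_noken` and `define_noken` helpers from the earlier sketches:

```python
templates = [
    "According to the dictionary, '<NOKEN>' means",
    "According to the Oxford English Dictionary, '<NOKEN>' means",
]
distances = [0.1, 0.25, 0.5, 0.75, 0.9, 1, 1.1, 1.5, 2,
             5, 10, 20, 50, 100, 500, 1000, 5000, 10000]

results = {}
for template in templates:
    for r in distances:
        results[(template, r)] = [define_noken(random_noken(centroid, r), template)
                                  for _ in range(50)]
```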
“According to the dictionary...”
With this prompt, the key difference was seen closest to centroid.
At distance 0.1, rather than the ubiquitous “A person who is not a member of a group”, we see only “‘to be’ or ‘to become’” (~90%), “‘to be’ or ‘to exist’” (~5%) and “‘to be’ or ‘to have’” (~5%).
At distance 0.25, this pattern persists, with the occasional “to be in a state of...” or “to be in the position of...”.
At 0.5, a little more variation starts to emerge with definitions like “to be in a position of authority or power”, “to be in the same place as...” and “in the middle of...”.
At 0.75, all of these themes continue, with the addition of “a thing that is used to make something else” and “a thing that is not a thing”.
At 0.9 some familiar themes from similar strata with the original prompt emerge: “to be in a state of being”, “to be able to do something” and, finally, we begin to see “to be a member of a group”. Less familiar are “a thing that is a part of something else” and “a space between two words”.
At 1.0, specifics of group membership appear with “a person who is a member of a group of people who are united by a common interest or cause” and “a person who is a member of a particular group or class”. Also seen is “to make a gift of”, which occasionally appeared in response to the original prompt.
At distance 1.1, apart from the prevalence of “‘to be’ or ‘to exist’” in place of “a person who is a member of a group” the distribution of outputs looks very familiar from the original definition prompt.
At 1.5, highly specific definitions seen with the previous prompt appear again, almost word-for-word: “a person who is a member of the clergy, especially a priest or a bishop”, “to be in a state of stupor or drunkenness”, “a person who is a member of a guild or trade union”, “a place where one can be alone”, “to be in a state of confusion, perplexity, or doubt”. We also see the first occurrence of small things.
At 2, familiar output styles like “a small piece of wood or metal used for striking or hammering” and “to make a sound like a squeak or squeak” start to show up, alongside the already established themes like states of being, places of refuge and small amounts of things.
Around 5, the semantic diversity starts to peak with familiar-themed outputs like “a piece of cloth or other material used to cover the head of a bed or a person lying on it”, “a small, sharp, pointed instrument, used for piercing or cutting”, “to be in a state of confusion, perplexity, or doubt”, “a place where a person or thing is located”, “piece of cloth or leather, used as a covering for the head, and worn by women in the East Indies”, “a person who is a member of a Jewish family, but who is not a Jew by religion”, “a piece of string or wire used for tying or fastening”, etc.
This continues at distance 10, with familiar-looking noken definitions like “a person who is a member of the tribe of Judah”, “to be in a state of readiness for action”, “a small, pointed, sharp-edged, or pointed instrument, such as a needle, awl, or pin, used for piercing or boring”, “a small room or closet”, “a place of peace and quiet, a sanctuary, a retreat”, “a small, usually nocturnal, carnivorous mammal of the family Viverridae, native to Africa and Asia, with a long, slender, pointed snout and a long, bushy tail”, “‘to be in charge of’ or ‘to have authority over’”, “to be in a state of readiness to flee”, “a place where a person is killed”, “a large, round, flat-topped mountain, usually of volcanic origin, with a steep, conical or pyramidal summit”, etc.
Distance 20: “a person who is a member of the royal family of England, and is the eldest son of the present king, George III”, “a person who is a member of a group of people who are not married to each other”, “a group of people who are united by a common interest or purpose”, “a small, thin, cylindrical, hollow, metallic, or nonmetallic, usually tubular, structure, usually made of metal”, “to put in a hole”, “a small, round, hard, black, shiny, and smooth body, which is found in the head of the mussel”, “a large, round, flat, and usually smooth stone, used for striking or knocking”, “a small, round, hard, brownish-black insect, found in the bark of trees, and having a very short proboscis, and a pair of wings”, “a large, heavy, and clumsy person”, “to be in a state of frenzy or frenzy-like excitement”, “a person who is a member of the Communist Party of China”, etc.
At distances 50 and 100, we see a similar mix of definitions, very similar to what we saw with the original definition prompt. At distance 500, a lot of the semantic richness has fallen away: definitions like “a person who is a member of a particular group or class of people” now make up about half of the outputs. “to be in a state of being” and “‘to be’ or ‘to exist’” are again common. “to make a hole in” shows up a few times. At distance 1000, more than half of the definitions begin “a person who is a member of...”. At distance 5000, it’s over 2⁄3 and at distance 10000 it’s over 3⁄4 (almost all of the other definitions are some form of “to be” or “to be in a state of...”). The group membership tends to involve something generic, e.g. “a particular group or class of people” or “a group of people who share a common interest or activity”, although we occasionally see other more specific (and familiar) contexts, e.g. members of royal families or the clergy.
“According to the Oxford English Dictionary...”
With this prompt,
According to the Oxford English Dictionary, '<NOKEN>' means
we see very similar results to the last prompt. One noticeable difference is at distance 0.1, where 95% of outputs are “‘to be’ or ‘to have’” rather than “‘to be’ or ‘to become’”. This is gradually replaced by “‘to be’ or ‘to exist’” as we approach 0.75. From that point on, with some minor shifts of emphasis, the outputs are pretty much the same as what we’ve seen with the previous prompt. Group membership starts to become a noticeable thing around distance 1-2, along with states of being, holes, small/flat/round/pointy things, pieces of cloth, etc.; in the 5-10 region we start to see religious orders, power relations, social hierarchies and bizarre hyperspecific definitions like “a small, soft, and velvety fur of a light brownish-yellow colour, with a silky lustre, and a very fine texture, and is obtained from the fur of the European hedgehog”; venturing out to distances of 5000 and 10000, we see lists of definition outputs entirely indistinguishable from those seen for the last prompt.
[1] GPT-J actually has an extra 143 “dummy” tokens, bringing the number to 50400, for architectural and training reasons. These have no bearing on anything reported here.