You can also ask it what they have in common. Most of them are strings found in computer code, which supports the author’s hypothesis that “[m]any of the anomalous tokens look like they may have been scraped from backends of e-commerce sites, Reddit threads, Twitch streams, etc. – sources which may well have not been included in the training corpuses”.
Me: What do the following tokens have in common? [‘ForgeModLoader’, …, ‘ strutConnector’]
ChatGPT: These tokens appear to be mostly strings used in some type of computer programming or code, such as in HTML, Markdown, or a programming language.