John Simons

Karma: 18

John Simons 6 Feb 2023 15:22 UTC
19 points
in reply to: vitaliya’s comment on: SolidGoldMagikarp (plus, prompt generation)
What is quite interesting about that dataset is that it contains strings of the form “*number*|*weirdstring*|*number*”, which I remember seeing in some methods of training LLMs, i.e. with “|” used as a delimiter between tokens. They could be poisoned training examples, or they could have some weird effect in retrieval.
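
For anyone who wants to check how common that shape is, here is a rough sketch of a scan for it. It assumes the numbers are plain decimal digits and that the middle part contains no “|”; the function name and the sample lines are made up for illustration and are not taken from the dataset.

```python
import re

# Assumed shape (not confirmed against the dataset):
# digits, a pipe, any run of non-pipe characters, a pipe, digits.
PATTERN = re.compile(r"^(\d+)\|([^|]+)\|(\d+)$")

def find_delimited(lines):
    """Return (left, middle, right) tuples for lines matching the assumed shape."""
    hits = []
    for line in lines:
        m = PATTERN.match(line.strip())
        if m:
            hits.append((int(m.group(1)), m.group(2), int(m.group(3))))
    return hits

# Invented sample lines, only to show the call.
sample = ["12|weirdstring|345", "not a match", "7|foo bar|8"]
print(find_delimited(sample))
```

If the real entries use a different number format or allow “|” inside the middle field, the regex would need adjusting.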