RHollerith comments on Is it Legal to Maintain Turing Tests using Data Poisoning, and would it work?

RHollerith 7 Sep 2024 14:12 UTC
1 point
0
There is a trend toward simplifying model architectures. For example, AlphaGo Zero is simpler than AlphaGo in that it was created without using data from human games. AlphaZero in turn is simpler than AlphaGo Zero (in some way that I cannot recall right now).

Have you tried to find out whether any of the next-generation LLMs (or “transformer-based models”) being trained now even bothers to split text into tokens?
- Double 8 Sep 2024 18:06 UTC
  1 point
  0
  Good point, I didn’t know about that, but yes, that is yet another way that LLMs will pass the spelling challenge. For example, this paper uses letter triples instead of tokens: https://arxiv.org/html/2406.19223v1
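
To make the "letter triples" idea concrete, here is a minimal sketch. It assumes the triples are overlapping three-character windows over each word, and the helper name `letter_triples` and the `_` padding character are my own illustrative choices, not taken from the linked paper. The point is only that every letter of the word stays visible to the model, which is why a spelling question becomes easier to answer.

```python
def letter_triples(word: str, pad: str = "_") -> list[str]:
    """Split a word into overlapping three-character windows.

    The pad character marks the word boundaries; it is an assumption
    made for this illustration, not the paper's exact scheme.
    """
    padded = f"{pad}{word}{pad}"
    # Slide a window of width 3 across the padded word.
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


if __name__ == "__main__":
    triples = letter_triples("strawberry")
    print(triples)
    # Each interior letter appears as the middle character of exactly
    # one triple, so the spelling can be read back directly.
    print("".join(t[1] for t in triples))
```

With a subword tokenizer, by contrast, a word may arrive as one or two opaque integer IDs, and the individual letters are not directly represented in the input.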