Hoagy comments on What’s up with LLMs representing XORs of arbitrary features?

Hoagy 14 Jan 2024 8:02 UTC
1 point
0
Yeah I’d expect some degree of interference leading to >50% success on XORs even in small models.
- Clément Dumas 16 Jan 2024 10:34 UTC
  1 point
  0
  You can get ~75% just by computing the OR. But we found that it only achieves better than 75% at the last layer and at step 16000 of Pythia-70m training; see this video
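
The ~75% baseline in the reply above can be checked with a few lines. This is a minimal sketch, assuming the two features are binary, independent, and each true half the time, so that all four input combinations are equally likely; the function name `or_accuracy_on_xor` is made up for illustration and is not from the thread.

```python
from itertools import product


def or_accuracy_on_xor():
    """Fraction of the four (a, b) boolean inputs where OR equals XOR.

    Assumes all four combinations are equally likely (balanced,
    independent features); this is an assumption, not from the thread.
    """
    inputs = list(product([False, True], repeat=2))
    # OR and XOR disagree only when both inputs are True.
    matches = sum((a or b) == (a != b) for a, b in inputs)
    return matches / len(inputs)


accuracy = or_accuracy_on_xor()
print(accuracy)  # 0.75
```

Under that assumption, a probe that only recovers the OR of two features already scores 75% on the XOR labels, so only accuracy above that level is evidence of anything beyond OR.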