Cleo Nardo comments on The Waluigi Effect (mega-post)

Cleo Nardo 3 Mar 2023 13:44 UTC
4 points
2
Yep, this sounds like a promising idea. Maybe connected to Christiano’s ELK.
- Joseph Bloom 5 Mar 2023 18:50 UTC
  2 points
  1
  Because of superposition, I would be very surprised if complex high-level behavior were strongly mediated by a single neuron. Engineering polysemanticity (“making it depend on many different neurons”) feels like the flip side of engineering monosemanticity, so you might want to read Adam Jermyn’s post on the topic.