Mechanistic interpretability research in a similar vein to the work of Chris Olah and David Bau, but with less of a focus on circuits-style interpretability and more of a focus on research whose insights can scale to models with many billions of parameters and larger. Some example approaches might be:
- Using deep learning to automate deep learning interpretability: for example, training a language model to give semantic labels to neurons or other internal circuits (see the sketch after this list).
- Studying the high-level algorithms that models use to perform, e.g., in-context learning or prompt programming.
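As a rough illustration of the first approach, here is a minimal sketch of LLM-assisted neuron labeling: gather the dataset snippets that most strongly activate a given neuron, then ask a language model for a short semantic label. Every name here (NeuronRecord, build_labeling_prompt, query_llm) is a hypothetical placeholder, not anything from the announcement post; a real pipeline would supply actual activation data and a real model-querying function.

```python
# Minimal sketch of LLM-assisted neuron labeling (all helper names are hypothetical).
# Idea: show a language model the text snippets that most strongly activate a neuron
# and ask it to propose a short semantic label for that neuron.

from dataclasses import dataclass


@dataclass
class NeuronRecord:
    layer: int
    index: int
    top_snippets: list[str]  # dataset excerpts with the highest activation for this neuron


def build_labeling_prompt(record: NeuronRecord) -> str:
    """Format the top-activating snippets into a prompt asking for a one-line label."""
    examples = "\n".join(f"- {s}" for s in record.top_snippets)
    return (
        f"The following text excerpts most strongly activate neuron "
        f"{record.index} in layer {record.layer} of a language model:\n"
        f"{examples}\n\n"
        "In a short phrase, what concept or pattern does this neuron appear to detect?"
    )


def label_neuron(record: NeuronRecord, query_llm) -> str:
    """query_llm is any callable that sends a prompt string to a language model
    and returns its text response (e.g. a thin wrapper around a chat API)."""
    return query_llm(build_labeling_prompt(record))


if __name__ == "__main__":
    record = NeuronRecord(
        layer=12,
        index=4321,
        top_snippets=[
            "The invoice is due on March 3rd, 2021.",
            "Her birthday falls on July 14th every year.",
            "The deadline was moved to October 1st.",
        ],
    )
    # Stub LLM so the sketch runs without an API key; in practice this would call a real model.
    print(label_neuron(record, query_llm=lambda prompt: "dates / calendar references"))
```

Keeping the model call behind a plain callable keeps the sketch runnable on its own and makes it easy to swap in whichever chat-completion interface is actually used.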
In their announcement post they mention:
Thanks!