Mateusz Bagiński comments on Using (Uninterpretable) LLMs to Generate Interpretable AI Code

Mateusz Bagiński 5 Jul 2023 16:07 UTC
1 point
0
I mostly second Beren’s reservations, but given that current models can already improve sorting algorithms in ways that didn’t occur to humans (ref), I think it’s plausible that they could prove useful in generating algorithms for automating interpretability and the like. E.g., some elaboration on ACDC, or ROME, or MEMIT.
- Joar Skalse 8 Jul 2023 8:48 UTC
  3 points
  2
  Note that this proposal is not about automating interpretability.