Roman Leventov comments on Charbel-Raphaël and Lucius discuss interpretability

Roman Leventov 21 Nov 2023 22:39 UTC
8 points
4
I agree with you, but it’s not clear that, in the absence of explicit regularisation, DNNs, and LLMs in particular, will compress to the degree that they become intelligible (interpretable) to humans. That is, their effective dimensionality might be reduced from 1T to 100M or so, but that would still be far too much for humans to comprehend. Explicit regularisation drives this effective dimensionality down.
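
To make the last point concrete, here is a toy sketch of my own, not a claim about how LLMs actually behave. It fits a small linear model twice with proximal gradient descent, once without a penalty and once with an explicit L1 penalty, and counts the non-zero weights as a very crude stand-in for effective dimensionality. The data, the helper names (`fit_linear`, `count_active`) and the penalty strength are all assumptions chosen for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink towards zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_linear(X, y, l1_strength, step=0.1, iterations=500):
    """Least squares via proximal gradient descent (ISTA).

    With l1_strength=0.0 this reduces to plain gradient descent.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iterations):
        residuals = [
            sum(wj * xj for wj, xj in zip(w, row)) - target
            for row, target in zip(X, y)
        ]
        grad = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(d)]
        w = [
            soft_threshold(wj - step * gj, step * l1_strength)
            for wj, gj in zip(w, grad)
        ]
    return w


def count_active(w, tol=1e-6):
    """Number of weights that are not (numerically) zero."""
    return sum(1 for wj in w if abs(wj) > tol)


# Hypothetical toy data: 20 features, only 3 of which actually matter.
rng = random.Random(0)
n_samples, n_features = 100, 20
true_w = [3.0, -2.0, 1.5] + [0.0] * (n_features - 3)
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [sum(wj * xj for wj, xj in zip(true_w, row)) + rng.gauss(0, 0.5) for row in X]

dense_count = count_active(fit_linear(X, y, l1_strength=0.0))
sparse_count = count_active(fit_linear(X, y, l1_strength=0.2))

print("active weights without explicit regularisation:", dense_count)
print("active weights with an explicit L1 penalty:", sparse_count)
```

The unregularised fit keeps small non-zero weights on the irrelevant features, while the penalised fit sets most of them exactly to zero. Whether anything like this carries over to networks with billions of parameters is exactly the open question; the sketch only shows the mechanism by which an explicit penalty pushes the count down.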