Noosphere89 comments on Why I’m Working On Model Agnostic Interpretability

Noosphere89 12 Nov 2022 19:16 UTC
2 points
−2
I’m unsure about this, because if you’re not black-boxing things, then you think that something specific lies in that structure. And that specificity is what makes it no longer agnostic to model choice.

You have to black box if you want maximally general insights.
- β-redex 13 Nov 2022 0:54 UTC
  2 points
  0
  I think we usually don’t generalize very far not because we don’t have general models, but because it’s very hard to state any useful properties about very general models.
  
  You can trivially view any model/agent as a Turing machine, without loss of generality.^[1] We just usually don’t do that, because it’s very hard to state anything useful about such a general model of computation. (It seems very hard to prove or disprove P=NP, we know for a fact that the halting problem is undecidable, etc.)
  
  I am very interested, though, in what model John will use to state useful theorems that capture both the current DL paradigm and, with high probability, the next paradigm. (He might have written about this somewhere already; I haven’t read all his stuff yet.)
  ^[1] Assuming determinism, but OP’s black-box interpretability stuff already seems to assume that.
  - tailcalled 17 Nov 2022 12:16 UTC
    3 points
    0
    I think he addressed it in “Don’t Get Distracted By The Boilerplate”.
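
For readers who want β-redex's remark about Turing machines made concrete, here is a minimal Python sketch of the standard diagonal argument behind the undecidability of halting. Everything in it is an illustrative assumption rather than something from the thread: programs are modelled as generator functions where each `yield` counts as one step, and the names `make_paradox`, `halts_within`, and `is_refuted` are hypothetical. A step budget stands in for "runs forever", so a loop is only observed up to that budget, not proven.

```python
from typing import Callable, Iterator

# Assumed toy model: a program is a generator function, one `yield` per step.
Program = Callable[[], Iterator[None]]
# A decider claims to answer "does this program halt?" for any program.
Decider = Callable[[Program], bool]


def make_paradox(halts: Decider) -> Program:
    """Build a program that does the opposite of what `halts` predicts for it."""
    def paradox() -> Iterator[None]:
        if halts(paradox):
            while True:  # decider said "halts", so loop forever
                yield
        # decider said "loops", so fall through and halt immediately
    return paradox


def halts_within(program: Program, budget: int) -> bool:
    """Run `program` for at most `budget` steps; True if it finished in time."""
    steps = program()
    for _ in range(budget):
        try:
            next(steps)
        except StopIteration:
            return True
    return False


def is_refuted(halts: Decider, budget: int = 1000) -> bool:
    """True if the decider's prediction about its own paradox program is wrong."""
    paradox = make_paradox(halts)
    predicted = halts(paradox)
    observed = halts_within(paradox, budget)
    return predicted != observed


def always_yes(program: Program) -> bool:
    return True   # claims every program halts


def always_no(program: Program) -> bool:
    return False  # claims no program halts


print(is_refuted(always_yes), is_refuted(always_no))
```

Both constant deciders are wrong about the program built from them, and the same construction defeats any other candidate. That is the sense in which a fully general model of computation supports very few useful guarantees.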