Maybe worth thinking about this in terms of different examples:
NN detecting the presence of tanks just by the brightness of the image (possibly apocryphal, per Gwern; sketched in code below)
NN recognising dogs vs cats as part of an ImageNet classifier that would class a piece of paper with ‘dog’ written on it as a dog
GPT-4 able to describe an image of a dog/cat in great detail
Computer doing matrix multiplication.
The range of cases in which the equivalence between what the computer is doing and our high-level description holds grows as we go down this list, and depending on which cases are salient, it becomes more or less explanatory to say that the algorithm is doing task X.
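To make the tank case concrete, here's a minimal sketch (toy data, hypothetical threshold, assuming only NumPy) of a ‘classifier’ that tracks image brightness rather than tanks, so the description ‘it detects tanks’ only holds while brightness and tanks happen to correlate:

```python
import numpy as np

# Toy setup: "images" are arrays of pixel intensities in [0, 1].
# In the (possibly apocryphal) story, all tank photos were taken on
# sunny days, so mean brightness perfectly separated the training classes.
BRIGHTNESS_THRESHOLD = 0.5  # assumed value, "fitted" to the training set

def detects_tank(image: np.ndarray) -> bool:
    """Claims to detect tanks; actually thresholds mean brightness."""
    return image.mean() > BRIGHTNESS_THRESHOLD

rng = np.random.default_rng(0)
sunny_tank = rng.uniform(0.6, 1.0, size=(32, 32))    # bright, has a tank
cloudy_empty = rng.uniform(0.0, 0.4, size=(32, 32))  # dark, no tank
cloudy_tank = rng.uniform(0.0, 0.4, size=(32, 32))   # dark, HAS a tank

print(detects_tank(sunny_tank))    # True  -- looks like tank detection
print(detects_tank(cloudy_empty))  # False -- still looks like tank detection
print(detects_tank(cloudy_tank))   # False -- 'detects tanks' breaks here
```

On-distribution, ‘the machine detects tanks’ and ‘the machine thresholds brightness’ pick out the same behaviour; off-distribution, only the second description survives.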
Yeah, I broadly agree.
My claim is that the deep metaphysical distinction is between “the computer is changing transistor voltages” and “the computer is multiplying matrices”, not between “the computer is multiplying matrices” and “the computer is recognising dogs”.
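As a toy illustration of that leap (a minimal sketch in plain Python plus NumPy; nothing here comes from the original post): the description ‘it multiplies matrices’ is pitched at the input-output level, so it fits equally well whether the computation is realised as a scalar loop or as an optimised library call, even though the transistor-level story underneath is entirely different in each case:

```python
import numpy as np

def matmul_loops(a, b):
    """Matrix multiplication spelled out as scalar operations --
    one level of description closer to 'what the machine does'."""
    n, k = len(a), len(b[0])
    out = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for j in range(k):
            for m in range(len(b)):
                out[i][j] += a[i][m] * b[m][j]
    return np.array(out)

a = np.random.rand(3, 4)
b = np.random.rand(4, 2)

# Two very different low-level stories (Python bytecode vs. optimised
# BLAS kernels, and transistor voltages under both), one high-level
# description that fits both: "the computer is multiplying matrices".
assert np.allclose(matmul_loops(a, b), a @ b)
```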
Once we move to a language game in which “the computer is multiplying matrices” is appropriate, we are appealing to something like the X-Y Criterion for assessing these claims.
The sentences are more true the tighter the abstraction is (see the sketch after this list):
The machine does X with greater probability.
The machine does X within a larger range of environments.
The machine has fewer side effects.
The machine is more robust to adversarial inputs.
Etc.
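One way to make a couple of those bullets operational (a hedged sketch; every name here is hypothetical scaffolding, not anything from the post): estimate how often ‘the machine does X’ holds across a distribution of environments, and read the frequency as the tightness of the abstraction:

```python
import random

def abstraction_fit(machine, does_x, environments):
    """Empirical frequency with which 'machine does X' holds over the
    sampled environments -- a crude proxy for how tight the abstraction is."""
    results = [does_x(env, machine(env)) for env in environments]
    return sum(results) / len(results)

# Hypothetical usage: a 'doubler' that silently fails on large inputs,
# so "this machine doubles numbers" is a slightly loose abstraction.
machine = lambda n: n * 2 if abs(n) < 900 else 0
does_x = lambda n, out: out == 2 * n
envs = [random.randint(-1000, 1000) for _ in range(10_000)]
print(abstraction_fit(machine, does_x, envs))  # ~0.9: a mostly-true description
```

The adversarial-robustness bullet is the same check with the environments chosen by an adversary rather than sampled at random.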
But SOTA image classifiers are better at recognising dogs than humans are, so I’m quite happy to say “this machine recognises dogs”. Sure, you can generate adversarial inputs, but you could probably do that to a human brain as well if you had an upload.
Hmm, yeah, there are clearly two major points:
The philosophical leap from voltages to matrices, i.e. allowing that a physical system could ever be ‘doing’ high-level description X. This is a bit weird at first but also clearly true as soon as you start treating X as having a specific meaning in the world, as opposed to just being a thing that occurs in human mind space.
The empirical claim that this high-level description X fits what the computer is doing.
I think the pushback to the post is best framed in terms of which frame works best for talking to people who deny that it’s ‘really doing X’. As a matter of rhetorical strategy and good-quality debate, I think the right tactic is to get the first point mutually acknowledged in the most sympathetic case, and then have a more productive conversation about the extent of the correlation. Aggressive statements of ‘it’s always actually doing X if it looks like it’s doing X’ are probably unhelpful and become a bit of a scissor. (Memetics over usefulness, har har!)