This is an excellent post! Thank you for sharing your thoughts! I too am very curious about many of these questions, although I’m also at a half-baked stage with a lot of it (I’d also love to have a better footing here!). But in any case, here are some thoughts (in no particular order).
1. I’ve been interested in the questions you pose around AlexNet for a while, in particular, how much computation is a function of observers assigning values versus an intrinsic property of the thing itself. And I agree this starts getting pretty weird and interesting when you consider that minds themselves are doing computations. Like, it seems pretty clear that if I write down a truth table on paper, it is not the paper or ink that “did” the computation, it was me. Likewise, if I take two atoms in a rock, call one 0, the other 1, then take an atom at a future state, call it 0, it seems clear that the computation “AND” happened entirely in my head and I projected it onto the rock (although I do think it’s pretty tricky to say why this is, exactly!). But what about if I rain marbles down on a circle inscribed in a square (where the fraction of marbles landing inside the circle “calculates” pi, up to a factor of 4)? In this case it feels a bit less arbitrary: the circle and the square chalked on the ground relate “meaningfully” to the computation, although it is still me doing the bulk of the work (taking the ratio). This feels more like a middle ground to me. In any case, I do think there is a spectrum between “completely intrinsic to the thing” and “agents projecting their own computation onto the thing,” and that this spectrum is largely ignored but incredibly interesting and instructive for how computation actually works.
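To make the marble picture concrete, here’s a minimal Monte Carlo sketch (Python; the names are mine, not anything from the post). The loop is the “physics” of marbles scattering over the square; the last two lines are the “taking the ratio” step that, in the chalk-on-the-ground version, happens entirely in my head.

```python
import random

def drop_marbles(n_marbles=100_000, seed=0):
    """Scatter points uniformly over the square [-1, 1] x [-1, 1] and
    count how many land inside the inscribed unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_marbles):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    return inside, n_marbles

# The "observer" step: interpreting the counts as an estimate of pi.
inside, total = drop_marbles()
print(4 * inside / total)  # area ratio circle/square = pi/4, so ~3.14
```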
2. Relatedly, people often roll their eyes at the Chinese Room thought experiment (and rightly so, because I think the conclusions people draw from it with respect to AI are often misguided). But I also think it’s pointing to a deep confusion about computation, one that I share. The standard take is that, okay, maybe the person doesn’t understand Chinese, but the “room does,” because all of the information is contained inside it. I’m not really convinced by this. For the same reason that the truth table isn’t “doing” the computation of AND, I don’t think the book that contains the translation is doing any meaningful computation, and I don’t think the human inside understands Chinese either (in the colloquial sense we mean when we say we don’t understand a foreign language). There was certainly understanding when that book was generated, but all of that generative machinery is absent from the room. So I think Searle is pointing at something interesting and informative here, and I tentatively agree that the room does not understand Chinese (although I disagree with the conclusion that this means AI could never understand anything).
3. I do agree that input/output mappings are not a good mechanistic understanding of computation, but I would also guess that they are the right level of abstraction for grouping different physical systems. E.g., the main similarity between the mechanical and the electrical adder is that, upon receiving 1 and 1, both output 2, and so on.
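As a toy version of “same input/output mapping, different mechanism” (purely illustrative, nothing from the post): two functions that agree on every input in their shared domain but do very different work internally, the way the mechanical and electrical adders do.

```python
def adder_lookup(a: int, b: int) -> int:
    # "Mechanical" adder stand-in: a fixed table for small inputs,
    # like gears that can only sit in a handful of positions.
    table = {(x, y): x + y for x in range(4) for y in range(4)}
    return table[(a, b)]

def adder_ripple(a: int, b: int) -> int:
    # "Electrical" adder stand-in: ripple-carry addition on bits,
    # like logic gates propagating a carry signal.
    while b:
        carry = (a & b) << 1
        a, b = a ^ b, carry
    return a

# What groups them together is the input/output mapping itself.
assert all(adder_lookup(a, b) == adder_ripple(a, b)
           for a in range(4) for b in range(4))
```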
4. I get confused about why minds have special status, e.g., “computation is a function of both the dynamics and an observer.” On the one hand, it feels intuitive that they are special, and I get what you mean. On the other hand, minds are also just physical systems. What is it about a mind that makes something a computation when it wasn’t one otherwise? Is it something about how much of the computation stems from the mind versus the device? And how “entangled” the mind is with the computation, e.g., whether states in the non-mind system are correlated with states in the mind? Which suggests that the key thing is not exactly “mind-ness” but how “coupled” various physical states are to each other.
5. I also think that the adder systems are far less observer-dependent (maybe not observer-dependent at all), relative to the rock or the truth table, in the sense that there is a series of physically coupled states (within the system itself) which reliably turns the same input into the same output. Like, there is this step of a person saying what the inputs “represent,” but once the device is built, the person’s mind does not need to be entangled with the states in the machine in order for it to do the computation. The representation step seems important, but it seems less about the computation itself and more about how that computation is used. I think that when we look at isolated cases of computation (like the adders), this part feels weird because computation (as it normally plays out) is part of an interconnected system which “uses” the outputs of various computations to “do” something (like in a standard computer, where the output of addition might be the input to the forward-prop in a neural net or whatever). A “naked” computation is strange, because usually the “sense-making” aspect of a computation is in how it’s used, not in the steps needed to produce it. To be clear, I think the representation step is interesting (and notably the thing lacking in the Chinese Room), and I do think it’s part of how computation is used in real-world contexts, but I still want to say that the adder “adds” whether or not we are there to represent the inputs as numbers. Maybe similar to how I want to say that the Chinese Room “translates Chinese” whether or not anyone is there to do the “semantic work” of understanding what that means (which, in my view, is not a spooky thing, but rather something-something “a set of interconnected computations”).
6. Maybe a good way to think of these things is to ask “how much mind entanglement do you need at various parts of this process in order for the computation to take place?”
7. My guess is that computation is fundamentally something like “state reliably changes in response to other state,” where both words (“state” and “reliably”) are a bit tricky to fully pin down and there are a bunch of thorny philosophical issues. For instance, “reliably” means something like “if I input the first state a bunch of times, the next state almost always follows,” but if-thens are hard to reconcile with deterministic worldviews. And “state” typically refers to something abstract, e.g., we say “if the protein changes to this shape, then this gene is expressed,” but what exactly do we mean by “shape”? There is not a single, precise shape that works; there’s a whole class of shapes (slight perturbations, different molecular constituents, etc.) that will “get the job done,” i.e., express the gene. And without a good foundation for what we mean by an abstraction, I think talking about natural computation can be philosophically difficult.
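One way to picture the “class of shapes” point (purely illustrative, with made-up names): the reliable if-then holds at the level of an abstraction that lumps many microstates together, not at the level of any single precise microstate.

```python
def macro_state(shape_params):
    """Coarse-grain a "shape" (here just a list of numbers standing in
    for a protein conformation) into an abstract state. Many slightly
    perturbed microstates map to the same macrostate."""
    return "folded" if sum(shape_params) > 1.0 else "unfolded"

def gene_expressed(shape_params):
    # The reliable transition is defined over the macrostate,
    # not over any one exact microstate.
    return macro_state(shape_params) == "folded"

# Different microstates, same macrostate, same outcome.
print(gene_expressed([0.5, 0.6]))           # True
print(gene_expressed([0.51, 0.58, 0.02]))   # True
print(gene_expressed([0.1, 0.2]))           # False
```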
8. “Is there anything purely inside of AlexNet that can tell us that 1 in the output node means cat and that 0 means not cat?” I’m not sure exactly what you’re gesturing at with this, but my guess is that there is. I’m thinking of interpretability tools that show that cat features activate when the network is shown a picture of a cat, and that these states reliably produce a “1” rather than a “0.” But maybe you’re talking about something else or have more uncertainty about it than I do?
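The kind of check I have in mind looks roughly like this (a hypothetical sketch; `model.cat_feature`, `model.output`, and `images` are names I’m making up, not AlexNet’s actual interface): record an internal “cat feature” and the output node over a batch of inputs and see how tightly they co-vary.

```python
import numpy as np

def feature_output_correlation(model, images):
    """Correlate an internal "cat feature" activation with the output
    node across a set of inputs. A correlation near 1 is the sense in
    which the internal state "reliably produces" the 1 rather than the 0."""
    feats = np.array([model.cat_feature(img) for img in images])
    outs = np.array([model.output(img) for img in images])
    return np.corrcoef(feats, outs)[0, 1]
```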
9. I agree that thinking is extremely wild! ’Nough said.
Thanks so much for this comment (and sorry for taking ~1 year to respond!!). I really liked everything you said.
For 1 and 2, I agree with everything and don’t have anything to add.
3. I agree that there is something about the input/output mapping that is meaningful, but it is not everything. It would be great to have a full theory of exactly what the difference is, and of what distinguishes structure that counts as interesting internal computation (not a great descriptor of what I mean, but I can’t think of anything better right now) from mere input/output computation.
4. I also think a great goal would be generalizing and formalizing what an “observer” of a computation is. I have a few ideas, but they are pretty half-baked right now.
5. That is an interesting point. I think it’s fair. I do want to be careful to make sure that any “disagreements” are substantial and not just semantic squabbling here. I like your distinction between representation work and computational work. The idea of using vs. performing a computation is also interesting. At the end of the day I am always left craving some formalism where you could really see the nature of these distinctions.
6. Sounds like a good idea!
7. Agreed on all counts.
8. I was trying to ask whether there is anything that tells us the output node is semantically meaningful without reference to, e.g., the input images of cats, or even knowledge of the input data distribution. Interpretability work, both in artificial neural networks and more traditionally in neuroscience, always uses knowledge of input distributions, or even input identity, to correlate the activity of neurons with the input, and in that way assigns semantics to neural activity (e.g., recently, Othello board states, or in neuroscience, Jennifer Aniston neurons or orientation-tuned neurons). But when I’m sitting down with my eyes closed and just thinking, there’s no homunculus there with access to input distributions on my retina that can correlate some activity pattern to “cat.” So how can the neural states in my brain “represent,” or embody, or whatever word you want to use, the semantic information of “cat” without this process of correlating to some ground-truth data? Where does “cat” come from when there’s no cat there in the activity?!
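The move I’m describing is basically probing/decoding, and in sketch form (hypothetical names; scikit-learn used just as a stand-in) the dependence on outside ground truth is explicit: the semantics enters entirely through the labels, which come from knowing the inputs, not from the activity alone.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_semantic_probe(activations, labels):
    """`activations`: recorded neural states (artificial or biological).
    `labels`: ground-truth facts about the *inputs* (cat / not-cat,
    Othello board squares, grating orientations, ...). Without `labels`,
    nothing here says what the activity is "about"."""
    X = np.asarray(activations)
    y = np.asarray(labels)
    return LogisticRegression().fit(X, y)

# probe = fit_semantic_probe(recorded_activity, input_labels)
# probe.predict(new_activity)  # -> "cat" / "not cat"
```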
9. SO WILD