Sentience is the capacity to experience anything – the fact that when your brain is thinking or processing visual information, you actually feel what it’s like to think and to see.
The following thought experiment demonstrates, step by step, why conscious experience must be a product of functional patterns, not any specific physical material or structure.
The Inevitable Self-Report of Sentience
Consider a person who says, ‘Wow, it feels so strange to see and to think,’ speaking of their own conscious experience.
Every human action results from precise patterns of neuronal firing, cascading through the brain until they reach the motor neurons that drive the person's actions – in this case, producing speech that describes their sentience. Brains are bound by the laws of physics: if the same neurons fire with the same timings, the same outward behavior will result with absolute certainty.
It cannot be perpetual coincidence that our subjective experience always lines up with what our brain and body are doing. There is some reason for the synchronization.
When someone talks about their sentience, the state of being sentient must influence their behavior – otherwise, it would be physically impossible for them to speak about their sentience. The mere fact that they can describe their experience means that sentience has played a causal role in their behavior.
Now, replace one neuron with a functionally identical unit, one that takes the same inputs and fires the same way. The behavior of the person remains the same, and they still say, “Wow, it feels so strange to see and to think.” This remains true if you replace more neurons – even the entire brain – with functionally equivalent units. The person will still say the same thing.
Taking it further – replace the entire brain with a single piece of hardware that takes in the same sensory input signals and produces the same outputs to the motor neurons, using software equivalents for each neuron. The person will still say, “Wow, it feels so strange to see and to think.”
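As a rough sketch of what "functionally identical" means here (illustrative Python; the class names and the toy threshold rule are assumptions for the sake of the example, not a model of real neurons), the point is that anything downstream of the unit only sees whether and when it fires, not what it is made of:

```python
from typing import Protocol, Sequence


class NeuronLike(Protocol):
    """Anything that maps input signals at time t to a fire/no-fire decision."""
    def fire(self, inputs: Sequence[float], t: float) -> bool: ...


class BiologicalNeuron:
    def __init__(self, threshold: float):
        self.threshold = threshold

    def fire(self, inputs: Sequence[float], t: float) -> bool:
        # Stand-in for the messy electrochemical details.
        return sum(inputs) >= self.threshold


class FunctionalReplacement:
    """A different substrate implementing the same input/output rule."""
    def __init__(self, threshold: float):
        self.threshold = threshold

    def fire(self, inputs: Sequence[float], t: float) -> bool:
        return sum(inputs) >= self.threshold


def downstream_behavior(neuron: NeuronLike, stimulus: Sequence[float], t: float) -> bool:
    # Motor output depends only on whether/when the neuron fires,
    # not on what the neuron is made of.
    return neuron.fire(stimulus, t)


stimulus = [0.4, 0.3, 0.5]
old = downstream_behavior(BiologicalNeuron(1.0), stimulus, t=0.0)
new = downstream_behavior(FunctionalReplacement(1.0), stimulus, t=0.0)
assert old == new  # same inputs, same firing, same downstream behavior
```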
Since discussing one’s sentience proves that sentience influences behavior, and we have ruled out anything other than firing patterns as capable of driving behavior, it follows that sentience emerges from these firing patterns and their functional relationships. Sentience doesn’t require biology, quantum magic, or any particular physical substrate beyond something capable of modeling these patterns. Sentience emerges from the patterns themselves, not from what creates them.
It doesn’t line up, for me at least. What it feels like is not clearly the same thing as others understand my communication of it to be. Nor the reverse—it’s unclear that how I interpret their reports tracks very well with their actual perception of their experiences. And there are orders of magnitude more detail going on in my body (and even just in my brain) than I perceive, let alone that I communicate.
Until you operationally define “sentience” (that is, how you detect and measure it in the face of potential errors and lies in reported experiences), you should probably taboo it. Circular arguments of the form “something is discussed, therefore that thing exists” are pretty weak, and don’t show anything important about that something.
I will revise the post when I get a chance because this is a common interpretation of what I said, which wasn’t my intent. My assertion isn’t “if someone or something claims sentience, it must definitely actually be sentient”. Instead we are meant to start with the assumption that the person at the start of the experiment is definitely sentient, and definitely being honest about it. Then the chain of logic starts from that baseline.
There are no sentient details going on that you wouldn’t perceive.
It doesn’t matter whether you actually communicate something; the important part is that you are capable of communicating it, which means that it changes your input/output pattern (if it didn’t, you wouldn’t be capable of communicating it even in principle).
This isn’t the argument in the OP (even though, when reading quickly, I can see how someone could get that impression).
I think we’re spinning on an undefined term. I’d bet there are LOTS of details that affect my perception in subtle and aggregate ways which I don’t consciously identify. But I have no clue which perceived or unperceived details add up to my conception of sentience, and even less do I understand yours.
You’re equivocating between perceiving a collection of details and consciously identifying every separate detail.
If I show you a grid of 100 pixels, then (barring imperfect eyesight) you will consciously perceive all 100 of them. But you will not consciously identify every individual pixel unless your attention is aimed at each pixel in a for loop (which would take longer than consciously perceiving the entire grid at once).
There are lots of details that affect your perception that you don’t consciously identify. But there is no detail that affects your perception that wouldn’t be contained in your consciousness (otherwise it, by definition, couldn’t affect your perception).
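As a loose computational analogy only (a Python sketch; it is not meant to model consciousness), the distinction drawn above is between a judgement that depends on all the pixels in aggregate and a process that makes each pixel a separate, explicit step:

```python
import random

# A 10x10 grid of black (0) and white (1) pixels.
grid = [[random.randint(0, 1) for _ in range(10)] for _ in range(10)]

# "Perceiving the grid at once": a single aggregate judgement over all pixels.
white_aggregate = sum(sum(row) for row in grid)

# "Attending to each pixel in a for loop": identifying every pixel one at a time.
white_serial = 0
for row in grid:
    for pixel in row:
        white_serial += pixel  # each pixel individually inspected

# Both routes depend on every one of the 100 pixels...
assert white_aggregate == white_serial
# ...but only the second treats each pixel as its own explicit step.
```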
This is a classic thought experiment, perhaps the central argument for functionalism. If you’ve rederived it, congratulations!
I think probably most LWers assume this to be true.
Thank you kindly. I had heard about a general neuron-replacement thought experiment before as sort of an open question. What I was hoping to add here is the specific scenario of this experiment performed on someone who begins it as definitively sentient and is speaking of their own sentience. This fills in a few holes and answers a few questions in a way that I think leads us to a conclusion rather than a question.
I think this is a very simple yet powerful articulation of anti-anthropocentrism, which I fully support.
I am particularly on board with “Sentience doesn’t require [...] any particular physical substrate beyond something capable of modeling these patterns.”
To correctly characterize the proof as 100% logical, I think there is still room for explicitly stating rigorous definitions of the concepts involved, such as sentience.
Thank you for sharing!
Proof is a really strong word and (in my opinion) inappropriate in this context. This is about to become an extremely important question and we should be careful to avoid overconfidence. I’ve personally found this comment chain to be an enlightening discussion on the complexity of this issue (but of course this is something that has been discussed endlessly elsewhere).
As a separate issue, let’s say I write down the rule set for an automaton that will slowly grow and eventually emulate every finite string of finite grids of black and white pixels. This is not hard to do. Does it require a substrate to become conscious or is the rule set itself conscious? What if I actually run it in a corner of the universe that slowly uses surrounding matter to allow its output to grow larger and larger?
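For concreteness, here is one way such an enumeration could look in Python (a simplified sketch, not the actual rule set being described: it enumerates every finite black-and-white grid, and extending it to every finite string of such grids is a similar dovetailing exercise):

```python
from itertools import count, product


def all_finite_grids():
    """Yield every finite grid of black (0) and white (1) pixels, eventually."""
    for n in count(1):                       # dovetail by the grid's larger dimension
        for height in range(1, n + 1):
            for width in range(1, n + 1):
                if max(height, width) != n:  # visit each (height, width) exactly once
                    continue
                for cells in product((0, 1), repeat=height * width):
                    yield [list(cells[r * width:(r + 1) * width])
                           for r in range(height)]


# Usage: take the first few grids from the (infinite) enumeration.
gen = all_finite_grids()
for _ in range(5):
    print(next(gen))
```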
There are certainly a lot of specific open questions, such as: what precisely about the firing patterns is necessary for the emergence of sentience?
Note that in the setting of the second paragraph I wrote, every “firing pattern” will eventually emerge. You may have misunderstood my comment as taking the basic premise of your post as true and quibbling about the details, but I am skeptical about even the fundamental idea.
Oh, I understood you weren’t agreeing. I was just responding that I don’t know what aspects of ‘firing patterns’ specifically cause sentience to emerge, or how it would or wouldn’t apply to your alternative scenarios.
I see. There’s a really nice post here (maybe several) that touches on that idea in a manner similar to the Ship of Theseus, but I can’t find it. The basic idea was: if we take for granted that mind uploads are fully conscious, but then start updating the architecture to optimize for various things, is there a point at which we are no longer “sentient”?
Yeah, it’s all very weird stuff. Also, what is required for continuity, i.e., staying the same you, and not just someone who has all your memories and thinks they’re still you?
As other people have said, this is a known argument; specifically, it’s in The Generalized Anti-Zombie Principle in the Physicalism 201 series, from the very early days of LessWrong.
I think this proof relies on three assumptions. The first (which you address in the post) is that consciousness must happen within physics. (The opposing view would be substance dualism where consciousness causally acts on physics from the outside.) The second (which you also address in the post) is that consciousness and reports about consciousness aren’t aligned by chance. (The opposing view would be epiphenomenalism, which is also what Eliezer trashes extensively in this sequence.)
The third assumption is one you don’t talk about, which is that switching the substrate without affecting behavior is possible. This assumption does not hold for physical processes in general; if you change the substrate of a plank of wood that’s thrown into a fire, you will get a different process. So the assumption is that computation in the brain is substrate-independent, or to be more precise, that there exists a level of abstraction in which you can describe the brain with the property that the elements in this abstraction can be implemented by different substrates. This is a mouthful, but essentially the level of abstraction would be the connectome—so the idea is that you can describe the brain by treating each neuron as a little black box about which you just know its input/output behavior, and then describe the interactions between those little black boxes. Then, assuming you can implement the input/output behavior of your black boxes with a different substrate (i.e., an artificial neuron), you can change the substrate of the brain while leaving its behavior intact (because both the old and the new brain “implement” the abstract description, which by assumption captures the brain’s behavior).
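For concreteness, here is a minimal sketch of that abstract description in Python (the wiring, the toy threshold rule, and all names are illustrative assumptions, not a claim about real connectomes): each node is a black box known only by its input/output rule, and swapping one box's substrate leaves the abstract description, and hence the behavior, unchanged.

```python
from typing import Callable, Dict, List

# The abstract description: each node is a black box defined only by its
# input/output rule, plus a wiring diagram saying who feeds whom.
IORule = Callable[[List[float]], float]

def biological_rule(inputs: List[float]) -> float:
    # Stand-in for whatever the real neuron does, viewed only at its terminals.
    return 1.0 if sum(inputs) > 1.0 else 0.0

def artificial_rule(inputs: List[float]) -> float:
    # A different substrate implementing the same input/output map.
    return 1.0 if sum(inputs) > 1.0 else 0.0

def run_connectome(rules: Dict[str, IORule],
                   wiring: Dict[str, List[str]],
                   inputs: Dict[str, float],
                   order: List[str]) -> Dict[str, float]:
    """Evaluate a (feed-forward, for simplicity) graph of black boxes."""
    values = dict(inputs)
    for node in order:
        upstream = [values[src] for src in wiring.get(node, [])]
        values[node] = rules[node](upstream)
    return values

wiring = {"n1": ["in_a", "in_b"], "n2": ["n1", "in_b"], "motor": ["n2"]}
inputs = {"in_a": 0.7, "in_b": 0.6}
order = ["n1", "n2", "motor"]

old_brain = {"n1": biological_rule, "n2": biological_rule, "motor": biological_rule}
new_brain = {"n1": artificial_rule, "n2": biological_rule, "motor": biological_rule}

# Swapping n1's substrate leaves the abstract description unchanged,
# so the overall behavior is unchanged too.
before = run_connectome(old_brain, wiring, inputs, order)
after = run_connectome(new_brain, wiring, inputs, order)
assert before == after
```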
So essentially you need the neuron doctrine to be true. (Or at least, the neuron doctrine is sufficient for the argument to work.)
If you read the essay, Eliezer does mostly acknowledge this assumption. E.g., he talks about a neuron’s local behavior, implying that the function of a neuron in the brain is entirely about its local behavior (if not, this at least makes the abstract description more difficult; it may or may not still be possible).
He also mentions the quantum gravity bit with Penrose, which is one example of how the assumption could be false, although probably a pretty stupid one. Something more concerning may be ephaptic coupling, which involves non-local effects of neurons. Are those copied by artificial neurons as well? If you want to improve upon the argument, you could discuss the validity of these assumptions, i.e., why/how we are certain that the brain can be fully described as a graph of modular units.
(Also, note that the argument as you phrase it only proves that you can have a brain-shaped thing with different neurons that’s still conscious, which is slightly different from the claim that a simulation of a human on a computer would be conscious. So if the shape of the brain plays a role for computation (as e.g. this paper claims), then your argument still goes through but the step to simulations becomes problematic.)
This is guaranteed, because the universe (and any of its subsets) is computable (that means a classical computer can run software that acts the same way).
Also, here’s a sufficient reason why this isn’t true. As far as I know, Integrated Information Theory is currently the only highly formalized theory of consciousness in the literature. It’s also a functionalist theory (at least according to my operationalization of the term). If you apply the formalism of IIT, it says that simulations on classical computers are minimally conscious at best, regardless of what software is run.
Now I’m not saying IIT is correct; in fact, my actual opinion on IIT is “100% wrong, no relation how consciousness actually works”. But nonetheless, if the only formalized proposal for consciousness doesn’t have the property that simulations preserve consciousness, then clearly the property is not guaranteed.
So why does IIT not have this property? Well because IIT analyzes the information flow/computational steps of a system—abstracting away the physical details, which is why I’m calling it functionalist—and a simulation of a system performs completely different computational steps than the original system. I mean it’s the same thing I said in my other reply; a simulation does not do the same thing as the thing it’s simulating, it only arrives at the same outputs, so any theory looking at computational steps will evaluate them differently. They’re two different algorithms/computations/programs, which is the level of abstraction that is generally believed to matter on LW. Idk how else to put this.
Indeed. I think we should stop there, though. The fact that it’s so formalized is part of the absurdity of IIT. There are a bunch of equations that are completely meaningless and not based on anything empirical whatsoever.
The goal of my effort with this proof, regardless of whether there is a flaw in the logic somewhere, is this: if we can take even a single inch forward based on logical or axiomatic proofs, and thereby begin to narrow down our sea of endless speculative hypotheses, then those inches matter.
Just because we have no way of solving the hard problem yet, or of formulating a complete theory of consciousness, doesn’t mean we can’t make at least a couple of tiny inferences that we can know with a high degree of certainty. I think it’s a disservice to this field that most high-profile efforts state a complete framework of the entirety of consciousness as theory, when it’s entirely possible to start moving forward one tiny step at a time without relying on speculation.
I’m totally on board with everything you said here. But I didn’t bring up IIT as a rebuttal to anything you said in your post. In fact, your argument about swapping out neurons specifically avoids the problem I’m talking about in this above comment. The formalism of IIT actually agrees with you that swapping out neurons in a brain doesn’t change consciousness (given the assumptions I’ve mentioned in the other comment)!
I’ve brought up IIT as a response to a specific claim, which I’m just going to state again since I feel like I keep getting misunderstood as making more vague/general claims than I’m in fact making. The claim (which I’ve seen made on LW before) is that we know for a fact that a simulation of a human brain on a digital computer is conscious because of the Turing thesis. Or at least, that we know this for a fact if we assume some very basic things about the universe, like that the laws of physics are complete and functionalism is true. So like, the claim is that every theory of consciousness that agrees with these two premises also states that a simulation of a human brain has the same consciousness as that human brain.
Well, IIT is a theory that agrees with both of these premises—it’s a functionalist proposal that doesn’t postulate any violation to the laws of physics—and it says that simulations of human brains have completely different consciousness than human brains themselves. Therefore, the above claim doesn’t seem true. This is my point; no more, no less. If there is a counter-example to an implication A⟹B, then the implication isn’t true; it doesn’t matter if the counter-example is stupid.
Again, does not apply to your post because you talked about swapping neurons in a brain, which is different—IIT agrees with your argument but disagrees with green_leaf’s argument.
No. Computability shows that you can have a classical computer that has the same input/output behavior, not that you can have a classical computer that acts the same way. Input/Output behavior is generally not considered to be enough to guarantee same consciousness, so this doesn’t give you what you need. Without arguing about the internal workings of the brain, a simulation of a brain is just a different physical process doing different computational steps that arrives at the same result. A GLUT (giant look-up table) is also a different physical process doing different computational steps that arrives at the same result, and Eliezer himself argued that GLUT isn’t conscious.
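A toy illustration of that distinction in Python (illustrative only; the functions below obviously aren't brains): the two implementations have identical input/output behaviour over their shared domain, yet one performs arithmetic while the other performs a single table lookup, i.e., they are different computations doing different steps.

```python
def computed(n: int) -> int:
    """Arrives at the answer by doing arithmetic steps."""
    total = 0
    for i in range(n + 1):
        total += i
    return total

# A GLUT-style version: the same input/output map over 0..1000,
# but the only "computational step" at runtime is a lookup.
LOOKUP = {n: n * (n + 1) // 2 for n in range(1001)}

def glut(n: int) -> int:
    return LOOKUP[n]

# Identical input/output behaviour on the shared domain...
assert all(computed(n) == glut(n) for n in range(1001))
# ...but very different internal processes produced each answer.
```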
The “let’s swap neurons in the brain with artificial neurons” is actually a much better argument than “let’s build a simulation of the human brain on a different physical system” for this exact reason, and I don’t think it’s a coincidence that Eliezer used the former argument in his post.
That’s what I mean (I’m talking about the input/output behavior of individual neurons).
It should be, because it is, in fact, enough. (However, neither the post, nor my comment require that.)
Yes, and that’s false (but since that’s not the argument in the OP, I don’t think I should get sidetracked).
That’s false. If we assume for a second that IIT really is the only formalized theory of consciousness, it doesn’t follow that the property is not, in fact, guaranteed. It could also be that IIT is wrong and that in actual reality, the property is, in fact, guaranteed.
Ah, I see. Nvm then. (I misunderstood the previous comment to apply to the entire brain—idk why, it was pretty clear that you were talking about a single neuron. My bad.)
Excluding coincidence doesn’t prove that an entity’s reports of consciousness are directly caused by its own consciousness. Robo-Chalmers will claim to be conscious because Chalmers does. It might actually be conscious, as an additional reason, or it might not. The fact that the claim is made does not distinguish the two cases. Yudkowsky makes much of the fact that Robo-Chalmers’ claim would be caused indirectly by consciousness (Chalmers has to be conscious in order to make a computational duplicate of his consciousness), but at best that refutes the possibility of a zombie world, where entities claim to be conscious although consciousness has never existed. Robo-Chalmers would still be possible in this world for reasons Yudkowsky accepts. So there is one possible kind of zombie, even given physicalism, so the Generalised Anti-Zombie Principle is false.
(Note that I am talking about computational zombies, or c-zombies, not p-zombies.)
Computationalism isn’t a direct consequence of physicalism. Physicalism has it that an exact atom-by-atom duplicate of a person will be a person and not a zombie, because there is no nonphysical element to go missing. That’s the argument against p-zombies. But if it actually takes an atom-by-atom duplication to achieve human functioning, then the computational theory of mind will be false, because the CTM implies that the same algorithm running on different hardware will be sufficient. Physicalism doesn’t imply computationalism, and arguments against p-zombies don’t imply the nonexistence of c-zombies (duplicates that are identical computationally, but not physically).
@Richard_Kennaway
That sounds like a Chalmers paper. https://consc.net/papers/qualia.html
Just read his post. Interesting to see someone have the same train of thought starting out, but then choose different aspects to focus on.
Any non-local behaviour by the neurons shouldn’t matter if the firing patterns are replicated. I think focusing on the complexity required by the replacement neurons is missing the bigger picture. Unless the contention is that the signals that arrive at the motor neurons have been drastically affected by some other processes, enough so that they overrule some long-held understanding of how neurons operate, they are minor details.
”The third assumption is one you don’t talk about, which is that switching the substrate without affecting behavior is possible. This assumption does not hold for physical processes in general; if you change the substrate of a plank of wood that’s thrown into a fire, you will get a different process. So the assumption is that computation in the brain is substrate-independent”
Well, this isn’t the assumption; it’s the conclusion (right or wrong). From what I can tell, the relevant substrate is the firing patterns themselves.
I haven’t delved too deeply into Penrose’s stuff for quite some time. What I read before doesn’t seem to explain how quantum effects are going to influence action potential propagation on a behaviour-altering scale. It seems like throwing a few teaspoons of water at a tidal wave to try to alter its course.
You say “Now, replace one neuron with a functionally identical unit, one that takes the same inputs and fires the same way” and then go from there. This step is where you make the third assumption, which you don’t justify.
Agreed, but I didn’t say that complexity itself is a problem; I said something much more specific.
I don’t see how it’s an assumption. Are we considering that the brain might not obey the laws of physics?
I mentioned complexity because you brought up a specific aspect of what determines the firing patterns, and my response is just to say ’sure, our replacement neurons will take in additional factors as part of their input and output’
Basically, it seemed that part of your argument is that the neuron black box is unimplementable. I just don’t buy the idea that neurons operate so vastly differently from the rest of reality that their behaviour can’t be replicated.
If you consider the full set of causal effects of a physical object, then the only way to replicate those exactly is with the same object. This is just generally true; if you change anything about an object, you change its particle structure, and that comes with measurable changes. An artificial neuron is not going to have exactly 100% the same behavior as a biological neuron.
This is why I made the comment about the plank of wood: it’s just to make the point that, in general, across all physical processes, substrate is causally relevant. This is a direct implication of the laws of physics: every particle has a continuous effect that depends on its precise location, and any two objects have particles in different places, so there is no such thing as a different object that does exactly the same thing.
So any step like “we’re going to take out this thing and then replace it with a different thing that has the same behavior” makes assumptions about the structure of the process. Since the behavior isn’t literally the same, you’re assuming that the system as a whole is such that the differences that do exist “fizzle out”. E.g., you might assume that it’s enough to replicate the changes to the flow of current, whereas the fact the new neurons have a different mass will fizzle out immediately and not meaningfully affect the process. (If you read my initial post, this is what I was getting at with the abstraction description thing; I was not just making a vague appeal to complexity.)
Absolutely not; I’m not saying that any of these assumptions are wrong or even hard to justify. I’m just pointing out that this is, in fact, an assumption. Maybe this is so pedantic that it’s not worth mentioning? But I think if you’re going to use the word proof, you should get even minor assumptions right. And I do think you can genuinely prove things; I’m not in the “proof is too strong a word for anything like this” camp. So by analogy, if you miss a step in a mathematical proof, you’d get points deducted even if the thing you’re proving is still true, and even if the step isn’t difficult to get right. I really just want people to be more precise when they discuss this topic.
I see what you’re saying, but I disagree with substrate’s relevance in this specific scenario because:
”An artificial neuron is not going to have exactly 100% the same behavior as a biological neuron.”
It just needs to fire at the same time; none of the internal behaviour needs to be replicated or simulated.
So—indulging intentionally in an assumption this time—I do think those tiny differences fizzle out. I think it’s insignificant noise to the strong signal. What matters most in neuron firing is action potentials. This isn’t some super delicate process that will succumb to the whims of minute quantum effects and picosecond differences.
I assume that, much like a plane doesn’t require feathers to fly, sentience doesn’t require this super exacting molecular detail, especially given how consistently coherent our sentience feels to most people, despite how damn messy biology is. People have damaged brains, split brains, brains whose chemical balance is completely thrown off by afflictions or powerful hallucinogens, and yet through it all we still have sentience. It seems wildly unlikely that it’s like ‘ah! you’re close to creating synthetic sentience, but you’re missing the serotonin, and some quantum entanglement’.
I know you weren’t arguing for that stance; I’m just stating it as a side note.
Nice; I think we’re on the same page now. And fwiw, I agree (except that I think you need just a little more than just “fire at the same time”). But yes, if the artificial neurons affect the electromagnetic field in the same way—so not only fire at the same time, but with precisely the same strength, and also have the same level of charge when they’re not firing—then this should preserve both communication via synaptic connections and gap junctions, as well as any potential non-local ephaptic coupling or brain wave shenanigans, and therefore, the change to the overall behavior of the brain will be so minimal that it shouldn’t affect its consciousness. (And note that concerns the brain’s entire behavior, i.e., the algorithm it’s running, not just its input/output map.)
If you want to work more on this topic, I would highly recommend trying to write a proof for why simulations of humans on digital computers must also be conscious—which, as I said in the other thread, I think is harder than the proof you’ve given here. Like, try to figure out exactly what assumptions you do and do not require—both assumptions about how consciousness works and how the brain works—and try to be as formal/exact as possible. I predict that actually trying to do this will lead to genuine insights at unexpected places. No one has ever attempted this on LW (or at least there were no attempts that are any good),[1] so this would be a genuinely novel post.
I’m claiming this based on having read every post with the consciousness tag—so I guess it’s possible that someone has written something like this and didn’t tag it, and I’ve just never seen it.
The argument is circular — this is the very thing you are claiming to prove.
Anyway, I expect the argument is familiar to people on LessWrong already. Somewhere on the web (that I couldn’t find) there’s a presentation by Dennett a long time ago in which he dramatises the same argument.
The part you’re quoting just says that the resulting outward behaviour will be preserved, which is a baseline fact of deterministic physics. What I’m trying to prove is that sentience (partially supported by that fact) is fully emergent from the neuron firing patterns.