[ I’m fascinated by intuitions around consciousness, identity, and timing. This is an exploration, not a disagreement. ]
would expect actual (human-equivalent) Boltzmann brains to have the exact same kind of consciousness as ordinary humans, just typically not for very long.
Hmm. In what ways does it matter that it wouldn’t be for very long? Presuming the memories are the same, and so are the in-progress sensory input and cognition (including anticipation of future sensory input, even though that anticipation is wrong in one case), is there anything distinguishable at all?
There’s presumably a minimum time slice to be called “experience” (a microsecond is just a frozen lump of fatty tissue; a minute is clearly human experience; somewhere in between, it “counts” as conscious experience). But as long as that’s met, I really don’t see a difference.
It’s not globally discrete, though, is it? Any individual neuron fires in a discrete way, but IIUC those firings aren’t coordinated across the brain into ticks. That seems like a significant difference.
Hmm. What makes it significant? I mean, they’re not globally synchronized, but that could just mean the universe’s quantum ‘tick’ is small enough that there are offsets and variable tick requirements for each neuron. This seems analogous with large model processing, where the activations and calculations happen over time, each with multiple processor cycles and different timeslices.
Not that I see! I would expect it to be fully indistinguishable until incompatible sensory input eventually reaches the brain (if it doesn’t wink out first). So far it seems to me like our intuitions around that are the same.
What makes it significant?
I think at least in terms of my own intuitions, it’s that there’s an unambiguous start and stop to each tick of the perceive-and-think-and-act cycle. I don’t think that’s true for human processing, although I’m certainly open to my mental model being wrong.
Going back to your original reply, you said ‘I think it’s really tricky to think that there are fundamental differences based on duration or speed of experience’, and that’s definitely not what I’m trying to point to. I think you’re calling out some fuzziness in the distinction between started/stopped human cognition and started/stopped LLM cognition, and I recognize that’s there. I do think that if you could perfectly freeze & restart human cognition, that would be more similar, so maybe it’s a difference in practice more than a difference in principle.
But it does still seem to me that the fully discrete start-to-stop cycle (including the environment only changing in discrete ticks which are coordinated with that cycle) is part of what makes LLMs more Boltzmann-brainy to me. Paired with the lack of internal memory, it means that you could give an LLM one context for this forward pass, and a totally different context for the next forward pass, and that wouldn’t be noticeable to the LLM, whereas it very much would be for humans (caveat: I’m unsure what happens to the residual stream between forward passes, whether it’s reset for each pass or carried through to the next pass; if the latter, I think that might mean that switching context would be in some sense noticeable to the LLM [EDIT: in typical current architectures it’s fully reset for each pass, other than KV caching, which shouldn’t matter for behavior or (hypothetical) subjective experience]).
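To make the “wouldn’t be noticeable” part concrete, here’s a toy sketch in plain Python (nothing below is a real model or library API; the names and numbers are made up): each forward pass is just a function of whatever tokens it’s handed, so nothing inside the model ties one call to the call before it.

```python
# Toy sketch of statelessness between forward passes. `toy_model` stands in
# for a transformer forward pass: a pure function of the tokens it is given.

def toy_model(context_tokens: list[int]) -> int:
    # Stand-in for recomputing every activation (residual stream included)
    # from the given tokens; the result depends only on this argument,
    # never on any earlier call.
    return sum(context_tokens) % 50_257  # made-up "vocab size"

def forward_pass(context_tokens: list[int]) -> int:
    # Nothing persists inside the model between calls. A KV cache, where
    # used, is held by the caller purely as a speed optimization and
    # reproduces the same numbers.
    return toy_model(context_tokens)

context_a = [101, 7592, 2088]   # one conversation
context_b = [42, 3, 999]        # a completely unrelated one

print(forward_pass(context_a))  # "pass 1"
print(forward_pass(context_b))  # "pass 2": the context switch leaves no
                                # trace, since no internal state links the calls
```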
This seems analogous with large model processing, where the activations and calculations happen over time, each with multiple processor cycles and different timeslices.
Can you explain that a bit? I think of current-LLM forward passes as necessarily having to happen sequentially (during normal autoregressive operation), since the current forward pass’s output becomes part of the next forward pass’s input. Am I oversimplifying?
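Roughly, the picture in my head is this toy loop (again, made-up names and a made-up stand-in model, not any real API): each pass’s output token is appended to the context before the next pass can run.

```python
# Toy autoregressive loop: pass k+1 cannot start until pass k finishes,
# because its input includes the token that pass k produced.

def toy_model(context_tokens: list[int]) -> int:
    # Stand-in for one full forward pass over the whole context.
    return (sum(context_tokens) * 31) % 1_000

def generate(prompt_tokens: list[int], n_steps: int) -> list[int]:
    tokens = list(prompt_tokens)
    for _ in range(n_steps):
        next_tok = toy_model(tokens)  # depends on everything emitted so far
        tokens.append(next_tok)       # ...and becomes input to the next pass
    return tokens

print(generate([1, 2, 3], n_steps=5))
```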
PS -- Absolutely, I’m right there with you!