The experiments that have been tried in humans have been extremely conservative, aiming to fix problems in the most well-understood but least-relevant-to-intelligence areas of the brain (sensory input, motor output). [....] This is not evidence that the tech itself is actually this limited.
Your characterization of the current state of research matches my impressions (though it’s good to hear from someone who knows more). My reasons for thinking BCIs are weaksauce have never been about that, though. The reasons are that:
I don’t see any compelling case for anything you can do on a computer which, when you hook it up to a human brain, makes the human brain very substantially better at solving philosophical problems. I can think of lots of cool things you can do with a good BCI, and I’m sure you and others can think of lots of other cool things, but that’s not answering the question. Do you see a compelling case? What is it? (To be more precise, I do see compelling cases for the few areas I mentioned: prosthetic intrabrain connectivity and networking humans. But those both seem quite difficult technically, and would plausibly be capped in their success by connection bandwidth, which is technically difficult to increase.)
It doesn’t seem like we understand nearly as much about intelligence as evolution does (in a weak sense of “understand” that includes stuff encoded in the human genome cloud). So stuff that we program into a computer will be qualitatively much less helpful for real human thinking than just copying evolution’s work. (If you can’t see that LLMs don’t think, I don’t expect to make progress debating that here.)
I think that cortical microcolumns come fairly close to acting in a well-stereotyped way that we can simulate pretty accurately on a computer. And I don’t think their precise behavior is all that critical. I think you could actually get 80-90% of the effective capacity by simply having a small (10k? 100k? parameter) transformer stand in for each simulated cortical column, rather than a less compute-efficient but more biologically accurate simulation.
The tricky part is just setting up the rules for intercolumn connection (excitatory and inhibitory) properly. I’ve been making progress on this in my research, as I’ve mentioned to you in the past.
Interregional connections (e.g. parietal lobe to prefrontal lobe, or V1 to V2) are far fewer in number and consistent enough between different people that they’ve all been pretty well described by modern neuroscience. The full weighted directed graph is known, along with a good estimate of the between-individual variability in the weights.
It’s not the case that the whole brain is involved in each specific ability that a person has. The human brain has a lot of functional localization. For a specific skill, like math or language, there is some distributed contribution from various areas but the bulk of the computation for that skill is done by a very specific area. This means that if you want to increase someone’s math skill, you probably need to just increase that specific known 5% or so of their brain most relevant to math skill by 10x. This is a lot easier than needing to 10x the entire brain.
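As a rough back-of-the-envelope sketch of that last point (not a claim about how any real augmentation would work): if only a fraction of the brain is scaled up, the total grows much less than the scale factor. The function name `relative_total_size` is made up for illustration, and the sketch assumes "size" is a uniform stand-in for neurons or compute, which the comment above does not state.

```python
def relative_total_size(fraction_scaled: float, scale_factor: float) -> float:
    """Total size relative to baseline when only part of the brain is scaled.

    Assumption (not from the thread): "size" is a uniform proxy for neurons
    or compute. The untouched part keeps its share (1 - fraction_scaled);
    the targeted part grows to fraction_scaled * scale_factor.
    """
    if not 0.0 <= fraction_scaled <= 1.0:
        raise ValueError("fraction_scaled must be between 0 and 1")
    return (1.0 - fraction_scaled) + fraction_scaled * scale_factor


# Figures from the comment above: a ~5% region, scaled 10x.
targeted = relative_total_size(0.05, 10.0)
# Comparison case: scaling the entire brain 10x.
whole_brain = relative_total_size(1.0, 10.0)

print(f"targeted region only: {targeted:.2f}x baseline")
print(f"whole brain:          {whole_brain:.2f}x baseline")
```

Under these assumptions the targeted case comes to about 1.45x the baseline size, versus 10x for the whole brain.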
I don’t know enough to evaluate your claims, but more importantly, I can’t even just take your word for everything, because I don’t actually know what you’re saying without asking a whole bunch of follow-up questions. So hopefully we can hash some of this out on the phone.
Sorry that my attempts to communicate technical concepts don’t always go smoothly!
I keep trying to answer your questions about ‘what I think I know and how I think I know it’ with dumps of lists of papers. Not ideal!
But sometimes I’m not sure what else to do, so.… here’s a paper!
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001575
An estimation of the absolute number of axons indicates that human cortical areas are sparsely connected
Burke Q. Rosen, Eric Halgren
Abstract
The tracts between cortical areas are conceived as playing a central role in cortical information processing, but their actual numbers have never been determined in humans. Here, we estimate the absolute number of axons linking cortical areas from a whole-cortex diffusion MRI (dMRI) connectome, calibrated using the histologically measured callosal fiber density. Median connectivity is estimated as approximately 6,200 axons between cortical areas within hemisphere and approximately 1,300 axons interhemispherically, with axons connecting functionally related areas surprisingly sparse. For example, we estimate that <5% of the axons in the trunk of the arcuate and superior longitudinal fasciculi connect Wernicke’s and Broca’s areas. These results suggest that detailed information is transmitted between cortical areas either via linkage of the dense local connections or via rare, extraordinarily privileged long-range connections.
Wait, are you saying that not only is there quite low long-distance bandwidth, but also relatively low bandwidth between neighboring areas? Numbers would be very helpful.
And if there’s much higher bandwidth between neighboring regions, might there not be a lot more information that’s propagating long-range, but only slowly, through intermediate areas (or would that be too slow or something)?
(Relatedly, how crisply does the neocortex factor into different (specialized) regions? (Like I’d have thought it’s maybe sorta continuous?))
I’m glad you’re curious to learn more!
The cortex factors quite crisply into specialized regions. These regions have different cell types and groupings, so they were first noticed by early microscope users like Cajal.
In a cortical region, neurons are organized first into microcolumns of 80-100 neurons, and then into macrocolumns of many microcolumns.
Each microcolumn works together as a group to calculate a function. Neighboring microcolumns inhibit each other. So each macrocolumn is sort of a mixture of experts.
The question then is how many microcolumns from one region send an output to a different region. For the example of V1 to V2, basically every microcolumn in V1 sends a connection to V2 (and vice versa). This is why the connection percentage is about 1%: 100 neurons per microcolumn, 1 of which has a long-distance axon to V2. The total number of neurons is roughly 10 million, organized into about 100,000 microcolumns.
Areas that are further apart send fewer axons. That doesn’t mean their signal is unimportant, just lower resolution. In that case you’d ask something like “how many microcolumns per macrocolumn send out a long-distance axon from region A to region B?” This might be 1, just a summary report of the macrocolumn. So for roughly 10 million neurons and 100,000 microcolumns organized into around 1000 macrocolumns, you get around 1000 neurons sending axons from region A to region B.
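The arithmetic in this comment can be checked with a few lines. This is only a sketch using the round numbers quoted above; the constant and function names are made up for illustration, and the figure of 100 microcolumns per macrocolumn is derived from the quoted totals (100,000 / 1000) rather than stated directly.

```python
NEURONS_PER_REGION = 10_000_000      # "roughly 10 million" neurons
NEURONS_PER_MICROCOLUMN = 100        # upper end of the 80-100 range
MICROCOLUMNS_PER_MACROCOLUMN = 100   # derived: 100,000 microcolumns / 1000 macrocolumns


def projecting_axons(neurons: int, neurons_per_unit: int, axons_per_unit: int = 1) -> int:
    """Axons sent to another region if each unit contributes axons_per_unit."""
    units = neurons // neurons_per_unit
    return units * axons_per_unit


# Neighboring areas (e.g. V1 to V2): one axon per microcolumn.
near = projecting_axons(NEURONS_PER_REGION, NEURONS_PER_MICROCOLUMN)

# Distant areas: one "summary" axon per macrocolumn.
far = projecting_axons(
    NEURONS_PER_REGION,
    NEURONS_PER_MICROCOLUMN * MICROCOLUMNS_PER_MACROCOLUMN,
)

near_fraction = near / NEURONS_PER_REGION

print(f"near: {near:,} axons ({near_fraction:.0%} of neurons)")
print(f"far:  {far:,} axons")
```

With these inputs the neighboring-area case gives 100,000 axons (1% of the region's neurons) and the distant-area case gives 1,000 axons, matching the figures in the comment.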
More details are in the papers I linked elsewhere in this comment thread.
Thanks!
Yeah, I believe what you say about there not being that many long-distance connections.
I meant that there might be more non-long-distance connections between neighboring areas. (E.g. boundaries of areas are a bit fuzzy iirc, so macrocolumns towards the “edge” of a region are sorta intertwined with macrocolumns of the other side of the “edge”.)
(I thought that when you said V1 to V2 you included those too, but I guess you didn’t?)
Do you think those inter-area non-long-distance connections are relatively unimportant, and if so why?