I really don’t want to spend even more time arguing over my evolution post, so I’ll just copy over our interactions from the previous times you criticized it, since that seems like context readers may appreciate.
In the comment sections of the original post:

Your comment

[very long, but mainly about your “many other animals also transmit information via non-genetic means” objection + some other mechanisms you think might have caused human takeoff]
My response

I don’t think this objection matters for the argument I’m making. All the cross-generational information channels you highlight are at rough saturation, so they’re not able to contribute to the cross-generational accumulation of capabilities-promoting information. Thus, the enormous disparity between the speed of the brain’s within-lifetime learning and that of evolution cannot lead to a multiple-OOM-faster accumulation of capabilities compared to evolution.
When non-genetic cross-generational channels are at saturation, the plot of capabilities-related info versus generation count looks like this:

[figure: “All info” and “Genetic info” lines plotted against generation count]

with non-genetic information channels only giving the “All info” line a ~constant advantage over “Genetic info”. Non-genetic channels might be faster than evolution, but because they’re saturated, they only give each generation a fixed advantage over where it would be with only genetic info. In contrast, once the cultural channel allows for an ever-increasing volume of transmitted information, the vastly faster rate of within-lifetime learning can start contributing to the slope of the “All info” line, and not just its height.
Thus, humanity’s sharp left turn.
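As a minimal toy sketch of this height-versus-slope point (illustrative constants only, not numbers from the post):

```python
# Toy sketch (illustrative constants only): a saturated non-genetic channel adds a
# roughly constant offset to accumulated capabilities-relevant information, while a
# cultural channel whose per-generation capacity keeps growing changes the slope.

GENETIC_RATE = 1.0      # info fixed per generation by selection (arbitrary units)
SATURATED_BONUS = 20.0  # fixed boost from a saturated non-genetic channel
CULTURAL_GROWTH = 2.0   # extra transmissible info unlocked each generation

genetic_total = 0.0
cultural_total = 0.0
cultural_capacity = 0.0

for gen in range(1, 101):
    genetic_total += GENETIC_RATE                      # slope set by evolution alone
    saturated_total = genetic_total + SATURATED_BONUS  # same slope, higher intercept
    cultural_capacity += CULTURAL_GROWTH               # the channel itself keeps widening
    cultural_total += GENETIC_RATE + cultural_capacity # widening channel feeds the slope
    if gen % 25 == 0:
        print(f"gen {gen:3d}  genetic={genetic_total:6.0f}  "
              f"saturated={saturated_total:6.0f}  cultural={cultural_total:8.0f}")
```

The “saturated” line just tracks the “genetic” line with a fixed offset, while the “cultural” line pulls away super-linearly, which is the sense in which faster within-lifetime learning starts contributing to the slope rather than only the height.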
In Twitter comments on Open Philanthropy’s announcement of prize winners:
Your tweet

But what’s the central point, then? Evolution discovered how to avoid the genetic bottleneck myriad times; it also discovered potentially unbounded ways how to transmit arbitrary number of bits, like learning-teaching behaviours; except for humans, nothing foomed. So the updated story would be more like “some amount of non-genetic/cultural accumulation is clearly convergent and is common, but there is apparently some threshold crossed so far only by humans. Once you cross it you unlock a lot of free energy and the process grows explosively”. (& the cause or size of the threshold is unexplained)
(note: this was a reply and part of a slightly longer chain)
My response

Firstly, I disagree with your statement that other species have “potentially unbounded ways how to transmit arbitrary number of bits”. Taken literally, of course there’s no species on earth that can actually transmit an *unlimited* amount of cultural information between generations. However, humans are still a clear and massive outlier in the volume of cultural information we can transmit between generations, which is what allows for our continuously increasing capabilities across time.
Secondly, the main point of my article was not to determine why humans, in particular, are exceptional in this regard. The main point was to connect the rapid increase in human capabilities relative to previous evolution-driven progress rates with the greater optimization power of brains as compared to evolution. Being so much better at transmitting cultural information as compared to other species allowed humans to undergo a “data-driven singularity” relative to evolution. While our individual brains and learning processes might not have changed much between us and ancestral humans, the volume and quality of data available for training future generations did increase massively, since past generations were much better able to distill the results of their lifetime learning into higher-quality data.
This allows for a connection between the factors we’ve identified as important for creating powerful AI systems (data volume, data quality, and effectively applied compute) and the process underlying the human “sharp left turn”. It reframes the mechanisms that drove human progress rates in terms of the quantities and narratives that drive AI progress rates, and allows us to more easily see what implications the latter has for the former.
In particular, this frame suggests that the human “sharp left turn” was driven by the exploitation of a one-time enormous resource inefficiency in the structure of the human, species-level optimization process. And while the current process of AI training is not perfectly efficient, I don’t think it has comparably sized overhangs which can be exploited easily. If true, this would mean human evolutionary history provides little evidence for sudden increases in AI capabilities.
The above is also consistent with rapid civilizational progress depending on many additional factors: it relies on resource overhang being a *necessary* factor, but does not require it to be alone *sufficient* to accelerate human progress. There are doubtless many other factors that are relevant, such as a historical environment favorable to progress, a learning process that sufficiently pays attention to other members of one’s species, not being a purely aquatic species, and so on. However, any full explanation of the acceleration in human progress of the form: “sudden progress happens exactly when (resource overhang) AND (X) AND (Y) AND (NOT Z) AND (W OR P OR NOT R) AND...” is still going to have the above implications for AI progress rates.
There’s also an entire second half to the article, which discusses what human “misalignment” to inclusive genetic fitness (doesn’t) mean for alignment, as well as the prospects for alignment during two specific fast takeoff (but not sharp left turn) scenarios, but that seems secondary to this discussion.
I’ll try to keep it short

> All the cross-generational information channels you highlight are at rough saturation, so they’re not able to contribute to the cross-generational accumulation of capabilities-promoting information.
This seems clearly contradicted by empirical evidence. Mirror neurons would likely be able to saturate what you assume is the brain’s learning rate, so the more likely reason more learned bits aren’t transferred is that the marginal cost of doing so is higher than that of other sensible options. Which is a different reason than “saturated, at capacity”.
> Firstly, I disagree with your statement that other species have “potentially unbounded ways how to transmit arbitrary number of bits”. Taken literally, of course there’s no species on earth that can actually transmit an *unlimited* amount of cultural information between generations
Sure. Taken literally, the statement is obviously false … literally nothing can store an arbitrary number of bits, because of the Bekenstein bound. More precisely, the claim is that the existing non-human ways of transmitting learned bits to the next generation do not, in practice, seem to be constrained by limits on how many bits they can transmit, but by some other limits (e.g. you can transmit more bits than the animal has the capacity to learn).
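For reference, the bound in question (standard form; the approximate numerical factor is filled in here, not taken from the comment): a region of radius R containing total energy E (E = Mc² for rest mass M) can hold at most

```latex
I \;\le\; \frac{2\pi R E}{\hbar c \ln 2}
  \;\approx\; 2.6\times 10^{43}\ \text{bits} \;\times\; \frac{R}{1\,\mathrm{m}} \;\times\; \frac{M}{1\,\mathrm{kg}}
```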
> Secondly, the main point of my article was not to determine why humans, in particular, are exceptional in this regard. The main point was to connect the rapid increase in human capabilities relative to previous evolution-driven progress rates with the greater optimization power of brains as compared to evolution. Being so much better at transmitting cultural information as compared to other species allowed humans to undergo a “data-driven singularity” relative to evolution. While our individual brains and learning processes might not have changed much between us and ancestral humans, the volume and quality of data available for training future generations did increase massively, since past generations were much better able to distill the results of their lifetime learning into higher-quality data.
1. As explained in my post, there is no reason to assume ancestral humans were so much better at transmitting information as compared to other species.
2. The qualifier that they were better at transmitting cultural information may (or may not) do a lot of work.
The crux is something like “what is the type signature of culture”. Your original post roughly assumes “it’s just more data”. But this seems very unclear: in a comment above yours, jacob_cannell confidently claims I miss the forest and guesses that the critical innovation is “symbolic language”. But, obviously, “symbolic language” is a very different type of innovation than “more data transmitted across generations”.
Symbolic language likely:

- allows any type of channel to be used more effectively
- in particular, allows more efficient horizontal synchronization, enabling parallel computation across many brains
- overall sounds more like a software upgrade
Consider plain old telephone network wires: these have a surprisingly large intrinsic capacity, which isn’t that effectively used by analog voice calls. Yes, when you plug in a modem on both sides you experience a “jump” in capacity—but this is much more like a “software update” and can be more sudden.
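To put rough numbers on the telephone analogy (approximate back-of-the-envelope figures, not from the comment): the Shannon–Hartley capacity of a channel with bandwidth B and signal-to-noise ratio S/N is

```latex
C = B \log_2\!\left(1 + \tfrac{S}{N}\right)
% Voice-band use of a copper pair: B ~ 3.1 kHz, S/N ~ 30 dB
%   => C ~ 3100 * log2(1001) ~ 31 kbit/s
% DSL signals over the same pair at up to ~1 MHz of bandwidth, giving Mbit/s capacities:
% the wire's intrinsic capacity was always there; better endpoint encoding unlocks it.
```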
Or a different example—empirically, it seems possible to teach various non-human apes sign language (their general-purpose predictive-processing brains are general enough to learn this). I would classify this as a “software” or “algorithm” upgrade. If someone did this to a group of apes in the wild, it seems plausible that knowledge of language would stick and make them differentially more fit. But teaching apes symbolic language sounds in principle different from “it’s just more data” or “it’s higher-quality data”, and the implications for AI progress would be different.
> it relies on resource overhang being a *necessary* factor,
My impression is that, compared to your original post, your model drifts toward more and more general concepts, where it becomes more likely to be true, harder to refute, and less clear in what it implies for AI. What is the “resource” here? Does negentropy stored in wood count as “a resource overhang”?

I’m arguing specifically against a version where the “resource overhang” is caused by “exploitable resources you easily unlock by transmitting more bits learned by your brain vertically to your offspring’s brain”, because your map from humans to AI progress is based on a quite specific model of what the bottlenecks and overhangs are.

If the current version of the argument is “sudden progress happens exactly when (resource overhang) AND …” with “generally any kind of resource”, then yes, this sounds more likely to be true, but it seems very unclear what it implies for AI.

(Yes, I’m basically not discussing the second half of the article.)
Isn’t there an alternative story here where we care about the sharp left turn, but in the cultural sense, similar to Drexler’s CAIS where we have similar types of experimentation as happened during the cultural evolution phase?
You’ve convinced me that the sharp left turn will not happen in the classical way that people have thought about it, but are you that certain that there isn’t that much free energy available in cultural style processes? If so, why?
I can imagine that there is something to say about SGD already being pretty algorithmically efficient, but I guess I would say that determining how much available free energy there is in improving optimisation processes is an open question. If the error bars are high here, how can we then know that the AI won’t spin up something similar internally?
I also want to add something about genetic fitness becoming twisted as a consequence of cultural evolutionary pressure on individuals. Culture in itself changed the optimal survival behaviour of humans, which then meant that the meta-level optimisation loop changed the underlying optimisation loop. Isn’t culture changing the objective function still a problem that we potentially have to contend with, even though it might not be as difficult as the normal sharp left turn?

For example, let’s say that we deploy GPT-6 and it figures out that, in order to solve the loosely defined objective we have determined for it using (Constitutional AI)^2, the objective should be discussed by many different iterations of itself, creating a democratic process of multiple CoT reasoners. This meta-process seems, in my opinion, like something that the cultural evolution hypothesis would predict is more optimal than just one GPT-6, and it also seems a lot harder to align than normal?