AI-Caused Extinction Ingredients

Below is what I see as required for AI-caused extinction to happen within the next few decades (roughly 2024-2050). In parentheses is my very approximate probability estimate as of 2024-07-25, assuming all previous steps have happened. (A rough multiplication of these estimates appears right after the list.)
1) AI technologies continue to develop at approximately the current pace or faster (80%)
2) AI reaches a level at which it can cause extinction (90%)
3) The extinction-capable AI does not have sufficient alignment mechanisms in place (90%)
4) The AI executes an unaligned scenario (low, maybe less than 10%)
5) Other AIs and humans aren’t able to notice and stop the unaligned scenario in time (50-50ish)
6) Once the scenario is executed, humanity is never able to roll it back (50-50ish)
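Multiplying these rough conditional estimates together gives a ballpark for the overall chance. Below is a minimal sketch of that arithmetic, reading “less than 10%” as 10% and “50-50ish” as 50%:

```python
# Ballpark joint probability implied by the conditional chain above.
# Assumption: "less than 10%" is read as 0.10 and "50-50ish" as 0.50.
steps = [
    ("1) development continues at the current pace or faster", 0.80),
    ("2) AI reaches an extinction-capable level",               0.90),
    ("3) insufficient alignment mechanisms in place",           0.90),
    ("4) the AI executes an unaligned scenario",                0.10),
    ("5) nobody notices and stops it in time",                  0.50),
    ("6) humanity can never roll it back",                      0.50),
]

p = 1.0
for name, conditional in steps:
    p *= conditional
    print(f"{name}: {conditional:.0%} -> cumulative {p:.2%}")

# 0.8 * 0.9 * 0.9 * 0.1 * 0.5 * 0.5 ≈ 0.016, i.e. roughly 1.6% overall
```

Under those readings, the whole chain comes out to roughly 1.6%.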
I think #1 implies #2 pretty strongly, but OK, I was mostly with you until #4. Why is it that low? I think #3 implies #4, with high probability. Why don’t you?
#5 and #6 don’t seem like strong objections. Multiple scenarios could each play out multiple times in the interval we are talking about. Only one has to deal the final blow for it to be final, and even the blows we survive, we can’t necessarily recover from, or recover from quickly. The weaker civilization gets, the less likely it is to survive the next blow.
We can hope that warning shots wake up the world enough to make further blows less likely, but consider that the opposite may be true. Damage leads to desperation, which leads to war, which leads to arms races, which leads to cutting corners on safety, which leads to the next blow. Or human manipulation/deception through AI leads to widespread mistrust, which prevents us from coordinating on our collective problems in time. Or AI success leads to dependence, which leads to reluctance to change course, which makes recovery harder. Or repeated survival leads to complacency until we boil the frog to death. Or some combination of these, or similar cascading failures. It depends on the nature of the scenario. There are lots of ways things could go wrong, many roads to ruin; disaster is disjunctive.
Would warnings even work? Those in the know are sounding the alarm already. Are we taking them seriously enough? If not, why do you expect this to change?
“AI will never be smarter than my dad.”

I believe a ranked comparison of intelligence between two artificial or biological agents can only be done subjectively, with someone deciding what they value.
Additionally, I think there is no agreement on whether the definition of “intelligence” should include knowledge. For example, can you consider an AI “smart” if it doesn’t know anything about humans?
On the other hand, I value very highly my dad’s ability to have knowledge about my childhood and to have a model of my behavior across decades. Thus, I will never agree that an AI is smarter than my dad; I will only agree that an AI is better at certain cognitive skills while my dad is better at certain other cognitive skills, even if some of those skills require only a simple memory lookup.
Whether some relatively general AI will be better than my dad at learning a random set of cognitive tasks is a different question. If it is, I will admit that it is better on certain, or maybe all, known generality benchmarks, but only I can decide which cognitive skills I value for myself.
I see two related fundamental problems with the modern discourse around AI.
1) As with most words, there is no agreed-upon definition of the term “intelligence”.
2) Intelligence is often used in ranked comparisons as a single dimension, e.g. “AI smarter than a human”.
When people use the word “intelligence”, they often seem to assume it should include various analytical, problem-solving, and learning skills. It’s less clear whether it includes creative skills, communication skills, emotional intelligence, etc.
I think that because people often like simplifying concepts and ranking people against each other, the term “intelligence” started to be used as a single dimension: “your son is the smartest child in class”, “my boyfriend is much smarter than hers”, and of course “we will soon reach AI that is smarter than a human”.
I believe this type of thinking led to the development and popularization of the IQ score at some point. The IQ score now seems to be mostly absent from the discourse between thought leaders.
I believe that in place of the term “intelligence”, a less wrong and more precise and useful term for the discourse would be “cognitive skills”. It much better reflects that the concept involves multiple subjective dimensions rather than a single objective dimension.
This way we can more clearly evaluate various AIs by their various cognitive skills and not fall into the false sense that “intelligence” is a single clear dimension.
I don’t really have a problem with the term “intelligence” myself, but I see how it could carry anthropomorphic baggage for some people. However, I think the important parts are, in fact, analogous between AGI and humans. But I’m not attached to that particular word. One may as well say “competence” or “optimization power” without losing hold of the sense of “intelligence” we mean when we talk about AI.
In the study of human intelligence, it’s useful to break down the g factor (what IQ tests purport to measure) into fluid and crystallized intelligence. The former being the processing power required to learn and act in novel situations, and the latter being what has been learned and the ability to call upon and apply that knowledge.
“Cognitive skills” seems like a reasonably good framing for further discussion, but I think recent experience in the field contradicts your second problem, even given this framing. The Bitter Lesson says it well. Here are some relevant excerpts (it’s worth a read and not that long).
The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. [...] Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation.
[...] researchers always tried to make systems that worked the way the researchers thought their own minds worked—they tried to put that knowledge in their systems—but it proved ultimately counterproductive, and a colossal waste of researcher’s time, when, through Moore’s law, massive computation became available and a means was found to put it to good use.
[...] We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.
One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.
[...] the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds [...] these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity. [...] We want AI agents that can discover like we can, not which contain what we have discovered.
Your conception of intelligence in the “cognitive skills” framing seems to be mainly about the crystallized sort: the knowledge and skills and the application thereof. You see how complex and multidimensional that is and object to the idea that collections of such should be well-ordered, making concepts like “smarter-than-human” if not wholly devoid of meaning, then at least wrongheaded.
I agree that “competence” is ultimately a synonym for “skill”, but you’re neglecting fluid intelligence. We already know how to give computers the only “cognitive skills” that matter: the ones that let you acquire all the others. The ability to learn, mainly. And that one can be brute-forced with more compute. All the complexity and multidimensionality you see arise when something profoundly simple, algorithms measured in mere kilobytes of source code, interacts with data from the complex and multidimensional real world.
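As a toy illustration of that last point (a sketch added here for concreteness, not anyone’s actual system): the entire “learning” part of a stochastic-gradient learner fits in a handful of lines, and everything it ends up knowing comes from the data it is fed rather than from the code itself.

```python
import random

# A deliberately tiny, general-purpose learner: fit a line to whatever
# (x, y) pairs it is given, by stochastic gradient descent. The algorithm
# encodes nothing about the domain; all "knowledge" ends up in w and b.
def learn(data, steps=20_000, lr=0.02):
    w, b = 0.0, 0.0
    for _ in range(steps):
        x, y = random.choice(data)
        error = (w * x + b) - y   # prediction error on one example
        w -= lr * error * x       # gradient step for the weight
        b -= lr * error           # gradient step for the bias
    return w, b

# Toy "world": y = 3x - 1 plus noise. The learner recovers it from examples.
world = [(x, 3 * x - 1 + random.gauss(0, 0.1))
         for x in (random.uniform(-1, 1) for _ in range(200))]
print(learn(world))  # approximately (3.0, -1.0)
```

The point is not that this toy is impressive; it is that the meta-method is small and general, and scaling it up is mostly a matter of compute and data.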
In the idealized limit, what I call “intelligence” is AIXI. Though the explanation is long, the definition is not. It really is that simple. All else we call “intelligence” is mere approximation and optimization of that.
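For the curious, here is the definition I have in mind, paraphrased from Hutter’s standard formulation (the symbols are his, not anything introduced earlier in this discussion): at each cycle $k$, AIXI picks the action that maximizes expected future reward, where the expectation runs over every environment program $q$ consistent with the interaction history so far, weighted by simplicity.

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[ r_k + \cdots + r_m \big] \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

Here $U$ is a universal Turing machine, $o_i$ and $r_i$ are observations and rewards, $m$ is the horizon, and $\ell(q)$ is the length of program $q$. Everything practical, from gradient descent to tree search, can be read as a computable approximation of this uncomputable ideal.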