It seems to me that Effective Altruism uses a theoretical negative outcome (an extinction-level event) as motivation for action in a very similar way to how Judeo-Christian religions use another theoretical negative outcome (your unsaved soul going to Hell for eternal torment) as motivation for action.
Both have high priests who establish dogma, and legions of believers who evangelize and grow the base.
Both spend vast amounts of money to persuade others to adopt their belief system.
There’s nothing new there regarding how religions work, but for a philosophical belief that’s supposed to be grounded in rational decision-making, there’s a giant looming gap in the reasoning chain when it comes to AI posing an existential risk to humanity.
Unless I’m missing something.
Is there any proof that I haven’t read yet which demonstrates that AGI or Superintelligence will have the capability to go rogue and bring about Armageddon?
Analogies can be found in many places. The FDA prevents you from selling certain kinds of food? Sounds similar to ancient priests declaring food taboos for their followers. Vaccination? That’s just modern people performing a ritual to literally protect them from invisible threats. They even believe that a bad thing will happen to them if someone else in their neighborhood refuses to perform the ritual properly.
The difference is that we already have examples of food poisoning or people dying from a disease, but we do not have an example of a super-intelligent AI exterminating humanity. That is a fair objection, but it is also clear why waiting to get the example first might be the wrong approach, so...
One possible approach is to look at smaller versions. What is a smaller version of “a super-intelligent AI exterminating humanity”? If it is “a stupid program doing things its authors clearly did not intend”, then every software developer has stories to tell.
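For concreteness, here is a minimal, made-up sketch of the sort of story I mean: an optimizer that maximizes exactly the metric it was given and, in doing so, does something its author clearly did not intend. The metric, numbers, and scenario are all invented for illustration.

```python
# Made-up toy: we ask an optimizer to maximize "time on page", intending it to
# find better content. It discovers that adding artificial load delay also
# inflates the metric -- exactly what was asked for, not what was meant.
import itertools

def time_on_page(content_quality: float, load_delay_s: float) -> float:
    # Crude stand-in metric: good content keeps readers around, but so do delays.
    return 10 * content_quality + 5 * load_delay_s

candidates = itertools.product(
    [0.2, 0.5, 0.9],   # content quality levels the system can choose
    [0.0, 2.0, 8.0],   # artificial load delay, in seconds
)
best_quality, best_delay = max(candidates, key=lambda c: time_on_page(*c))
print(f"optimizer picked quality={best_quality}, delay={best_delay}s")
# -> it maxes out the delay, "satisfying" the spec while ruining the product
```

Nothing deep here; the point is only that “did exactly what we specified, not what we meant” shows up at every scale.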
This is not the full answer, of course, but I think that a reasonable debate should be more like this.
I downvoted this post. I claim it’s for the public good. Maybe you find this strange, but let me explain my reasoning.
You’ve come to Less Wrong, a website that probably has more discussion of this than any other website on the internet. If you want to find arguments, they aren’t hard to find. It’s a bit like walking into a library and saying that you can’t find a book to read.
The trouble isn’t that you literally can’t find any book/arguments, it’s that you’ve got a bunch of unstated requirements that you want satisfied. Now that’s perfectly fine, it’s good to have standards. At the same time, you’ve asked the question in a maximally vague way. I don’t expect you to be able to list all your requirements. That’s probably impossible, and when it is possible, it’s often a lot of work. At the same time, I do believe that it’s possible to do better than maximally vague.
The problem with maximally vague questions is that they almost guarantee that any attempt to provide an answer will be unsatisfying both for the person answering and the person receiving the answer. Worse, you’ve framed the question in such a way that some people will likely feel compelled to attempt to answer anyway, lest people who think that there is such a risk come off as unable to respond to critics.
If that’s the case, downvoting seems logical. Why support a game where no-one wins?
Sorry if this comes off as harsh, that’s not my intent. I’m simply attempting to prompt reflection.
My apologies for not being clear in my Quick Take, Chris. As Zach pointed out in his reply, I posed two issues.
The first was what seems to me an obvious parallel between EA and Judeo-Christian religions. You may or may not agree with me, which is fine. I’m not looking to convince anyone of my point of view. I was merely interested in seeing if others here had a similar POV.
The second issue I raised was what I saw as a failure in the reasoning chain where you go from Deep Learning to Consciousness to an AI Armageddon. Why was that leap of faith so compelling to people?
I don’t see either of those questions as not being in the interest of the “public good”, but perhaps you just said that because my first attempt wasn’t clear. Hopefully, I’ve remedied that with this answer.
Oh, they’re definitely valid questions. The problem is that the second question is rather vague. You need to either state what a good answer would look like or why existing answers aren’t satisfying.
The argument chain you presented (Deep Learning → Consciousness → AI Armageddon) is a strawman. If you sincerely think that’s our position, you haven’t read enough. Read more, and you’ll be better received. If you don’t think that, stop being unfair about what we said, and you’ll be better received.
Last I checked, most of us were agnostic on the AI Consciousness question. If you think that’s a key point in our Doom arguments, you haven’t understood us; that step isn’t necessarily required; it’s not a link in the chain of argument. Maybe AI can be dangerous, even existentially so, without “having qualia”. But neither are we confident that AI necessarily won’t be conscious. We’re not sure how consciousness works in humans, but it seems to be an emergent property of brains, so why not artificial brains as well? We don’t understand how the inscrutable matrices work either, so it seems like a possibility. Maybe gradient descent and evolution stumbled upon similar machinery for similar reasons. AI consciousness is mostly beside the point. Where it does come up is usually not in the AI Doom arguments, but in questions about what we ethically owe AIs, as moral patients.
Deep Learning is also not required for AI Doom. Doom is a disjunctive claim; there are multiple paths for getting there. The likely-looking path at this point would go through the frontier LLM paradigm, but that isn’t required for Doom. (However, it probably is required for most short timelines.)
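To spell out what “disjunctive” buys you, here is a toy calculation with purely made-up per-path numbers: even if each individual path is judged unlikely, the chance that at least one of them happens can still be sizeable.

```python
# Purely illustrative, invented per-path probabilities for roughly independent
# routes to the same bad outcome; the point is only how a disjunction compounds.
path_probabilities = [0.05, 0.10, 0.03, 0.08]

p_none = 1.0
for p in path_probabilities:
    p_none *= (1.0 - p)        # chance this particular path does not happen

print(f"P(at least one path) = {1.0 - p_none:.2f}")   # ~0.24, despite each being small
```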
I mean, I agree that there are psycho-sociological similarities between religions and the AI risk movement (and indeed, I sometimes pejoratively refer to the latter as a “robot cult”), but analyzing the properties of the social group that believes that AI is an extinction risk is a separate question from whether AI in fact poses an extinction risk, which one could call Armageddon. (You could spend vast amounts of money trying to persuade people of true things, or false things; the money doesn’t care either way.)
Obviously, there’s not going to be a “proof” of things that haven’t happened yet, but there’s lots of informed speculation. Have you read, say, “The Alignment Problem from a Deep Learning Perspective”? (That may not be the best introduction for you, depending on the reasons for your skepticism, but it’s the one that happened to come to mind, which is more grounded in real AI research than previous informed speculation that had less empirical data to work from.)
I looked at the paper you recommended, Zack. The specific section having to do with “how” AGI is developed (para 1.2) skirts around the problem.
“We assume that AGI is developed by pretraining a single large foundation model using self-supervised learning on (possibly multi-modal) data [Bommasani et al., 2021], and then fine-tuning it using model-free reinforcement learning (RL) with a reward function learned from human feedback [Christiano et al., 2017] on a wide range of computer-based tasks. This setup combines elements of the techniques used to train cutting-edge systems such as GPT-4 [OpenAI, 2023a], Sparrow [Glaese et al., 2022], and ACT-1 [Adept, 2022]; we assume, however, that the resulting policy goes far beyond their current capabilities, due to improvements in architectures, scale, and training tasks. We expect a similar analysis to apply if AGI training involves related techniques such as model-based RL and planning [Sutton and Barto, 2018] (with learned reward functions), goal-conditioned sequence modeling [Chen et al., 2021, Li et al., 2022, Schmidhuber, 2020], or RL on rewards learned via inverse RL [Ng and Russell, 2000]—however, these are beyond our current scope.”
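To make the setup in that paragraph concrete, here is a minimal toy sketch, entirely my own construction and not code from the paper: stage 1 fits a “pretrained” next-token table by self-supervision, stage 2 fits a reward model from simulated pairwise preferences (the simulated rater arbitrarily prefers token 0), and stage 3 nudges the pretrained policy toward the learned reward. All names, numbers, and the preference setup are invented for illustration.

```python
# Toy, invented sketch of the quoted pretrain-then-RLHF setup -- NOT from the paper.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8  # tiny toy "token" vocabulary

# Stage 1: "self-supervised pretraining" -- fit a smoothed next-token table.
corpus = rng.integers(0, VOCAB, size=(500, 4))          # unlabeled toy sequences
counts = np.ones((VOCAB, VOCAB))                        # Laplace smoothing
for seq in corpus:
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
policy = counts / counts.sum(axis=1, keepdims=True)     # P(next token | current token)
base_rate = policy[:, 0].mean()

# Stage 2: learn a reward model from pairwise "human" preferences.
# The simulated rater prefers token 0; fit per-token scores with a
# Bradley-Terry-style logistic update.
reward = np.zeros(VOCAB)
for _ in range(2000):
    a, b = rng.integers(0, VOCAB, size=2)
    win, lose = (a, b) if (a == 0) >= (b == 0) else (b, a)
    p_win = 1.0 / (1.0 + np.exp(reward[lose] - reward[win]))
    reward[win] += 0.1 * (1.0 - p_win)
    reward[lose] -= 0.1 * (1.0 - p_win)

# Stage 3: "RL fine-tuning" -- nudge the pretrained policy toward the learned reward.
for _ in range(2000):
    state = rng.integers(0, VOCAB)
    action = rng.choice(VOCAB, p=policy[state])
    advantage = reward[action] - policy[state] @ reward  # baseline: expected reward
    policy[state, action] *= np.exp(0.05 * advantage)    # simple multiplicative update
    policy[state] /= policy[state].sum()

print(f"mean P(token 0): {base_rate:.2f} pretrained -> {policy[:, 0].mean():.2f} after fine-tuning")
```

The paper’s worry, of course, is about what such a policy does far beyond toy scale, which is exactly the part no sketch can settle.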
Altman has recently said in a speech that continuing to do what has led them to GPT-4 is probably not going to get to AGI. “Let’s use the word superintelligence now. If superintelligence can’t discover novel physics, I don’t think it’s a superintelligence. Training on the data of what you know, teaching to clone the behavior of humans and human text, I don’t think that’s going to get there. So there’s this question that has been debated in the field for a long time: what do we have to do in addition to a language model to make a system that can go discover new physics?”
https://the-decoder.com/sam-altman-on-agi-scaling-large-language-models-is-not-enough/
I think it’s pretty clear that no one has a clear path to AGI, nor do we know what a superintelligence will do, yet the Longtermist ecosystem is thriving. I find that curious, to say the least.
Thank you for the link to that paper, Zack. That’s not one that I’ve read yet.
And you’re correct that I raised two separate issues. I’m interested in hearing any responses that members of this community would like to give to either issue.