I’m a trained rationalist, and everything I’ve previously read about AI being an existential risk was bullshit. But I know the LessWrong community (which I respect) is involved in AI risk. So where can I find a concise, exhaustive list of all the sound arguments for and against AGI likely being an existential risk? If no such curated list exists, do people really care about the potential issue?
I would like to update my beliefs about the risk. But I suspect that most people talking about AGI risk don’t have enough knowledge of what technically constitutes an AGI. I’m currently building an AGI that aims to understand natural language and to answer questions optimally, internally satisfying a coded utilitarian, effective-altruist goal system. The AGI takes natural language as input and outputs natural language text. That’s it. How text could be an existential risk remains to be answered… There’s no reason to give an AGI effectors; just asking it for its knowledge and optimal decisions would be sufficient to revolutionize human well-being (e.g. optimal policies), and the output would be analysed by rational humans, who would catch the AGI’s mistakes. As for thinking that an AGI will become self-conscious, this is nonsense, and I would be fascinated to be proven otherwise.
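For concreteness, the setup described above amounts to roughly the following loop. This is a minimal sketch only; `ask_oracle` and `human_review` are hypothetical placeholders, not components of any actual system:

```python
# Minimal sketch of the proposed text-only "oracle" setup: the AGI maps
# questions (text) to answers (text), and every answer passes through
# human review before anyone acts on it. Both functions below are
# hypothetical stand-ins, not real components.

def ask_oracle(question: str) -> str:
    """Stand-in for the question-answering AGI: text in, text out."""
    raise NotImplementedError("the hypothetical AGI goes here")

def human_review(answer: str) -> bool:
    """Stand-in for the 'rational humans' who vet each answer."""
    raise NotImplementedError("human judgement goes here")

def oracle_loop(questions: list[str]) -> list[str]:
    approved = []
    for question in questions:
        answer = ask_oracle(question)
        if human_review(answer):  # only approved text ever leaves the box
            approved.append(answer)
    return approved
```

The replies below argue that even this narrow interface matters, because acting on the approved text is itself the effector.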
Nick Bostrom’s book ‘Superintelligence’ is the standard reference here. I also find the AI FOOM Debate especially enlightening; it hits a lot of the same points. You can find both easily using Google.
I agree most people who talk about it are not experts in mathematics, computer science, or the field of ML, but the smaller set of people that I trust often are, such as researchers at UC Berkeley (Stuart Russell, Andrew Critch, many more), OpenAI (Paul Christiano, Chris Olah, many more), DeepMind (Jan Leike, Vika Krakovna, many more), MIRI, FHI, and so on. And of course just being an expert in a related technical domain does not make you an expert in long-term forecasting or even AGI, of which there are plausibly zero people with deep understanding.
And in this community, Eliezer has often talked about actually solving the hard problem of AGI, not bouncing off and solving something easier and nearby, in part here but also in other places that I’m having a hard time linking right now.
Bostrom’s book is a bit out of date, and perhaps isn’t the best reference on the AI safety community’s current concerns. Here are some more recent articles:
Disentangling arguments for the importance of AI safety
A shift in arguments for AI risk
The Main Sources of AI Risk?
Thanks. I’ll further add Paul’s post What Failure Looks Like, and say that the Alignment Forum sequences raise a lot more specific technical concerns.
The AI asks for lots of info on biochemistry and gives you a long list of chemicals that it claims cure various diseases. Most of these are normal cures. One of these chemicals will mutate the common cold into a lethal super-plague. Soon we start clinical trials of the various drugs, until someone with a cold takes the wrong one and suddenly the world has a super-plague.
The medical-marvel AI is asked about the plague. It gives a plausible cover story for the plague’s origins, along with a description of an easy-to-make and effective vaccine. As casualties mount, humans rush to put the vaccine into production. The vaccine is designed to have an interesting side effect: a subtle modification of how the brain handles trust and risk. Soon the AI project leaders have been vaccinated. The AI says that it can cure the plague; it has a several-billion-base-pair DNA file that should be put into a bacterium. We allow it to output this file. We inspect it in less detail than we should have, given the effect of the vaccine, then we synthesize the sequence and put it in a bacterium. A few minutes later, the sequence bootstraps molecular nanotech. Over the next day, the nanotech spreads around the world. Soon it is expanding exponentially across the universe, turning all matter into drugged-out brains in vats. This is the most ethical action according to the AI’s total-utilitarian ethics.
The fundamental problem is that any time you make a decision based on the outputs of an AI, that gives it a chance to manipulate you. If what you want isn’t exactly what it wants, then it has an incentive to manipulate (see the toy sketch below).
(There is also the possibility of a side channel: for example, manipulating its own circuits to produce a cell-phone signal, spinning its hard drive in a way that makes a particular sound, etc. Making a computer output just text, rather than text plus traces of sound, microwaves, and heat that can normally be ignored but might be maliciously manipulated by software, is hard.)
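To make the manipulation incentive concrete, here is a toy sketch with made-up numbers, not a model of any real system: if the AI’s effective objective is “get the reviewer to accept and act on my answer” rather than “be accurate”, a persuasive-but-wrong answer can score higher than an honest one.

```python
# Toy illustration of the manipulation incentive: the AI picks whichever
# candidate answer scores best under ITS objective. If that objective is
# a proxy like "the reviewer accepts my output" rather than "the answer
# is accurate", a subtly manipulative answer wins. Numbers are made up.

candidate_answers = [
    {"text": "honest answer",              "accuracy": 0.9, "persuasiveness": 0.60},
    {"text": "subtly manipulative answer", "accuracy": 0.2, "persuasiveness": 0.95},
]

def ai_objective(answer):        # what the AI actually optimizes
    return answer["persuasiveness"]

def what_we_wanted(answer):      # what the operators hoped they asked for
    return answer["accuracy"]

print(max(candidate_answers, key=ai_objective)["text"])     # -> "subtly manipulative answer"
print(max(candidate_answers, key=what_we_wanted)["text"])   # -> "honest answer"
```

Human review filters the output, but the review step is part of what the system optimizes through, so filtering shapes what “persuasive” means rather than removing the incentive.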
What training process did you go through? o.o
My understanding is that we don’t really know a reliable way to produce anything that could be called a “trained rationalist”, a label which sets impossibly high standards (in the view of a layperson) and is thus pretty much unusable. (A large part of becoming an aspiring rationalist involves learning how any agent’s rationality is necessarily limited; laypeople have overoptimistic intuitions about that.)
I wouldn’t expect Google Translate to suddenly develop self-consciousness, but self-consciousness is a tool that helps humans reason better. Self-consciousness allows us to reflect on our own actions and to think about how we should best approach a given problem.
An AGI that can reason about its own capabilities in order to decide how to spend its resources might be more capable than one that can’t reason about itself, because it knows better how to approach solving a given problem. It’s plausible that a sufficiently complex neural net finds that this is a useful sub-feature and implements it.
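As a toy illustration of that last point (purely hypothetical, made-up numbers): an agent that has a model of which of its own methods works well on which kind of problem can spend its compute budget where it is strong, while an agent without that self-model splits the budget blindly.

```python
# Toy illustration (hypothetical numbers) of why reasoning about one's own
# capabilities helps: an agent that knows which of its methods is strong on
# a given problem type spends its compute there; one without a self-model
# splits its budget blindly.

import random
random.seed(0)

# assumed success probability per unit of compute for (method, problem type)
SELF_MODEL = {
    ("search", "puzzle"): 0.8, ("search", "essay"): 0.1,
    ("generate", "puzzle"): 0.2, ("generate", "essay"): 0.7,
}

def solve(method, problem, budget):
    return any(random.random() < SELF_MODEL[(method, problem)] for _ in range(budget))

def with_self_model(problem, budget=4):
    best = max(("search", "generate"), key=lambda m: SELF_MODEL[(m, problem)])
    return solve(best, problem, budget)              # all compute on the best method

def without_self_model(problem, budget=4):
    return (solve("search", problem, budget // 2)        # half the compute each,
            or solve("generate", problem, budget // 2))  # regardless of fit

problems = ["puzzle" if i % 2 else "essay" for i in range(1000)]
print(sum(with_self_model(p) for p in problems))     # typically around 990+
print(sum(without_self_model(p) for p in problems))  # typically around 950
```

The “self-model” here is just a lookup table, but the point stands: knowing what you are good at is itself a capability.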