I think the problem is conflating different aspects of intelligence into one variable. The three major groups of aspects are:
1: thought/engineering/problem-solving/etc.; it can work entirely within a mathematical model. This we are making steady progress at.
2: real-world volition, especially the will to form the most accurate beliefs about the world. This we don’t know how to solve, and don’t even need to automate. We ourselves aren’t even a shining example of 2, but generally don’t care so much about that. 2 is a hard philosophical problem.
3: Morals.
Even strongly superhuman 1 by itself is entirely harmless, even if very general within the problem space of 1. 2 without 1 can’t invent anything. 3 may follow from strong 1 and 2, assuming that the AI assigns a nonzero chance to being under test in a simulation, and that strong 1 provides enormous resources.
So, what is your human level AI?
It seems to me that people with a high capacity for 1, i.e. the engineers and scientists, are so dubious about AI risk because it is pretty clear to them, both internally and from the AI effort, that 1 doesn’t imply 2 and adding 2 won’t strengthen 1. There isn’t some great issue with 1 that 2 would resolve. 1 works just fine. If, for example, we invent an awesome automatic software development AI, it will be harmless even if superhuman at programming, and will self-improve as much as possible without 2. Not just harmless: there’s no reason why a 1-agent plus a human are together any less powerful than a 1-agent with 2-capability.
Eliezer, it looks like, is very concerned with forming accurate beliefs, i.e. 2-type behaviour, but I don’t see him inventing novel solutions as much. Maybe he’s so scared of AI because he attributes other people’s problem solving to an intellect paralleling his, while it’s more orthogonal. Maybe he imagines that a much more strongly 2-type agent will somehow be innovative and foom, and he sees a lot of room for improving 2. Or something along those lines. He is a very unusual person; I don’t know how he thinks. The way I think, it is very natural to me that problem solving does not require first wanting to actually do anything real. That also parallels the software effort, because ultimately everyone who is capable of working effectively as an innovative software developer is very 1-oriented and doesn’t see 2 as either necessary or desirable. I don’t think 2 would just suddenly appear out of nothing by some emergence or accident.
Type 1 intelligence is dangerous as soon as you try to use it for anything practical simply because it is powerful. If you ask it “how can we reduce global temperatures” and “causing a nuclear winter” is in its solution space, it may return that. Powerful tools must be wielded precisely.
See, that’s what is so incredibly irritating about dealing with people who lack any domain-specific knowledge. You can’t ask it “how can we reduce global temperatures” in the real world.
You can ask it how to make a model out of data, and you can ask it what to do to the model so that such-and-such function decreases. It may try nuking the model (inside the model) and generate that as a solution. You have to actually put in a lot of effort, like mindlessly replicating its in-model actions in the real world, for this nuking to happen in the real world. (And you’ll also have the model visualization to examine, by the way.)
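To make the "inside the model" point concrete, here is a minimal sketch in Python. Everything in it is a made-up assumption for illustration: the `State` fields, the three named actions, and their numeric effects are hypothetical and do not come from any real climate model. It only shows that the search returns an in-model action plus the resulting in-model state, which a human can then look at before doing anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A toy model state (hypothetical fields, made-up units)."""
    temperature_c: float   # modelled global mean temperature
    population_bn: float   # modelled population, in billions


# Hypothetical in-model actions with invented effects.
ACTIONS = {
    "do_nothing": lambda s: s,
    "plant_forests": lambda s: State(s.temperature_c - 0.3, s.population_bn),
    "nuke_model_cities": lambda s: State(s.temperature_c - 5.0,
                                         s.population_bn - 6.0),
}


def optimize(start, objective):
    """Try every action on the model and return the one minimizing objective."""
    results = {name: act(start) for name, act in ACTIONS.items()}
    best = min(results, key=lambda name: objective(results[name]))
    return best, results[best]


start = State(temperature_c=15.0, population_bn=7.0)

# The objective only mentions temperature, so the search happily nukes the model.
best_action, end_state = optimize(start, objective=lambda s: s.temperature_c)
print(best_action, end_state)
```

The output is just a label and a modelled end state. Nothing here touches the world; the collapse in `population_bn` is sitting right there in the result for anyone who bothers to read it.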
What if instead of giving the solution “cause nuclear war” it simply returns a seemingly innocuous solution expected to cause nuclear war? I’m assuming that the modelling portion is a black box so you can’t look inside and see why that solution is expected to lead to a reduction in global temperatures.
If the software is using models we can understand and check ourselves then it isn’t nearly so dangerous.
Let’s just assume that Mr. President sits on the nuclear launch button by accident, shall we?
It isn’t an amazing novel philosophical insight that type-1 agents ‘love’ to solve problems in the wrong way. It is a fact of life, apparent even in the simplest automated software of that kind. You, of course, also have a pretty visualization of the scenario in which the parameter was minimized or maximized.
edit: also the answers could be really funny. How do we solve global warming? Okay, just abduct the prime minister of China! That should cool the planet off.
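A toy illustration of the "wrong way" habit, as a hedged sketch: the delivery-scheduling setup, the `fuel_used` function, and the 12.5 litres-per-trip figure are all invented for this example. A brute-force search asked only to minimize fuel finds the degenerate answer (deliver nothing), and stating the missing requirement is what fixes it.

```python
from itertools import product

LITRES_PER_TRIP = 12.5  # hypothetical constant


def fuel_used(trucks, trips_per_truck):
    """Total fuel for a schedule in the toy model."""
    return trucks * trips_per_truck * LITRES_PER_TRIP


def search(min_deliveries=0):
    """Brute-force the cheapest (trucks, trips_per_truck) schedule."""
    candidates = [
        (trucks, trips)
        for trucks, trips in product(range(6), range(6))
        if trucks * trips >= min_deliveries
    ]
    return min(candidates, key=lambda c: fuel_used(*c))


naive = search()                          # objective alone
constrained = search(min_deliveries=10)   # objective plus what we actually wanted

print("naive:", naive, fuel_used(*naive))
print("constrained:", constrained, fuel_used(*constrained))
```

The naive run picks zero trucks and zero trips, which is obviously silly the moment you print the schedule. That is the sense in which the failure is a routine property of optimizers rather than a sign of anything wanting something.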
Of course it isn’t.
There are machine learning techniques like genetic programming that can result in black-box models. As I stated earlier, I’m not sure humans will ever combine black-box problem solving techniques with self-optimization and attempt to use the product to solve practical problems; I just think it is dangerous to do so once the techniques become powerful enough.
Which are even more prone to outputting crap solutions even without being superintelligent.
Yup, we seem safe for the moment because we simply lack the ability to create anything dangerous.
Sorry you’re being downvoted. It’s not me.
Actually your scenario already happened… Fukushima reactor failure: they used computer modelling to simulate the tsunami; it was the 1960s, computers were science woo, and if the computer said so, then it was true.
For more subtle cases, though, the problem is the substitution of an ‘intellectually omnipotent, omniscient entity’ for AI. If the AI says to assassinate a foreign official, nobody’s going to do that; it would have to start the nuclear war via the butterfly effect, and that’s pretty much intractable.
I would prefer our only line of defense not be “most stupid solutions are going to look stupid”. It’s harder to recognize stupid solutions in, say, medicine (although there we can verify with empirical data).
It is unclear to me that artificial intelligence adds any risk there, though, that isn’t present from natural stupidity.
Right now, look at how many plastics, food additives, and other novel substances are around us. Rising cancer rates even after controlling for age. With all the testing, when you have a hundred random things, a few bad ones will slip through. Or obesity. This (idiotic solutions) is a problem with technological progress in general.
edit: actually, our all-natural intelligence is very prone to quite odd solutions. Say, reproductive drive, secondary sex characteristics, yadda yadda, end result: cosmetic implants. Desire to sell more product, end result: overconsumption. Etc. etc.