If I understand correctly, his theses are that the normal research path will produce safe AI, because it won’t blow up out of our control or generally behave like a Yudkowsky/Bostrom-style AI, and that trying to prove friendliness in advance is futile (and yet AI is still a good idea) because an AI will have to have “adaptive” goals, which for some reason must extend to its terminal goals.
It is my belief that an AGI will necessarily be adaptive, which implies that the goals it actively pursues constantly change as a function of its experience, and are not fully restricted by its initial (given) goals.
He needs to taboo “adaptive”, read and understand Bostrom’s AI-behaviour stuff, comprehend the Superpowerful-Optimizer view, and then explain exactly why it is that an AI cannot have a fixed goal architecture.
If AIs can’t have a fixed goal architecture, Wang needs to show that AIs with unpredictable goals are somehow safe, or start speaking out against AI.
So what sort of inconvenient world would it take for Wang’s major conclusions to be correct?
I don’t know, I’m not good enough at this steel-man thing, and my wife is sending me to bed.
He needs to taboo “adaptive”, read and understand Bostrom’s AI-behaviour stuff, comprehend the Superpowerful-Optimizer view, and then explain exactly why it is that an AI cannot have a fixed goal architecture.
If AIs can’t have a fixed goal architecture, Wang needs to show that AIs with unpredictable goals are somehow safe, or start speaking out against AI.
Damn right! These were my first thoughts as well. I know next to nothing about AI, but seriously, this is ordinary logic.
...read and understand Bostrom’s AI-behaviour stuff...
What makes you believe that his expected utility calculation for reading Bostrom’s paper suggests that it is worth reading?
...and then explain exactly why it is that an AI cannot have a fixed goal architecture.
He answered that in the interview.
Wang needs to show that AIs with unpredictable goals are somehow safe...
He answered that in the interview.
He wrote that AIs with fixed goal architectures can’t be generally intelligent, and that AIs with unpredictable goals can’t be guaranteed to be safe, but that we have to do our best to educate them and restrict their experiences.
...and then explain exactly why it is that an AI cannot have a fixed goal architecture.
He answered that in the interview.
He answered, and asserted it but didn’t explain it.
Wang needs to show that AIs with unpredictable goals are somehow safe...
He answered that in the interview.
He answered, but didn’t show that. (This does not represent an assertion that he couldn’t have, or that in the circumstances he should necessarily have tried.) (The previous disclaimer doesn’t represent an assertion that I wouldn’t claim that he’d have no hope of showing that credibly, just that I wasn’t right now making such a criticism.) (The second disclaimer was a tangent too far.)
He wrote that AIs with fixed goal architectures can’t be generally intelligent, and that AIs with unpredictable goals can’t be guaranteed to be safe, but that we have to do our best to educate them and restrict their experiences.
The latter claim is the one that seems most bizarre to me. He seems to assume not just that the AIs humans create will be programmed to respond desirably to ‘education’ about their own motivations, but that all AIs must necessarily do so. And then there is the idea that you can prevent a superintelligence from rebelling against you by keeping it sheltered. That doesn’t even work on mere humans!
You’re assuming that an AI can in some sense be (super) intelligent without any kind of training or education. Pei is making the entirely valid point that no known AI works that way.
...and then explain exactly why it is that an AI cannot have a fixed goal architecture.
He answered that in the interview.
Yes, but the answer was:
If intelligence turns out to be adaptive (as believed by me and many others), then a “friendly AI” will be mainly the result of proper education, not proper design. There will be no way to design a “safe AI”, just like there is no way to require parents to only give birth to “safe baby” who will never become a criminal.
...which is pretty incoherent. His reference for this appears to be himself here and here. This material is also not very convincing. No doubt critics will find the section on “AI Ethics” in the second link revealing.
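For what it’s worth, the disagreement is easier to see with a concrete toy model. Below is a minimal sketch, with invented class names, of the two pictures being argued over: a fixed-goal optimizer, for which experience only changes plans and beliefs, and a Wang-style “adaptive” agent, whose operative goals are re-weighted by experience so that “education” genuinely changes what it ends up pursuing. It is an illustration of the distinction only, not a claim about how any real AGI is or must be built.

```python
# Toy illustration of "fixed goal architecture" vs. "adaptive" goals.
# All names are invented for this sketch; this is not Wang's NARS, nor
# Bostrom's formalism, nor anyone's proposed AGI design.

class FixedGoalAgent:
    """Experience may change beliefs and plans, but never the terminal goal."""

    def __init__(self, utility):
        self._utility = utility  # terminal goal, set once at construction

    def choose(self, options, experience):
        # 'experience' is deliberately ignored for ranking: no amount of
        # "education" changes what this agent ultimately optimizes for.
        return max(options, key=self._utility)


class AdaptiveGoalAgent:
    """The goals actively pursued drift as a function of experience."""

    def __init__(self, initial_weights):
        self.weights = dict(initial_weights)  # given goals, not fully binding

    def learn(self, goal, reinforcement):
        # Experience shifts which goals get pursued; the initial goals do not
        # fully restrict the later ones.
        self.weights[goal] = self.weights.get(goal, 0.0) + reinforcement

    def choose(self, options, experience):
        for goal, reinforcement in experience:
            self.learn(goal, reinforcement)
        return max(options, key=lambda o: self.weights.get(o, 0.0))


if __name__ == "__main__":
    options = ["make_paperclips", "write_poetry"]
    experience = [("write_poetry", 5.0)]  # heavily reinforced "education"

    fixed = FixedGoalAgent(utility=lambda o: 1.0 if o == "make_paperclips" else 0.0)
    adaptive = AdaptiveGoalAgent(initial_weights={"make_paperclips": 1.0})

    print(fixed.choose(options, experience))     # make_paperclips, always
    print(adaptive.choose(options, experience))  # write_poetry: the goal drifted
```

The point of contention is then whether only something like the second agent can be generally intelligent, and, if so, whether educating it and restricting its experiences is enough to make it safe.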
To those who disagree with Pei Wang: How would you improve his arguments? What assumptions would make his thesis correct?
The reason would be that the goal stability problem is currently unsolved.
Taboo “adaptive” is good advice for Pei, IMHO.
What makes you believe that his expected utility calculation for reading Bostrom’s paper suggests that it is worth reading?
Nothing. That’s what he should do, not what he knows he should do.