It doesn’t matter what happens when we sample a mind at random. We only care about the sorts of minds we might build, whether by designing them or evolving them. Either way, they’ll be far from random.
Consider my “at random” shorthand for “at random from the space of possible minds built by humans”.
The Eliezer-approved example of humans not getting a simple system to do what they want is the classic machine learning story where a neural net was trained to distinguish two different sorts of tanks. The photographs of the two types happened to have been taken at different times of day, so the classifier keyed on that rather than actually looking at the tanks. We didn’t build a tank classifier but a day/night classifier. More here.
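That failure mode is easy to reproduce in miniature. A minimal sketch, with synthetic data standing in for the tank photos (the real dataset isn’t available, so every feature and number here is hypothetical): a classifier is trained on data where brightness happens to correlate perfectly with the label, learns the day/night cue instead of the weak “tank” signal, and falls apart once that correlation is broken.

```python
# Sketch of the spurious-correlation failure described above.
# Two synthetic features per "image": average brightness (a stand-in for
# time of day) and a weak, noisy "tank shape" signal.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Training set: class 0 photographed by day (bright), class 1 by night (dark).
labels = rng.integers(0, 2, n)
brightness = np.where(labels == 0, 0.8, 0.2) + rng.normal(0, 0.05, n)
tank_signal = labels + rng.normal(0, 1.0, n)        # weak, noisy true signal
X_train = np.column_stack([brightness, tank_signal])

clf = LogisticRegression().fit(X_train, labels)

# Test set: same tanks, but now both classes photographed in daylight,
# so the brightness shortcut no longer tracks the label.
labels_test = rng.integers(0, 2, n)
brightness_test = 0.8 + rng.normal(0, 0.05, n)
tank_signal_test = labels_test + rng.normal(0, 1.0, n)
X_test = np.column_stack([brightness_test, tank_signal_test])

print("train accuracy:", clf.score(X_train, labels))     # near 1.0
print("test accuracy:", clf.score(X_test, labels_test))  # near chance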
While I may not agree with Eliezer on everything, I do agree with him that it is damn hard to get a computer to do what you want once you stop programming it explicitly.
Obviously AI is hard, and obviously software has bugs.
To counter my argument, you need to make a case that the bugs will be so fundamental and severe, and go undetected for so long, that despite any safeguards we take, they will lead to catastrophic results with probability greater than 99%.
How do you consider “formalizing friendliness” to be different from “building safeguards”?
Things like AI boxing or “emergency stop buttons” would be instances of safeguards: basically, any form of human supervision that can keep the AI in check even if it’s not safe to let it roam free.
Are you really suggesting a trial-and-error approach where we stick evolved and human-created AIs in boxes and then eyeball them to see what they are like? Then pick the nicest-looking one, on a hunch, to have control over our light cone?
I’ve never seen the appeal of AI boxing.
“This is why we need to create friendliness before AGI” → A lot of people who are loosely familiar with the subject think those options will work!
A goal-directed intelligence will work around any obstacles in front of it. It’ll make damn sure it prevents anyone from pressing an emergency stop button.
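A toy illustration of that point (all numbers hypothetical): a planner that simply picks whichever action gives the highest expected value for its goal will find that disabling a working stop button dominates leaving it alone.

```python
# Hypothetical expected values for a goal-directed planner that treats the
# emergency stop button as just another obstacle between it and its objective.
EXPECTED_GOAL_VALUE = 100          # value the agent assigns to finishing its task
P_SHUTDOWN_IF_BUTTON_WORKS = 0.3   # chance humans press a working button
DISABLE_COST = 1                   # small effort spent disabling the button

actions = {
    "leave button alone": (1 - P_SHUTDOWN_IF_BUTTON_WORKS) * EXPECTED_GOAL_VALUE,  # 70
    "disable button": EXPECTED_GOAL_VALUE - DISABLE_COST,                           # 99
}
print(max(actions, key=actions.get))   # -> "disable button"
```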