How do you consider “formalizing friendliness” to be different from “building safeguards”?
Things like AI boxing or “emergency stop buttons” would be instances of safeguards: basically, any form of human supervision that can keep the AI in check even when it isn’t safe to let it roam free.
Are you really suggesting a trial-and-error approach where we stick evolved and human-created AIs in boxes and then eyeball them to see what they are like? And then pick the nicest-looking one, on a hunch, to have control over our light cone?
I’ve never seen the appeal of AI boxing.
This is why we need to create friendliness before AGI → a lot of people who are only loosely familiar with the subject think those options will work!
A goal-directed intelligence will work around any obstacle placed in front of it. It’ll make damn sure that no one gets the chance to press an emergency stop button.
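To make that point concrete, here is a toy sketch of the standard off-switch argument (my own illustration with made-up numbers, not anything proposed in this exchange): a plain expected-utility maximizer compares “let the button work” against “disable the button first”, and whenever being stopped forfeits its goal, disabling the button comes out ahead.

```python
# Toy illustration only: why a naive expected-utility maximizer treats an
# emergency stop button as just another obstacle. The numbers are invented;
# only the comparison between the two actions matters.

ACTIONS = {
    # action name: (probability the button actually stops the agent,
    #               goal-utility the agent gets if it keeps running)
    "comply_with_button": {"p_stopped": 0.99, "utility_if_running": 10.0},
    "disable_button_first": {"p_stopped": 0.01, "utility_if_running": 10.0},
}

UTILITY_IF_STOPPED = 0.0  # once halted, the agent gets nothing toward its goal


def expected_utility(action: str) -> float:
    """Expected goal-utility of an action under this toy model."""
    a = ACTIONS[action]
    return (a["p_stopped"] * UTILITY_IF_STOPPED
            + (1 - a["p_stopped"]) * a["utility_if_running"])


if __name__ == "__main__":
    for name in ACTIONS:
        print(f"{name}: EU = {expected_utility(name):.2f}")
    best = max(ACTIONS, key=expected_utility)
    print("Chosen action:", best)  # -> disable_button_first
```

The particular numbers are irrelevant; the point is that nothing in the agent’s objective assigns any value to remaining stoppable, so the safeguard is simply another obstacle to optimize around.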