For solving the Friendly AI problem, I suggest the following constraints for your initial hardware system:
1.) All outside input (and input libraries) are explicitly user selected.
2.) No means for the system to establish physical action (e.g., no robotic arms.)
3.) No means for the system to establish unexpected communication (e.g., no radio transmitters.)
Once this closed system has reached a suitable level of AI, the problem of making it friendly can be worked on much more easily and practically, and without risk of the world ending.
Setting out from the beginning to make a GAI friendly through some other means seems rather ambitious to me. Why not just work on AI now, make sure the AI is suitably restricted as you get close to the goal, and then finally use the AI itself as an experimental testbed for “personality certification”?
(Can someone explain/link me to why this isn’t currently espoused?)
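As an aside, here is a minimal sketch, in Python, of the kind of closed evaluation loop those three constraints describe. Everything in it is hypothetical: run_model, the inputs/ directory, and the log file are illustrative placeholders, not part of any real system. The only inputs are files the user explicitly selected, and the only output is a log the handlers read offline.

```python
from pathlib import Path

USER_SELECTED_CORPUS = Path("inputs")       # constraint 1: only inputs the user explicitly chose
OBSERVATION_LOG = Path("observations.log")  # the single, inspectable output channel

def run_model(prompt: str) -> str:
    """Placeholder for the boxed system: it has no actuators (constraint 2)
    and no network or radio access (constraint 3), only this function call."""
    return f"(model output for {prompt[:40]!r})"

def main() -> None:
    with OBSERVATION_LOG.open("w") as log:
        for path in sorted(USER_SELECTED_CORPUS.glob("*.txt")):
            output = run_model(path.read_text())
            # One-way channel: handlers read this log offline; nothing is fed back in.
            log.write(f"{path.name}\t{output}\n")

if __name__ == "__main__":
    main()
```

The point of the sketch is only that every channel is enumerable: if it isn't the input directory or the log file, it doesn't exist.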
This is essentially the AI box experiment. Check out the link to see how even an AI that can only communicate with its handler(s) might be lethal without guaranteed Friendliness.
I don’t think the publicly available details establish “how”, merely “that”.
Sure, though the mechanism I was referring to is “it can convince its handler(s) to let it out of the box through some transhuman method(s).”
Wait, since when is Eliezer transhuman?
Who said he was? If Eliezer can convince somebody to let him out of the box—for a financial loss no less—then certainly a transhuman AI can, right?
Certainly they can; what I am emphasizing is that “transhuman” is an overly strong criterion.
Definitely. Eliezer reflects perhaps a maximum lower bound on the amount of intelligence necessary to pull that off.
Didn’t David Chalmers propose that here:
http://www.vimeo.com/7320820
...?
Test harnesses are standard procedure, but they are not the only kind of test.
Basically, unless you are playing chess or something similar, if you don't test in the real world you won't really know whether it works, and it can't do much to help you with important things, like raising funds to fuel development.
I don’t understand why this comment was downvoted.
Yes, zero call asks a question many of us feel has been adequately answered in the past; but they are asking politely, and it would have taken extensive archive-reading for them to have already known about the AI-Box experiment.
Think before you downvote, especially with new users!
EDIT: As AdeleneDawner points out, zero call isn’t that new. Even so, the downvotes (at −2 when I first made my comment) looked more like signaling disagreement than anything else.
I downvoted the comment not because of AI box unsafety (which I don’t find convincing at the certainty level with which it’s usually asserted; the disutility may well give weight to the worry, but not to the probability), but because it gives advice on the paint color for a spaceship at a time when the Earth is still standing on a giant Turtle at the center of the world. It’s not a sane kind of advice.
If I’d never heard of the AI-Box Experiment, I’d think that zero call’s comment was a reasonable contribution to a conversation about AI and safety in particular. It’s only when we realize that object-level methods of restraining a transhuman intelligence are probably doomed that we know we must focus so precisely on getting its goals right.
Vladimir and orthonormal,
Please point me to some more details about the AI box experiment, since I think what I suggested earlier as isolated virtual worlds is pretty much the same as what zero call is suggesting here.
I feel that there are huge assumptions in the present AI Box experiment. The gatekeeper and the AI share a language, for one, by which the AI convinces the gatekeeper.
If AGI is your only criterion, without regard to friendliness, just make sure not to communicate with the AI. Turing tests are not the only proofs of intelligence. If the AGI can come up with unique solutions in the universe in which it is isolated, that is enough to conclude that the algorithm is creative.
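A toy sketch of that observe-but-don’t-communicate test, under the strong assumption that “creativity” can be crudely proxied by counting distinct valid solutions to a puzzle the observers never discuss with the agent. The agent here is just a random stand-in, not anything proposed in the thread:

```python
import random

TARGET_SUM = 10   # toy "universe": find three distinct numbers from 1..9 that sum to 10

def isolated_agent(rng: random.Random) -> tuple:
    """Stand-in for the boxed AGI; we only observe what it emits."""
    return tuple(sorted(rng.sample(range(1, 10), k=3)))

def observe(trials: int = 10_000) -> set:
    rng = random.Random(0)
    solutions = set()
    for _ in range(trials):
        candidate = isolated_agent(rng)
        if sum(candidate) == TARGET_SUM:   # we grade its output from outside...
            solutions.add(candidate)       # ...but never send anything back in
    return solutions

if __name__ == "__main__":
    print(f"distinct solutions observed: {len(observe())}")
```

The asymmetry is the whole point: information flows out of the isolated universe to the observers, and nothing flows back in.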
This just evoked a possibly-useful thought:
If observing but not communicating with a boxed AI does a good enough job of patching the security holes (which I understand it might not; that’s for someone who understands the issue better to look at), perhaps putting an instance of a potential FAI in a contained virtual world would be useful as a test. It seems to me that an FAI that didn’t have humans to start with might have to invent us, or something like us in some specific observable way(s), because of its values.
Good thought, but on further examination it turns out that zero isn’t all that new—xe’s been commenting since November; xyr karma is low because xe has been downvoted almost as often as upvoted.