Right, but as far as I can tell (without having put lots of hours into trying to solve the clippyAI problem), it’s really damn hard to precisely specify a paperclip.
I made a stab at it here, and it got some upvotes. So here’s a repost:
Make a wire, 10 cm long and 1 mm in diameter, composed of an alloy of 99.8% iron and 0.2% carbon. Start at one end and bend it such that the segments from 2-2.5 cm, 2.75-3.25 cm, and 5.25-5.75 cm form half-circles, with all the bends in the same direction and forming an inward spiral (the end with the first bend is outside the third bend).
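To make the point concrete, here is a minimal Python sketch (mine, not from the original post) that encodes the spec above as data and checks only that the bend segments are ordered and fit on the wire. Every name, default value, and the check itself are hypothetical illustrations, not a real paperclip verifier.

    # A sketch of the spec above as data, plus a basic consistency check.
    # All names and values here are hypothetical, not from the original post.
    from dataclasses import dataclass

    @dataclass
    class PaperclipSpec:
        length_cm: float = 10.0
        diameter_mm: float = 1.0
        iron_fraction: float = 0.998
        carbon_fraction: float = 0.002
        # (start_cm, end_cm) spans to be bent into half-circles, all in the
        # same direction, forming an inward spiral.
        half_circle_segments_cm: tuple = ((2.0, 2.5), (2.75, 3.25), (5.25, 5.75))

    def segments_fit(spec: PaperclipSpec) -> bool:
        """Check that the bend segments are ordered, non-overlapping,
        and lie within the wire's length."""
        prev_end = 0.0
        for start, end in spec.half_circle_segments_cm:
            if not (prev_end <= start < end <= spec.length_cm):
                return False
            prev_end = end
        return True

    print(segments_fit(PaperclipSpec()))  # True for the numbers given above

Even this toy version only pins down measurements; translating "half-circle" or "alloy" into an AI's native ontology is the hard part the discussion is pointing at.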
(Please let me know if reposting violates LW etiquette so I know not to do it again.)
I don’t think it violates LW etiquette.
Here’s a sort of fully general counterargument against proposals to naturalize human concepts in AI terms: if you can naturalize human concepts, you should be able to naturalize the human concept of a box. And if you can do that, you can build Oracle AI and save the world. It’s very easy to describe what we mean by ‘stay in the box’, but it turns out that seed (self-modifying!) AIs just don’t have a natural ontology for such descriptions.
This argument might be hella flawed; it seems kind of tenuous.
Aren’t you simply assuming that the world is doomed here? It sure looks like it!
Since when is that assumption part of a valid argument?
That assumption isn’t really a core part of the argument. The general “if specifying human concepts is easy, then you should be able to come up with a plan for making a seed AI want to stay in a box” argument still stands, even if we don’t actually want to keep arbitrary seed AIs in boxes.
For the record, I am significantly less certain than most LW or SIAI singularitarians that seed AIs not explicitly coded with human values in mind will end up creating a horrible future, or at least a more horrible future than something like CEV would produce. I do think it’s worth a whole lot of continued investigation.