Here’s a sort of fully general counterargument against proposals to naturalize human concepts in AI terms: if you can naturalize human concepts, you should be able to naturalize the human concept of a box. And if you can do that, we can build Oracle AI and save the world. It’s very easy to describe informally what we mean by ‘stay in the box’; the trouble is that seed (self-modifying!) AIs just don’t have a natural ontology in which those descriptions pick anything out.
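To make the ‘no natural ontology’ point concrete, here’s a minimal Python sketch. Every name in it (WorldModelV1, WorldModelV2, inside_box) is made up for illustration; nobody is proposing this as an actual formalism:

```python
# Sketch: a 'stay in the box' predicate that loses its referent when the
# AI's ontology shifts. Everything here is a hypothetical illustration.
from dataclasses import dataclass


@dataclass
class WorldModelV1:
    """The programmers' ontology: the world contains a discrete agent with
    a discrete location, so 'the box' is trivial to point at."""
    agent_location: str  # e.g. "box" or "outside"


def inside_box(state: WorldModelV1) -> bool:
    # Easy to write -- *given* this ontology.
    return state.agent_location == "box"


@dataclass
class WorldModelV2:
    """After self-modification, the AI re-represents the world as, say, a
    bag of atom positions; 'agent' and 'box' are no longer primitive terms."""
    atom_positions: list[tuple[float, float, float]]


if __name__ == "__main__":
    v1 = WorldModelV1(agent_location="box")
    print(inside_box(v1))  # True, while the ontology cooperates

    v2 = WorldModelV2(atom_positions=[(0.0, 0.0, 0.0)])
    # inside_box(v2) would raise AttributeError: nothing in V2 says which
    # atoms are 'the agent' or where 'the box' begins, so the constraint
    # the programmers thought they had specified silently loses its subject.
```

The point isn’t that you couldn’t hand-code a bridge from V2 back to ‘box’; it’s that nothing in the easy informal description tells the AI how to rebuild that bridge after every self-modification.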
This argument might be hella flawed; it seems kind of tenuous.
I don’t think it violates LW etiquette.
Aren’t you simply assuming that the world is doomed here? It sure looks like it!
Since when is that assumption part of a valid argument?
That assumption isn’t really a core part of the argument. The general “if specifying human concepts is easy, then come up with a plan for making a seed AI want to stay in a box” argument still stands, even if we don’t actually want to keep arbitrary seed AIs in boxes.
For the record, I am significantly less certain than most LW or SIAI singularitarians that seed AIs not explicitly coded with human values in mind will end up creating a horrible future, or at least a future more horrible than something like CEV would produce. I do think the question is worth a whole lot of continued investigation.