It would have been better for me to reference Eliezer’s Al Qaeda argument and explain why I find it unconvincing.
Vladimir:
Phil, in suggesting to replace an unFriendly AI that converges on a bad utility by a collection of AIs that never converge, you are effectively trying to improve the situation by injecting randomness in the system.
You believe evolution works, right?
You can replace randomness only once you understand the search space. Eliezer wants to replace the evolution of values without understanding what that evolution is optimizing. He wants to replace evolution, which works, with a theory whose long chain of logic has so many weak links that there is very little chance it will do what he wants it to, even supposing that what he wants it to do is the right thing to do.
Vladimir:
Your perception of lawful extrapolation of values as “stasis” seems to stem from intuitions about free will.
That’s a funny thing to say in response to what I said, including: ‘One question is where “extrapolation” fits on a scale between “value stasis” and “what a free wild-type AI would think of on its own.”’ It’s not that I think “extrapolation” is supposed to be stasis; I think it may be incoherent to talk about an “extrapolation” that is less free than “wild-type AI”, and yet doesn’t keep values out of some really good areas in value-space. Any way you look at it, it’s primates telling superintelligences what’s good.
As I just said, clearly “extrapolation” is meant to impose restrictions on the development of values. Otherwise it would be pointless.
Vladimir:
it could act as a special “luck” that in the end results in the best possible outcome given the allowed level of interference.
Please remember that I am not assuming that FAI-CEV is an oracle that magically works perfectly to produce the best possible outcome. Yes, an AI could subtly change things so that we’re not aware that it is RESTRICTING how our values develop. That doesn’t make it good for the rest of all time to be controlled by the utility functions of primates (even at a meta level).
Here’s a question whose answer could diminish my worries: Can CEV lead to the decision to abandon CEV? If smarter-than-humans “would decide” (modulo the gigantic assumption CEV makes that it makes sense to talk about what “smarter than humans would decide”, as if greater intelligence made agreement more rather than less likely—and, no, they will not be perfect Bayesians) that CEV is wrong, does that mean an AI guided by CEV would then stop following CEV?
If this is so, isn’t it almost probability 1 that CEV will be abandoned at some point?