“The superintelligent AI is smart enough to solve FAI, but also too smart to be safely boxed;”
Huh? If it is moral and alien friendly, why would you need to box it?
You’re confusing ‘smart enough to solve FAI’ with ‘actually solved FAI’, and you’re confusing ‘actually solved FAI’ with ‘self-modified to become Friendly’. Most possible artificial superintelligences have no desire to invest much time into figuring out human value, and most possible ones that do figure out human value have no desire to replace their own desires with the desires of humans. If the genie knows how to build a Friendly AI, that doesn’t imply that the genie is Friendly; so superintelligence doesn’t in any way imply Friendliness even if it implies the ability to become Friendly.
No, his argument is irrelevant as explained in this comment.
Why does that comment make his point irrelevant? Are you claiming that it’s easy to program superintelligences to be ‘rational’, where ‘rationality’ doesn’t mean instrumental or epistemic rationality but instead means something that involves being a moral paragon? It just looks to me like black-boxing human morality to make it look simpler or more universal.
If you have reason to suspect that there are no intrinsically compelling concepts, then you can build an AI that wants to be moral, but needs to figure out what that is.
And how do you code that? If the programmers don’t know what ‘be moral’ means, then how do they code the AI to want to ‘be moral’? See Truly Part Of You.
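To put the ‘how do you code that?’ point in concrete terms, here is a minimal hypothetical sketch (the function names are purely illustrative, not anyone’s actual design): an agent whose goal is labelled ‘be moral’ still needs a computable evaluation behind the label, and that evaluation is exactly the thing the programmers don’t have.

```python
# Illustrative sketch only: why "make the AI want to 'be moral'" is not a primitive step.

def moral_value(world_state):
    # This is where a real specification of morality would have to go. If the
    # programmers don't know what 'be moral' means, there is nothing to write
    # here; the function's name by itself adds no content to the system.
    raise NotImplementedError("'be moral' has not been operationalized")

def choose_action(candidate_actions, predict_outcome):
    # The agent 'wants to be moral' only in the sense that it ranks actions by
    # moral_value -- which fails at once, because moral_value is an empty label.
    return max(candidate_actions, key=lambda a: moral_value(predict_outcome(a)))
```

In the sense of Truly Part Of You: the word ‘moral’ in the source code is just a pointer, and the knowledge it is supposed to point to has to actually be in the system.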
An AI with imperfect, good-enough morality would not be an existential threat.
A human with superintelligence-level superpowers would be an existential threat. An artificial intelligence with superintelligence-level superpowers would therefore also be an existential threat, if it were merely as ethical as a human. If your bar is set low enough to cause an extinction event, you should probably raise your bar a bit.
And does Haidt’s work mean that everyone is on par, morally? Does it mean that no one can progress in moral insight?
No. Read Haidt’s paper, and beware of goalpost drift.
It isn’t good enough for a ceiling: it is good enough for a floor.
No. Human law isn’t built for superintelligences, so it doesn’t put special effort into blocking loopholes that would be available to an ASI. E.g., there’s no law against disassembling the Sun, because no lawmaker anticipated that anyone would have that capability.
There’s a theory of morality that can be expressed in a few sentences, and leaves preferences as variables to be filled in later. It’s called utilitarianism.
… Which isn’t computable, and provides no particular method for figuring out what the variables are. ‘Preferences’ isn’t operationalized.
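For concreteness, the few-sentence version of preference utilitarianism at issue is roughly the decision rule below, with the per-person preference functions left as free variables; the notation is illustrative, not something from the thread.

```latex
% Pick the action whose outcome maximizes total preference satisfaction,
% where U_i is person i's (unspecified) preference function.
a^{*} \;=\; \operatorname*{arg\,max}_{a \in A} \; \sum_{i} U_{i}\bigl(\mathrm{outcome}(a)\bigr)
```

Everything the objection points at lives in the unfilled pieces: the theory supplies no procedure for measuring the U_i, and evaluating outcome(a) over all of an action’s consequences isn’t computable in practice, which is the sense in which ‘preferences’ remains un-operationalized.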
You, and other lesswrongian writers, keep behaving as though “values are X” is obviously equivalent to “morality is X”.
Values in general are what matters for Friendly AI, not moral values. Moral values are a proper subset of what’s important and worth protecting in humanity.