Well, I’ll certainly concede that my suggestion fails the feasibility criterion, since a literal implementation might involve compiling a multiple-choice opinion poll with a countably infinite number of questions, translating it into every language and numbering system in history, and then presenting it to a number of subjects equal to the number of people who’ve ever lived multiplied by the average pre-singularity human lifespan in Planck-seconds multiplied by the number of possible orders in which those questions could be presented multiplied by the number of AI proposals under consideration.
I don’t mind. I was thinking about how some more traditional flawed proposals, like the smile-maximizer, cast the net broadly enough to catch deeply Unfriendly outcomes, and decided to deliberately err in the other direction: design a test that would be too strict, one that even a genuinely Friendly AI might not be able to pass, but that would definitely exclude any Unfriendly outcome.
Please taboo the word ‘hack.’