I can’t speak for ciphergoth, but in this context I understand “friendly” to be as opposed to adversarial. (It perhaps goes without saying that this is unrelated to “Friendly” as applied to nonhuman environment-optimizing systems, and distinct from “friendly” as opposed to “rude.”)
I usually approach it by assuming that everyone involved in the conversation is engaged in the joint project of improving their own and one another’s models of each other and of reality.
Can you clarify why this is a key question for SIAI?
They want to make friendly AI. As you described it, “‘friendly’ as applied to nonhuman environment-optimizing systems” probably has a different definition than in normal parlance, and so defining it is non-trivial and important.
Gotcha. Yes, it has a radically different meaning (and is unrelated to ciphergoth’s use of the word in the comment you’re responding to). Also, “Friendly” is often capitalized when used in the sense you mean.
Roughly speaking, a nonhuman environment-optimizing system is said to be Friendly if the goals it is optimizing for are stable under reflection and self-modification (that is, no matter how sophisticated its awareness of what it is optimizing for, or its ability to change what it is optimizing for, becomes, it will continue to optimize for those things), and if those goals are such that optimizing our environment for them results in an environment that’s optimal for humans.
Of course, defining that operationally is, as you say, non-trivial and of key importance. As far as I know that problem has not been solved yet. There’s a general consensus that even defining those goals (let alone designing and building a system that preserves them under reflection and self-modification) is a staggeringly hard problem.