I think this is a great post, but naively optimistic. You’re missing the rhetorical point. The purpose of using the term “Friendly AI” is to prevent people from thinking about what it means, to get them to agree that it’s a good thing before they know what it means.
The thing about algorithms is, knowing what “quicksort” means is equivalent to knowing how to quicksort. The source code to a quicksort function is an unambiguous description of what “quicksort” means.
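For concreteness, here is one standard in-place quicksort (a minimal sketch in Python; any correct variant would serve equally well as a "definition" in this sense):

```python
def quicksort(xs, lo=0, hi=None):
    """In-place quicksort using the Hoare partition scheme."""
    if hi is None:
        hi = len(xs) - 1
    if lo >= hi:
        return
    pivot = xs[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:
        while xs[i] < pivot:
            i += 1
        while xs[j] > pivot:
            j -= 1
        if i <= j:
            xs[i], xs[j] = xs[j], xs[i]
            i, j = i + 1, j - 1
    quicksort(xs, lo, j)   # sort the left partition
    quicksort(xs, i, hi)   # sort the right partition

data = [5, 2, 9, 1, 5, 6]
quicksort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```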
If you knew what “Friendly AI” means, in sufficient detail, then you would already possess the source code to Friendly AI.
So what you’re calling a “rhetorical point” is merely an inescapable feature of reasoning about algorithms we do not yet possess. If you don’t know how to quicksort, but you do know that you’d like an efficient in-place sorting function, then you don’t know what “quicksort” refers to, but you do know something about what you would like from it.
Eliezer has said things along the lines of, “I want to figure out how to build Friendly AI”. How is such a statement meaningful under your interpretation? According to you, either he already knows how to build it, or he doesn’t know what it is.
In your example, the term “Friendly AI” corresponds not to “quicksort” but to what you would like from it.
Knowing (some of) what you want an algorithm to do is not the same as knowing what the algorithm is. It seems likely to me that neither Eliezer nor anyone else knows what a Friendly AI algorithm is. They have a partial understanding of it and would like to have more of one.
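To make the distinction concrete, here is roughly what the "what you want from it" side looks like (a sketch in Python; the predicate name is mine). It can be stated and checked without possessing any sorting algorithm at all:

```python
from collections import Counter

def meets_sorting_spec(original, result):
    """The 'what we want' side: result is ordered and is a permutation
    of original. Easy to state and check, yet it contains no recipe
    for actually sorting anything."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    permutation = Counter(original) == Counter(result)
    return ordered and permutation

print(meets_sorting_spec([3, 1, 2], [1, 2, 3]))  # True
print(meets_sorting_spec([3, 1, 2], [1, 2, 2]))  # False
```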
It is conceivable that some future research might (for instance) prove that Friendly AI is impossible, in the same way that a general solution to the halting problem is impossible. In such a case, a person who subsequently said “I want to figure out how to build Friendly AI” would be engaging in wishful thinking.
Once upon a time, someone could have said, “I want to figure out how to build a general solution to the halting problem.” That person might have known what the inputs and outputs of a general solution to the halting problem would look like, but they could not have known what a general solution to it was, since there’s no such thing.
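As an illustration (a Python sketch; the function names are mine): the inputs and outputs of a general halting decider are easy to write down, and the usual diagonal argument shows that no correct body for it can exist.

```python
def halts(program_source: str, program_input: str) -> bool:
    """What a general halting decider's inputs and outputs would look
    like. The signature can be written down; a correct body cannot."""
    raise NotImplementedError  # provably unfillable in general

def diagonal(program_source: str) -> None:
    # The usual contradiction: run on its own source, this program halts
    # exactly when halts() claims it doesn't, so no correct halts() exists.
    if halts(program_source, program_source):
        while True:
            pass  # loop forever
    # otherwise fall through and halt
```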
So P=NP? If I can verify and understand a solution, then producing the solution must be equally easy...
I didn’t claim an algorithm was specified by its postcondition; just that saying “agree that X is a good thing without knowing what X means” is, for algorithms X, equivalent to “agree that X is a good thing before we possess X”.