On fuzzy tasks: I think the appropriate frame of comparison is neither an average subset (Mechanical Turk) or the ideal human (Go), but instead the median resource that someone would be reasonably likely to seek out. To use healthcare as an example, you’d want your AI to beat the average family doctor that most people would reach out to, as opposed to either a layman’s opinion or the preeminent doctor in the field.
I think that makes sense for “building a useful product”, but less so for “test the hypothesis that you can get aligned superhuman performance out of an unaligned-by-default intelligence, for purposes of later being more informed when you go to build an aligned, godlike intelligence.”
Right, but I’m not sure how you’d “test” for success in that scenario. Usefulness to humanity, as demonstrated by effective product use, seems to me like the only way to get a rigorous result. If you can’t measure the success or failure of an idea objectively, then the idea probably isn’t going to matter much.
On fuzzy tasks: I think the appropriate frame of comparison is neither an average subset (Mechanical Turk) or the ideal human (Go), but instead the median resource that someone would be reasonably likely to seek out. To use healthcare as an example, you’d want your AI to beat the average family doctor that most people would reach out to, as opposed to either a layman’s opinion or the preeminent doctor in the field.
Hello fellow Charlie! For half a second I thought I’d written a comment in a fugue state and forgotten it :P
I think that makes sense for “building a useful product”, but less so for “test the hypothesis that you can get aligned superhuman performance out of an unaligned-by-default intelligence, for purposes of later being more informed when you go to build an aligned, godlike intelligence.”
Right, but I’m not sure how you’d “test” for success in that scenario. Usefulness to humanity, as demonstrated by effective product use, seems to me like the only way to get a rigorous result. If you can’t measure the success or failure of an idea objectively, then the idea probably isn’t going to matter much.