I agree with HK that at this point SI should not be one of the priority charities supported by GiveWell, mainly due to the lack of demonstrated progress in the stated area of AI risk evaluation. If and when SI publishes peer-reviewed papers containing new insights into the subject matter, clearly demonstrating the dangers of AGI and providing a hard-to-dispute probability estimate of the UFAI takeover within a given time frame, as well as outlining constructive ways to mitigate this risk (“solve the friendliness problem” is too vague), GiveWell should reevaluate its stance.
On the other hand, the soon-to-be-spawned Applied Rationality org will have to be evaluated on its own merits, and is likely to have an easier time of meeting GiveWell’s requirements, mostly because the relevant metrics (of “raising the sanity waterline”) can be made so much more concrete and near-term.
On the other hand, the soon-to-be-spawned Applied Rationality org will have to be evaluated on its own merits, and is likely to have an easier time of meeting GiveWell’s requirements, mostly because the relevant metrics (of “raising the sanity waterline”) can be made so much more concrete and near-term.
I disagree. As far as I can tell, there has been very little progress on the rationality verification problem (see this thread). I don’t think anyone at CFAR or GiveWell knows what the relevant metrics really are and how they can be compared with, say, QALYs or other approximations of utility.
As far as I can tell, there has been very little progress on the rationality verification problem
First, this seems like a necessary stepping stone toward any kind of FAI-related work, and so cannot be skipped. Indeed, if you cannot tell which of two entities in front of you is more rational than the other, what hope do you have of solving the much larger problem of proven friendliness, of which proven rationality is only a tiny part?
Anyway, this limited-scope project (consistently ordering people by rationality level in a specific setting) should be something rather uncontroversial and achievable.
First, this seems like a necessary stepping stone toward any kind of FAI-related work, and so cannot be skipped.
It’s one stepping stone to FAI work, but other things could be substituted for it, like publishing lots and lots of well-received peer-reviewed papers.
Anyway, this limited-scope project (consistently ordering people by rationality level in a specific setting) should be something rather uncontroversial and achievable.
I don’t think it is. What exactly is “rationality level,” and how would it be measured? There’s no well-defined quantity that you can scan and get a measure of someone’s rationality. Even “winning” isn’t that good of a metric.
What exactly is “rationality level,” and how would it be measured?
This is a harder question than “which one of two given behaviors is more rational in a given setting?”; that’s why I suggested starting with the latter. Once you accumulate enough answers like that, you can start assembling them into a more general metric.
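(To make that aggregation step concrete: below is a minimal sketch, in Python, of one way pairwise “A behaved more rationally than B in this setting” judgments could be combined into per-person scores, using a simple Elo-style update. The function name, the constants, and the toy data are all illustrative assumptions, not anything SI, CFAR, or GiveWell actually uses.)

```python
from collections import defaultdict

def elo_scores(comparisons, k=32.0, base=400.0, start=1000.0):
    """Aggregate pairwise judgments into per-person scores.

    `comparisons` is an iterable of (winner, loser) pairs, each recording a
    judgment that `winner` behaved more rationally than `loser` in some
    specific setting.  Returns a dict mapping each person to a score.
    """
    scores = defaultdict(lambda: start)
    for winner, loser in comparisons:
        # Expected probability, under the current scores, that `winner`
        # would come out ahead of `loser`.
        expected = 1.0 / (1.0 + 10.0 ** ((scores[loser] - scores[winner]) / base))
        # Nudge both scores toward the observed judgment.
        scores[winner] += k * (1.0 - expected)
        scores[loser] -= k * (1.0 - expected)
    return dict(scores)

# Hypothetical data: three judged comparisons.
print(elo_scores([("alice", "bob"), ("alice", "carol"), ("carol", "bob")]))
```

An Elo-style update is only one possible aggregator; a Bradley-Terry fit, or any other pairwise-ranking model, would serve the same illustrative purpose, and none of this settles the harder question of whether the resulting score tracks what we mean by “rationality”.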
I maintain that this is a very hard problem. We know what the correct answers to various cognitive bias quizzes are (e.g. the conjunction fallacy questions), but it’s not clear that aggregating a lot of these tests corresponds to what we really mean when we say “rationality”.

Gotta start somewhere. I proposed a step that may or may not lead in the right direction; leveling a criticism that it does not solve the whole problem at once is not very productive. Even the hardest problems tend to yield to incremental approaches. If you have a better idea, by all means, suggest it.
leveling a criticism that it does not solve the whole problem at once is not very productive.
I’m not trying to be negative for the sake of being negative, or even for the sake of criticizing your proposal—I was disagreeing with your prediction that CFAR will have an easier time of meeting GiveWell’s requirements.
(I actually like your proposal quite a bit, and I think it’s an avenue that CFAR should investigate. But I still think that the verification problem is hard, and hence I predict that CFAR will not be very good at providing GiveWell with a workable rationality metric.)
...peer-reviewed papers containing new insights into the subject matter, clearly demonstrating the dangers of AGI and providing a hard-to-dispute probability estimate of the UFAI takeover within a given time frame, as well as outlining constructive ways to mitigate this risk...
I’d like to emphasize that part.