You are welcome. Another answer to your question just occurred to me.
If you count AI fairness research as a sub-type of AI alignment research, then you can find a whole community of alignment researchers who talk quite a lot with each other about ‘aligned with whom’, in quite sophisticated ways. Reference: the main conference of this community is ACM FAccT.
In EA and on this forum, when people count the number of alignment researchers, they usually count dedicated x-risk alignment researchers only, and not the people working on fairness, or on the problem of making self-driving cars safer. There is a somewhat unexamined assumption in the AI x-risk community that fairness and self-driving car safety techniques are not very relevant to managing AI x-risk, in either the technical space or the policy space. My own x-risk technical work increasingly tells me that this unexamined assumption is entirely wrong.
On a lighter note:
ignoring those values means we won’t actually achieve ‘alignment’ even when we think we have.
Well, as long as the ‘we’ you are talking about here is a group that still includes Eliezer Yudkowsky, I can guarantee that ‘we’ are in no danger of ever collectively believing that we have achieved alignment.
Koen—thanks for the link to ACM FAccT; looks interesting. I’ll see what their people have to say about the ‘aligned with whom’ question.
I agree that AI X-risk folks should probably pay more attention to the algorithmic fairness folks and the self-driving car folks, to see what general lessons about alignment can be learned from these specific domains.