To get a better sense of what you expect the new focus to do, here's a hypothetical. Say we have a rationality-qua-rationality CFAR (CFAR-1) and an AI-Safety CFAR (CFAR-2). Each starts with the same team, they work independently, and they can't share work. Two years later, we ask each to write a curriculum for the other organization, to the best of its ability. This is along the lines of having them do an Ideological Turing Test on each other. How well do the curricula match? And is either newly written version better than the original? For instance, is CFAR-1's CFAR-2 curriculum better than CFAR-2's own CFAR-2 curriculum?
I'm treating curriculum quality as a proxy for research progress, and somewhat ignoring things like funding and operations quality. The question is only meant to address worries about research slowdowns.