What happened here wasn’t that Harvard and the CDC stopped being interested in truth. That ship sailed a while ago. What happened was that their lack of interest in truth was revealed more explicitly, and became closer to common knowledge.
Further, thinking and talking about organizations as if they were interested or uninterested in anything keeps leading to errors. The CDC and Harvard likely have no institutional rules or incentives in place to promote truth over falsehood, or even to promote trust in the institutions themselves. (They may have “vision statements” or “principles”, but these are, in practice, neither rules nor incentives for most people in an organization.) The people who were the early members of these organizations were very interested in and careful about truth and trust, and that’s how the reputation arose. But the CDC was founded in 1946, and it’s all different people now. This applies even more to Harvard, which was founded in 1636, though its current reputation was mostly formed around the same time as the CDC’s. In the long run, one cannot trust the future people flowing through an organization to do anything except obey the rules written in stone and the incentives in place. Even then, if the rules are felt to be too restrictive, future people will likely ignore, undermine, reinterpret, or outright rewrite them. And one can expect future people to do anything that is permitted by the rules and incentives, even if it would be unthinkable today.
The corollary is that one cannot trust organizational “character” not to change, and one must regularly update the reputation one attributes to an organization (really, to the people in it), because those people are constantly changing: individuals change over time, and new people replace previous ones outright.
The corollary to the corollary is that it can be helpful to have an estimate of the rate of turnover and mentoring in each organization, because that affects how long one can safely go without updating the reputation of the organization (which, again, is really the combined reputation of the particular people in it). Harvard professors likely have very low turnover and excellent mentoring, so I wouldn’t expect their teaching and research to have changed much in the last decade, but university administrators probably change jobs as often as CDC administrators. Looking into this more deeply can give you an estimate of how often you’ll have to re-evaluate the reputation of any particular organization. Every decade? Every five years? Yearly?
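One very rough way to turn this into a number, purely as a sketch with made-up turnover rates (none of these figures come from the argument above): assume a constant annual turnover rate and ask how many years pass before fewer than half of the people whose reputations you originally assessed are still there.

```python
import math

def years_until_reputation_stale(annual_turnover: float,
                                 retention_threshold: float = 0.5) -> float:
    """Years until the fraction of original members drops below the threshold,
    assuming a constant annual turnover rate and independent departures."""
    return math.log(retention_threshold) / math.log(1.0 - annual_turnover)

# Hypothetical, illustrative rates: tenured faculty turn over slowly,
# administrators quickly.
for label, rate in [("tenured professors", 0.03), ("administrators", 0.20)]:
    years = years_until_reputation_stale(rate)
    print(f"{label}: re-evaluate roughly every {years:.1f} years")
```

Under these assumed rates, a 3% annual turnover suggests re-checking roughly every couple of decades, while 20% turnover suggests re-checking every few years, which is the kind of gap the professor-versus-administrator comparison above is gesturing at.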
Another line of thought: what incentives and rules could one set up for an institution that would encourage desirable behaviour and discourage undesirable behaviour in all the future people who will take positions in it? Making progress on problems like this, where the agents are human but have a very long time horizon in which to find and exploit loopholes, seems like an obvious prerequisite to making progress on AI alignment. One cannot simply trust that no future person will abuse the rules for their own benefit, just as one cannot trust that an AI will not immediately do the same.
Maybe if some progress were made on this, we could have sustainable trust in some institutions. The checks-and-balances concept is a good start: set up independent institutions, all able to monitor and correct each other.