AGI will probably be deployed by a Moral Maze
Moral Mazes is my favorite management book ever, because instead of “how to be a good manager” it’s about “empirical observations of large-scale organizational dynamics involving management”.
I wish someone would write an updated version—a lot has changed (though a lot has stayed the same) since the research for the book was done in the early 1980s.
My take (and the author’s take) is that any company of nontrivial size begins to take on the characteristics of a moral maze. It seems like a pretty good null hypothesis—any company saying “we aren’t/won’t become a moral maze” has a pretty huge evidential burden to meet.
I keep this point in mind when thinking about strategy for when it comes time to decide whether and how to deploy AGI. These decisions are going to be made within the context of a moral maze.
To me, this means that some strategies (“everyone in the company has a thorough and complete understanding of AGI risks”) will almost certainly fail. I think only strategies that work well inside of moral mazes will work at all.
To sum up my takes here:
- basically every company eventually becomes a moral maze
- AGI deployment decisions will be made in the context of a moral maze
- understanding moral maze dynamics is important to AGI deployment strategy
More Ideas or More Consensus?
I think one aspect you can examine about a scientific field is the “spread” of its ideas and resources.
High energy particle physics is an interesting extreme here—there’s broad agreement in the field about the value of building higher energy accelerators, and this means there can be lots of consensus about supporting a single, shared, collaborative high energy accelerator.
I think a feature of mature scientific fields is that “more consensus” can unlock more progress. Perhaps if there had been more consensus, the otherwise ill-fated Superconducting Super Collider would have worked out. (I don’t know whether other extenuating circumstances would still have prevented it.)
I think a feature of less mature scientific fields is that “more ideas” (and less consensus) would unlock more progress. In this case, we’re more limited by our capacity to generate and validate good new ideas. One way this looks is that there’s not a lot of confidence about what to do with large sums of research funding, and instead we think our best bet is making lots of small bets.
My field (AI alignment) is a less mature scientific field in this way, I think. We don’t have a “grand plan” for alignment for which we just need funding. Instead we have a fractal of philanthropic organizations empowering individual grantmakers to try to get small and early ideas off the ground with small research grants.
A couple thoughts, if this model does indeed fit:
There’s a lot more we could do to orient as a field around “the most important problem is increasing the rate of coming up with good research ideas”. In addition to being willing to fund lots of small and early-stage research, I think we could factorize and interrogate the skills and mindsets needed to do this kind of work. It’s possible that this is one of the most important meta-skills we need to improve as a field.
I also think this could be more of a priority when “field building”. When recruiting or trying to raise awareness of the field, it would be good to prioritize places where we expect to find people who are likely to be good generators of new ideas. I think one way this looks is focusing more on diverse and underrepresented groups.
Finally, at some point it seems like we’ll transition to being a “more mature” field, and it’s good to spend some time thinking about what would help that transition go better. Understanding the history of other fields that have made this transition, and trying to prepare for predictable problems, would be good here.