Alignment’s phlogiston

Crossposted from the EA Forum: https://forum.effectivealtruism.org/posts/DCtqgsywCRakLvHn6/alignment-s-phlogiston

Epistemic status: quick and dirty reaction to the claim that alignment research is like archaeology. I wouldn’t trust anyone suggesting that scientific and technological revolutions can be simply broken down into a series of discoveries. Nevertheless, I like the phlogiston metaphor.

In this post, John Wentworth makes the case that alignment research is at a pre-paradigmatic stage. As he states, experts in the field share a “fundamental confusion”, and there is no explicit consensus on the nature of the subject matter or the best ways to approach it. Also characteristic of an immature science is the disunity[1] of its frameworks: its concepts, theories, agendas, practices, methodological tools, and criteria for what counts as having high explanatory force. Such disunity seems to describe the current state of AI safety.

In this post, Adam Shimi compares alignment/AI safety to historical sciences such as archaeology. If the comparison holds, then either alignment is not at a pre-paradigmatic stage or archaeology is not a mature science. However, archaeology doesn’t suffer from the “fundamental confusion” of alignment. Archaeologists may not be able to use the same observational tools as researchers in physics or chemistry, but they do have a shared view of how to study their subject. I very much doubt that the average archaeologist would tell you that they’re fundamentally confused about their field or about how to approach the most important questions on their research agenda.

Looking back at the history of science, the field of AI safety seems to have more in common with alchemy. The alchemists were deeply confused about their methods and about how likely they were to succeed. They all, however, shared a threefold aim: to find the Stone of Knowledge (the Philosophers’ Stone), to discover the medium of Eternal Youth and Health, and to discover the transmutation of metals. Their “science” had the shape of a pre-paradigmatic field that would eventually transform into the natural science of chemistry. As the science began to mature, its agenda ceased to be grounded in mystical investigations.

The claim here is not that alignment has in any sense the mystical substrate of alchemy. What it shares with alchemy is high uncertainty combined with attempts to work at the experimental/observational/empirical level that cannot be supported the way they are in the physical sciences/STEM. It also shares the intention to find something that doesn’t yet exist and that, once it does, will make the human world qualitatively different from what it currently is.

It would be very helpful for the progress of alignment research to trace what exactly happened when alchemy became chemistry. Was it the articulation of an equation? Was it the discovery and analysis of a hypothesized substance like phlogiston? If so, we’d need to find alignment’s phlogiston, and that would bring us closer to discovering alignment’s oxygen.

  1. This paper argues that unity isn’t necessary for a science to qualify as mature.