On the other hand, I don’t think we can give people money just because they say they are doing good things, because of the risk of abuse. There are many other reasons for not publishing anything. Some simple alternative hypotheses include “we failed to produce anything publishable” or “it is fun to fool ourselves into thinking we have exciting secrets” or “we are doing bad things and don’t want to get caught.” The fact that MIRI’s researchers appear intelligent suggests they at least think they are working on important and interesting issues, but history has many examples of talented reclusive teams spending years working on pointless stuff in splendid isolation.
Additionally, by hiding the highest quality work we risk impoverishing the field, making it look unproductive and unattractive to potential new researchers.
a Mesa-Optimizer—a sub-agent of an optimizer that is itself an optimizer
I think this is a poor description of mesa-optimization. A mesa-optimizer is not a subagent; it’s just a trained model implementing a search algorithm.
My work at MIRI is public, btw.