I was surprised at how low the hour estimates were, particularly for the OP people (especially Holden) and even for Paul.
Maybe worth keeping in mind that Nate isn’t the only MIRI person who’s spent lots of hours on this (e.g., Eliezer and Benya have as well), and the numbers only track Nate-time.
Also maybe worth keeping in mind the full list of things that need doing in the world. This is one of the most important and leveraged things that needs doing, so it’s easy to say “spend more time on it”. But spending a thousand hours (so, like, a good chunk of a year where that’s the only thing Nate does?) trying to sync up with a specific individual means that all the other things worth doing with that time can’t happen, like:
spending more focused time on the alignment problem, and trying to find new angles of attack that seem promising.
spending more time sharing models of the alignment problem so others can better attack the problem. (Models that may not be optimized for converging with Paul or Dario or others, but that are very useful for the many people with more Nate-ish technical intuitions.)
spending more time sharing models of the larger strategic landscape (that aren’t specifically optimized for converging with Paul etc.).
mindset interventions, like recruiting more people with security mindset into the field, or finding ways to instill security mindset in people already working in it.
work on making more operationally adequate institutions exist.
work on improving coordination between existing institutions.
work on building capacity such that if and when people come up with good alignment experiments to run, we’re able to quickly and effectively execute.
Some of those things seem easier to make fast progress on, or have more levers that haven’t been tried yet, and there are only so many hours to split between them. When you have tried something a lot and haven’t made much progress on it, at some point you should start making the update that it’s not so tractable, and change your strategy in response.
(At least, not so tractable right now. Another possible reason to put off full epistemic convergence is that it may be easier when the field is more mature and more of the arguments can pass through established formalisms, as opposed to needing to pass through fuzzy intuitions and heuristics. Though I don’t know to what extent that’s how Nate is thinking about the thing.)