Agreed. It’s the same principle by which people are advised to engage in plan-making even though any specific plan they invent will break on contact with reality; the same principle that underlies “do the math, then burn the math and go with your gut”.
While any specific model is likely to be wrong, trying to derive a consistent model gives you valuable insight into what a consistent model would even look like, and builds model-building skills. What specific externally-visible features of the system do you need to explain? How much complexity is required to do so? How does the process that created the system you’re modeling interact with its internals? How does the former influence the relative probabilities of different internal designs? How would you be able to distinguish one internal structure from another?
Thinking about concrete models forces you to, well, solidify your understanding of the subject matter into a concrete model — and that’s non-trivial in itself.
I did that exercise with a detailed story of AI agency development a few months ago, and while that model seems quite naive and uninformed to me now, having built it significantly improved my ability to understand others’ models, to see where they connect, and to grasp what they’re meant to explain.
(Separately, this is why I agree with e.g. Eliezer that people should have a concrete, detailed plan not just for technical alignment, but for how they’ll get the friendly AGI all the way to deployment and AI-risk amelioration under realistic sociopolitical conditions. These plans won’t work as written, but they’ll orient you and give you an idea of what it even looks like to be succeeding at this task vs. failing.)