There are plenty of systems where we rationally form beliefs about likely outputs from a system without a full understanding of how it works. Weather prediction is an example.
What makes it rational is that there is an actual underlying hypothesis about how weather works, instead of a vague analogy like “LLMs are a lot like human uploads”. Weather prediction also outputs numbers connected to the reality we actually care about, and there is no credible alternative hypothesis implying that weather prediction would not work.
I don’t want to totally dismiss empirical extrapolations, but given the stakes, I would personally prefer that all sides actually state their model of reality and how they think the evidence changed its plausibility, as formally as possible.