and some audiences have measurably better calibration.
What counts as good calibration isn’t straightforward to establish in every context. For empirical forecasting it is: you can score stated probabilities against observed outcomes. But if we were to come up with a notion like “good calibration for ethical judgments,” we’d have to make some pretty subjective judgment calls. Similarly, something like “good calibration for coming up with helpful abstractions for language games” (which we might call “doing philosophy,” or a subskill of it) also seems (at least somewhat) subjective.
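For concreteness, here’s what I mean by the empirical case being straightforward: you just bin forecasts by stated confidence and compare each bin’s average confidence to the observed frequency of correct outcomes. A minimal sketch (the binning scheme and function name are only illustrative, not anyone’s canonical definition):

```python
import numpy as np

def expected_calibration_error(probs, outcomes, n_bins=10):
    """Bin forecasts by stated probability and compare each bin's
    mean confidence to its observed frequency of correct outcomes."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # include the right edge only in the last bin
        mask = (probs >= lo) & ((probs < hi) if hi < 1.0 else (probs <= hi))
        if mask.any():
            gap = abs(probs[mask].mean() - outcomes[mask].mean())
            ece += mask.mean() * gap  # weight gap by fraction of forecasts in bin
    return ece

# A forecaster who says "70%" and is right about 70% of the time scores near 0.
print(expected_calibration_error([0.7, 0.7, 0.7, 0.9, 0.2],
                                 [1,   1,   0,   1,   0]))
```

There’s no analogous ground-truth column to score against for ethical judgments or for “helpful abstractions,” which is where the subjective judgment calls come in.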
That doesn’t mean “anything goes,” but I don’t yet see how your point about dialogue trees applies to “maybe a society of AIs would build abstractions we don’t yet understand, so there’d be a translation problem between their language games and ours.”