On the first disagreement, I think we can look at the decisions in detail and find evidence for who is right. The utility function hypothesis expects to find sensible trade-offs being made, with bad decisions made only when good information was unavailable or political pressure was stronger than the physical stakes. We can ask ourselves whether we see this pattern. The difficulty is that the political pressures are often invisible to us, or their magnitude is hard to measure. But if it’s a matter of real interests battling it out, we should expect political pressure to generally push toward actions that are useful to particular interests, rather than toward perversity in general.
In addition to looking at outcomes, on priors I expect it to be possible to pick up evidence here and there by listening to what people say (especially off-the-cuff responses to new information and challenges) and drawing inferences about what specific cognitive moves are occurring.
E.g., Fauci exhibits amused self-awareness about the fact that he lied. This is evidence that he has some amount of self-awareness as well as reflective consistency (‘I still endorse having said false things before, for the same reason I endorse saying less-false things now: to encourage the right level of caution in people’). This in turn is nonzero evidence that he’s optimizing a real-world outcome rather than acting on reflex, because that level of self-awareness is more necessary for optimizing real-world outcomes than for acting on reflex.
Basically, I think it’s very hard for sufficiently confused/myopic/rationalizing humans to properly simulate a long-term-outcome-optimizing human in detail, and vice versa; so I think just listening could help. It’s like inferring anosognosia from listening to what the patient says and inferring cognition from their slip-ups and improvisations (‘wait, that makes absolutely no sense’), vs. inferring anosognosia from macro-outcomes like ‘how well do they hold down a job?’ and ‘how good are they at navigating obstacle courses?’.