Ah, I see your point. Yeah, I think that's a natural next step. Why do you think it's not very interesting to investigate? Being able to make very accurate inferences given the evidence at hand seems important for capabilities, including alignment-relevant ones.
I don’t have a strong take haha. I’m just expressing my own uncertainty.
Here’s my best reasoning: under Bayesian reasoning, a sufficiently small posterior probability is functionally equivalent to impossibility (for downstream purposes, anyway). So if models reason in a Bayesian way, we wouldn’t expect the deductive and abductive experiments discussed above to look very different (assuming the abductive setting gives the model sufficient certainty over the posterior).
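To make that concrete, here’s a minimal sketch with made-up numbers (the prior, likelihood, and threshold are all hypothetical, just to illustrate the point about small posteriors being treated as impossibility downstream):

```python
# Toy Bayes' rule calculation: a tiny posterior gets treated the same as
# "ruled out" by any downstream decision rule that thresholds probabilities.

prior = 0.01        # P(H): prior probability of hypothesis H (hypothetical)
likelihood = 0.001  # P(E | H): probability of the evidence given H (hypothetical)
marginal = 0.5      # P(E): overall probability of the evidence (hypothetical)

# Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)
posterior = likelihood * prior / marginal
print(f"posterior = {posterior:.2e}")  # 2.00e-05

# A downstream rule that ignores hypotheses below some threshold behaves the
# same whether H was deduced to be impossible or merely abduced to be very
# unlikely, even though the posterior is not strictly zero.
THRESHOLD = 1e-3
print(posterior < THRESHOLD)  # True -> functionally equivalent to impossibility
```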
But I guess this could still be a good indicator of whether models do reason in a Bayesian way. So maybe still worth doing? Haven’t thought about it much more than that, so take this w/ a pinch of salt.