That makes a lot of sense, but it doesn’t explain why calibration post-RLHF is much better for the 10-40% buckets than for the 60-90% buckets.