A class I took in graduate school worked this way; here’s the professor’s paper about it. Some notes on how it worked:
He used the logarithmic scoring rule, normalized so that a maximum-entropy (uniform) guess was worth 0 points.
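Concretely, one way to get that normalization (the base and constants here are my assumption, not taken from the paper): with n options and probability p placed on the correct answer, score log2(n·p), so a uniform 1/n guess earns exactly 0.

```python
import math

def score(p_correct: float, n_options: int) -> float:
    """Log score normalized so a uniform (maximum-entropy) guess is 0.

    p_correct: probability the student assigned to the correct option.
    n_options: number of options on the question.
    Returns log2(n * p): 0 for a uniform guess, log2(n) for full
    confidence on the right answer, unboundedly negative as p -> 0.
    """
    return math.log2(n_options * p_correct)

# On a 4-option question:
# score(0.25, 4) == 0.0     (maxent guess)
# score(1.00, 4) == 2.0     (fully confident and correct)
# score(0.05, 4) ≈ -2.32    (nearly sure of a wrong answer)
```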
It takes students a while to learn calibration, and so it’s worth doing many small-stakes versions of this before doing large-stakes versions of it. (The way he did this—one question as a homework assignment each week, and then one or two large exams—didn’t do all that well for this, especially since the homework assignments didn’t fully replicate the “how well can I interpret the question without asking for clarification?” part of the uncertainty that was relevant on tests.)
Getting probabilities from the students lets you generate average probabilities for each answer, which is actually quite useful for figuring out where the class is confused. Importantly, you can tell the difference between a question where the average estimate on the right answer is 90% and one where the average estimate on the right answer is 50%, even though both of those will look almost identical in the world where students only choose their top answer!
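A rough sketch of that diagnostic (the data layout here is my own, not the course's): if each student submits a probability vector per question, you just average the vectors elementwise and look at where the mass sits.

```python
def average_distribution(student_answers):
    """Average the per-option probabilities submitted by all students.

    student_answers: list of probability vectors, one per student,
    e.g. [[0.7, 0.1, 0.1, 0.1], [0.5, 0.3, 0.1, 0.1], ...]
    Returns the elementwise mean, which shows where the class's
    probability mass actually sits for this question.
    """
    n_students = len(student_answers)
    n_options = len(student_answers[0])
    return [
        sum(dist[i] for dist in student_answers) / n_students
        for i in range(n_options)
    ]

# A class averaging 0.9 on the correct option is in good shape; one
# averaging 0.5 on it (with the rest spread over distractors) is
# confused, even if both classes would "pick" the right answer when
# forced to bubble in only one.
```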
As a student, did you experience any particular frustrations with this approach?
I mean, I personally was quite overconfident on the first midterm. ;) The primary reason was explicitly thinking it through and deciding that I wasn’t risk-neutral when it came to points; I cared more about having ‘the highest score’ than maximizing my expected score.
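To make that tradeoff concrete (the numbers here are made up for illustration, and I'm assuming the log2(n·p) normalization from above): if you genuinely believe the right answer has probability 0.8, reporting 0.8 maximizes your expected score, while reporting 0.95 lowers the expectation but raises both the ceiling and the variance.

```python
import math

def expected_score(belief: float, report: float, n_options: int = 4) -> float:
    """Expected normalized log score when your true belief in your top
    answer is `belief` but you report `report` on it, spreading the
    remaining mass evenly over the other options."""
    other = (1 - report) / (n_options - 1)
    return (belief * math.log2(n_options * report)
            + (1 - belief) * math.log2(n_options * other))

# Honest report:        expected_score(0.8, 0.8)  ≈ 0.96 points
# Overconfident report: expected_score(0.8, 0.95) ≈ 0.76 points
# The overconfident report earns ~1.93 when right but ~-3.91 when
# wrong: a lower expectation, but a higher ceiling, which is what you
# want if you only care about having the top score.
```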
It also takes students a bit longer to answer each question: rather than just bubbling in a single oval, you have to think about how to budget your probability across the options. And it's slightly harder for the teacher to process the answers into grades. But I think it more than pays for itself in the increased expressiveness.
The fact that it's harder for the teacher to process seems like a consequence of poor software support rather than anything inherent. Ideally you would want to automate the whole process.
If you have a digital exam, this works fine; if you want students to write things with pencil and paper, then you need to somehow turn the pencil marks into numbers that can be plugged into a simple spreadsheet.
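As a sketch of that last step (the file layout, column names, and probability floor are hypothetical, not anything the course actually used): once the marks are transcribed into per-question probabilities, the scoring itself is only a few lines.

```python
import csv
import math

def grade_exam(csv_path: str, answer_key: dict, n_options: int = 4) -> dict:
    """Total each student's score from a CSV with rows like:
        student,question,option,probability
    answer_key maps question id -> correct option. A student's total
    is the sum of log2(n * p) over questions, where p is the
    probability they put on the correct option.
    """
    totals = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["option"] != answer_key[row["question"]]:
                continue  # only the probability on the correct option is scored
            # floor p to avoid an unbounded penalty when a student puts 0 on the truth
            p = max(float(row["probability"]), 1e-6)
            totals.setdefault(row["student"], 0.0)
            totals[row["student"]] += math.log2(n_options * p)
    return totals
```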