When I learned probability, we were basically presented with a random variable X, told that it could take a bunch of different values, and asked to calculate the average/expected value from the frequencies of those values. So you start with a question like "We roll a die. Here are all the values it could take, and each occurs one-sixth of the time. Multiply each value by one-sixth and add the results to get the expected value." This framing leads naturally to definition (1) once you extend to continuous random variables.
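To make that jump concrete (a sketch, assuming definition (1) refers to the usual density-weighted integral, as the surrounding discussion suggests): for the fair die,

\[
E[X] \;=\; \sum_{k=1}^{6} k \cdot \tfrac{1}{6} \;=\; \tfrac{21}{6} \;=\; 3.5,
\]

and the continuous analogue replaces the frequency-weighted sum with an integral against the density,

\[
E[X] \;=\; \int_{-\infty}^{\infty} x \, f_X(x)\, dx.
\]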
That’s a strong steelman of the status quo in cases where random variables are introduced as you describe. I’ll concede that (1) is fine in this case. I’m not sure it applies to cases (lectures) where probability spaces are formally introduced – but maybe it does; maybe other people still don’t think of RVs as functions, even if that’s what they technically are.
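For anyone who hasn't seen the "RVs are functions" viewpoint spelled out, here's a sketch of the measure-theoretic form for the same die example (standard construction, not necessarily how any particular lecture sets it up): take \(\Omega = \{1,\dots,6\}\) with \(P(\{\omega\}) = \tfrac{1}{6}\) and let \(X(\omega) = \omega\); then

\[
E[X] \;=\; \int_{\Omega} X(\omega)\, dP(\omega) \;=\; \sum_{\omega \in \Omega} X(\omega)\, P(\{\omega\}) \;=\; 3.5,
\]

which agrees with the frequency-weighted sum above.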