ArthurDenture comments on 2012 Less Wrong Census/Survey

ArthurDenture Nov 4, 2012, 2:38 PM
23 points
Took the survey.

I hope this question isn’t used the way I worry it will be used:

CFAR Question 3

A certain town is served by two hospitals. In the larger hospital, about 45 babies are born each day. In the smaller one, about 15 babies are born each day. Although the overall proportion of girls is about 50%, the actual proportion at either hospital may be greater or less on any day. At the end of a year, which hospital will have the greater number of days on which more than 60% of the babies born were girls?

This question was easy for me to answer by pattern-matching to the Law of Small Numbers, as outlined in Thinking, Fast and Slow. If I hadn’t read that, it’s hard to say whether I would have reasoned it out correctly. So if many respondents answer this question correctly, I hope that the survey authors don’t claim evidence that LW readers are better at statistical reasoning—it’d be more accurate to say that LW readers are more likely to have seen this very particular question before.

(I could, naturally, be assuming too much about the intents of the survey authors.)
- MixedNuts Nov 4, 2012, 4:26 PM
  9 points
  Intuitive answer:
  
  Picture a horizontal line and points scattered around it. If there are many points, the line will be dark and there’ll be a cloud around it. If there are few points, you’ll get a vague shape and it won’t be easy to tell where the line originally was.
  
  Rigorous answer:
```
print [
    len(filter(lambda x: x > 0.6 * per_day,
    [
        sum([ randint(0,1) for birth in range(0, per_day) ])
        for day in range(0, 365)
    ]))
    for per_day in (15, 45)
]
```
  Thoughtful answer: Why would I bother thinking? Fetch me an apple.
  
  Edit: For copulation’s sake, whose kneecaps do I have to break to make Markdown leave my indentation the Christian Underworld alone, and who wrote those filthy blatant lies masquerading as comment formatting help?
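  For readers on Python 3, where `print` is a function and `filter` returns an iterator, the same simulation can be sketched roughly as follows. This is a loose port rather than the code as posted: the function name `days_over_60pct`, the `days` and `rng` parameters, and the seed are all assumptions added for illustration, and the 60% test is done in integers (5 * girls > 3 * per_day) to sidestep float rounding.

```python
import random


def days_over_60pct(per_day, days=365, rng=random):
    """Count simulated days on which more than 60% of births were girls."""
    count = 0
    for _ in range(days):
        # Each birth is a fair coin flip: 1 = girl, 0 = boy.
        girls = sum(rng.randint(0, 1) for _ in range(per_day))
        # Integer form of girls / per_day > 0.6.
        if 5 * girls > 3 * per_day:
            count += 1
    return count


rng = random.Random(2012)  # arbitrary seed, only so that reruns match
print([days_over_60pct(per_day, rng=rng) for per_day in (15, 45)])
```

  The first number printed is for the smaller hospital, the second for the larger one; a single simulated year is noisy, so raising `days` gives a steadier comparison.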
  - Morendil Nov 5, 2012, 12:11 AM
    10 points
    For another intuitive answer, try replacing 15 with lower values, like 1.
    
    The Python code works better, on my machine, if I add the line “from random import randint” at the top.
  - Emile Nov 5, 2012, 9:30 AM
    3 points
    whose kneecaps do I have to break to make Markdown leave my indentation the Christian Underworld alone
    
    There may be a more convenient method, but using non-breaking spaces (&nbsp;) works.
    
    print [ len(filter(lambda x: x > 0.6 *per_day, [ sum([ randint(0,1) for birth in range(0, per_day) ]) for day in range(0, 365) ])) for per_day in (15, 45) ]
    - A1987dM Nov 6, 2012, 12:52 AM
      0 points
      Certain browsers (early versions of Firefox, at least) for some reason automatically replace all hard spaces with regular spaces when submitting a form.
  - FAWS Nov 4, 2012, 5:31 PM
    3 points
    Edit: For copulation’s sake, whose kneecaps do I have to break to make Markdown leave my indentation the Christian Underworld alone, and who wrote those filthy blatant lies masquerading as comment formatting help?
    
    Does prefacing with 4 extra spaces work?
    EDIT: Apparently not. Very likely a bug then.
    - A1987dM Nov 4, 2012, 7:17 PM
      3 points
      The usual kludge is to replace spaces with full stops.
  - dbaupp Nov 6, 2012, 11:31 PM
    2 points
    That’s not such a rigorous answer:
    
    Imagine you have a random sample with n observations x_1, …, x_n, independently and identically distributed according to some distribution with mean mu and variance s^2.
    
    The sample mean is sum(x_i)/n (the expected value is mu as one would hope). Doing some manipulations we find that this has variance s^2/n, i.e. a large n means a small variance, so larger samples are more tightly clustered around mu.
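    The "manipulations" can be spelled out in a couple of lines. A sketch, writing \bar{x} for the sample mean (a shorthand not used in the comment itself) and relying on independence so that the variances of the x_i simply add:

```latex
\operatorname{Var}(\bar{x})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i)
  = \frac{n\,s^2}{n^2}
  = \frac{s^2}{n}
```

    Applied to the hospitals, and assuming each birth is a girl with probability 1/2 (so s^2 = 1/4), the daily proportion of girls has standard deviation sqrt(0.25/15) ≈ 0.129 at the smaller hospital and sqrt(0.25/45) ≈ 0.075 at the larger one. The 60% mark is therefore about 0.8 standard deviations above the mean in the first case and about 1.3 in the second.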
  - wedrifid Nov 5, 2012, 12:26 PM
    0 points
    print [ len(filter(lambda x: x > 0.6 *per_day, [ sum([ randint(0,1) for birth in range(0, per_day) ]) for day in range(0, 365) ])) for per_day in (15, 45) ]
- gwern Nov 4, 2012, 4:26 PM
  9 points
  
  This question was easy for me to answer by pattern-matching to the Law of Small Numbers, as outlined in Thinking, Fast and Slow. If I hadn’t read that, it’s hard to say whether I would have reasoned it out correctly. So if many respondents answer this question correctly, I hope that the survey authors don’t claim evidence that LW readers are better at statistical reasoning—it’d be more accurate to say that LW readers are more likely to have seen this very particular question before.
  
  I don’t understand the distinction you are making here. If you can answer correctly more statistical questions, how is that not being ‘better at statistical reasoning’? Every area of thought draws heavily on memorization and caching.
  - ArthurDenture Nov 4, 2012, 6:43 PM
    11 points
    
    If you can answer correctly more statistical questions, how is that not being ‘better at statistical reasoning’?
    
    Those are related abilities, but there’s being able to answer specific questions and then there’s being able to apply what you’ve learned more generally. For me, this particular question triggered more “aha! I’ve seen this one before!” than it triggered statistical thought. A correct answer to the question might give you a smidgen of information on whether the answerer can reason about statistics, but it probably gives you a lot more information about whether the answerer has seen the question before.
    
    One superficial example of dealing with this problem is how, in my college discrete math class, the professor gave us a problem involving placing pigeons in holes, with the solution having nothing to do with the pigeonhole principle. Even better than obfuscating a problem, of course, is stating a novel one that exercises the skills you’re testing for.
- A1987dM Nov 5, 2012, 12:43 AM
  6 points
  
  it’d be more accurate to say that LW readers are more likely to have seen this very particular question before.
  
  BTW, I had seen the CFAR Question 1 before.
  - [deleted] Nov 5, 2012, 3:17 PM
    2 points
    On the other hand, I hadn’t seen it before, but still got it correct.
- AnthonyC Nov 8, 2012, 9:53 PM
  0 points
  I have not read Thinking, Fast and Slow, but the answer to this follows directly from binomial probability distributions, which (at least in NY) were part of the 11th grade math curriculum as of 2004. That doesn’t mean most people will notice the connection, but technically they’ve been exposed to all the necessary information to solve it.
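  The binomial route can be made concrete with a short exact calculation. This is a sketch, assuming each birth is independently a girl with probability 1/2 and that each hospital sees exactly 15 or 45 births every day; the function name `prob_more_than_60pct_girls` is made up for illustration.

```python
from math import comb


def prob_more_than_60pct_girls(births_per_day):
    """Exact P(girls / births > 0.6) when each birth is a girl with probability 1/2."""
    n = births_per_day
    # 5 * k > 3 * n is the integer form of k / n > 0.6, avoiding float rounding.
    favourable = sum(comb(n, k) for k in range(n + 1) if 5 * k > 3 * n)
    # All 2**n girl/boy sequences are equally likely.
    return favourable / 2 ** n


for n in (15, 45):
    p = prob_more_than_60pct_girls(n)
    # Probability for one day, then the expected number of such days in a year.
    print(n, round(p, 4), round(365 * p, 1))
```

  For 15 births a day this works out to 4944/32768 ≈ 0.151, or roughly 55 days a year; the figure for 45 births a day comes out noticeably smaller, which is the answer to the survey question.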
