DanielVarga comments on Best of Rationality Quotes, 2012 Edition

DanielVarga 25 Jan 2013 19:59 UTC
0 points
It is roughly exponential in the range between 3 and 60 karma.

You can find the raw data here.

Edit: I didn’t spot gwern’s more careful analysis. I am still digesting it. gwern, you should use the above link, it contains the below-10 quotes, too.
- gwern 25 Jan 2013 20:49 UTC
  0 points
  The extra data doesn’t seem to make much difference:
```
R> karma <- read.table("http://people.mokk.bme.hu/~daniel/rationality_quotes_2012/scores")
R> karma <- sort(karma$V2)
R> summary(karma)
   Min. 1st Qu. Median    Mean 3rd Qu.    Max.
   -8.0     4.0     8.0    10.7    15.0   105.0
…
Nonlinear regression model
  model:  y ~ exp(a + b * x)
   data:  temp
       a        b
-0.01088  0.00134
 residual sum-of-squares: 22772

Number of iterations to convergence: 7
Achieved convergence tolerance: 3.59e-06
```
  > It is roughly exponential in the range between 3 and 60 karma.
  
  Eyeballing it, looks like the previous fit crosses around 40.
```
R> karma <- karma[karma<40]
...
Nonlinear regression model
  model:  y ~ exp(a + b * x)
   data:  temp
       a        b
-0.01088  0.00134
 residual sum-of-squares: 22772

Number of iterations to convergence: 7
Achieved convergence tolerance: 3.59e-06
```
  The fit looks much better:
  - DanielVarga 25 Jan 2013 21:06 UTC
    0 points
    I am afraid I don’t understand your methodology. What is a rank-versus-value function supposed to look like for an exponentially distributed sample?
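
    For reference, here is a small simulation of that question. It is only a sketch on made-up data, not the quote scores, and `n` and `rate` are arbitrary choices. Under those assumptions, the sorted values of an exponential sample should track the quantile function, -log(1 - p) / rate, so the curve grows like the logarithm of the remaining rank rather than like an exponential of the rank.

```r
# Sketch only: simulated data, not the quote scores.
set.seed(1)
n    <- 1000
rate <- 0.1                      # hypothetical rate, chosen for illustration
x    <- sort(rexp(n, rate))      # sorted values, analogous to sort(karma$V2)
p    <- (1:n) / (n + 1)          # plotting position assigned to each rank
expected <- qexp(p, rate)        # theoretical quantiles, equal to -log(1 - p) / rate

# Rank on the x axis, sorted value on the y axis, theory overlaid in red
plot(1:n, x, xlab = "rank", ylab = "sorted value")
lines(1:n, expected, col = "red")
```

    If the points hug the red curve, the sample is consistent with an exponential distribution; fitting `exp(a + b * rank)` to the same points tests a different shape.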
    - gwern 25 Jan 2013 21:08 UTC
      0 points
      How else would you do it?
      - DanielVarga 25 Jan 2013 22:29 UTC
        0 points
        When I stated that the middle is roughly exponential, this was the graph that I was looking at:
        
        d <- density(karma)
        
        plot(log(d$y) ~ d$x)
        
        I don’t do this for a living, so I am not sure at all, but if I really really had to make this formal, I would probably use maximum likelihood to fit an exponential distribution on the relevant interval, and then run a Kolmogorov-Smirnov test. It’s what shminux said, except there is probably no closed-form formula because the cutoffs complicate things. And at least one of the cutoffs is really necessary, because below 3 it is obviously not exponential.
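
        A minimal sketch of that procedure, under stated assumptions: the data are simulated stand-ins rather than the real scores, the cutoffs `lo` and `hi` are taken from the "3 to 60" range mentioned earlier in the thread, and `negloglik`, `ptrunc`, and `rate_hat` are hypothetical helper names. The rate of an exponential truncated to [lo, hi] is found numerically, since the truncation removes the usual closed form, and the fitted CDF is then passed to `ks.test`.

```r
# Sketch only: simulated stand-in data; lo, hi and all helper names are hypothetical.
set.seed(1)
lo <- 3
hi <- 60
karma_sim <- rexp(2000, rate = 0.08)                 # stand-in for the real scores
x <- karma_sim[karma_sim >= lo & karma_sim <= hi]    # keep only the interval of interest

# Negative log-likelihood of an exponential distribution truncated to [lo, hi]
negloglik <- function(rate) {
  n <- length(x)
  -(n * log(rate) - rate * sum(x - lo) - n * log(1 - exp(-rate * (hi - lo))))
}

# One-dimensional numerical maximum likelihood over a positive rate
rate_hat <- optimize(negloglik, interval = c(1e-6, 5))$minimum

# CDF of the fitted truncated exponential, used as the KS null distribution
ptrunc <- function(q) {
  (1 - exp(-rate_hat * (q - lo))) / (1 - exp(-rate_hat * (hi - lo)))
}

ks <- ks.test(x, ptrunc)
print(rate_hat)
print(ks)
```

        Two caveats apply to real data. Karma scores are integers, so `ks.test` will warn about ties, and because the rate is estimated from the same sample, the reported p-value is more lenient than it should be.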
      - Shmi 25 Jan 2013 21:19 UTC
        0 points
        I expected something like this or the section thereafter.