I think it would be more informative to ask people to take one specific online test, now, and report their score. With everyone taking the same test, even if it’s miscalibrated, people could at least see how they compare to other LWers. Asking people to remember a score they were given years ago is just going to produce a ridiculous amount of bias.
Are there any free, non-spam-causing, online IQ tests that produce reasonable results (i.e., correlate strongly with standard IQ tests)?
Mensa organizes cheap standardized IQ testing worldwide with many available dates.
I don’t care for everything else they’re doing, but at least that is a very valuable service to the world.
No chance.
To calibrate a serious IQ test, you need to test (1) many (2) randomly selected people in a (3) controlled environment; and when the test is ready, you must test your subjects in the same environment.
Online calibration, or even online testing, fails condition (3). Conditions (1) and (2) make creating a test very expensive. This is why only a few serious IQ tests exist, and even those would not be considered valid when administered online.
And there is also a high prior probability that an online IQ test is a scam. So even if the authors provided some explanation of how they fulfilled conditions (1), (2), and (3), I still would not trust them.
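(For illustration of what calibration buys you: a deviation IQ is defined by where a raw score falls relative to the normative sample, rescaled to mean 100 and standard deviation 15. A minimal sketch, with an entirely made-up normative sample:)

```python
import statistics

def norm_iq(raw_score, normative_raw_scores, mean_iq=100, sd_iq=15):
    """Convert a raw test score to a deviation IQ using a normative sample.

    Assumes raw scores in the normative population are roughly normally
    distributed; the sample below is invented for illustration only.
    """
    mu = statistics.mean(normative_raw_scores)
    sigma = statistics.stdev(normative_raw_scores)
    z = (raw_score - mu) / sigma  # standard deviations above the sample mean
    return mean_iq + sd_iq * z

# Hypothetical normative sample of raw scores (e.g. items correct out of 40):
sample = [18, 22, 25, 19, 24, 21, 20, 23, 26, 22]
print(round(norm_iq(28, sample), 1))
```

The point of conditions (1)-(3) is that `sample` must actually be large, random, and tested under the same conditions as later test-takers, or the resulting scale means nothing.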
If you have a test thus calibrated, you can use it to evaluate tests that can’t be calibrated in the same way.
Will this evaluation include giving both tests to many randomly selected people and comparing the results?
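(In its simplest version, the comparison asked about above is just giving both tests to the same people and checking how strongly the two sets of scores correlate. A sketch with invented paired scores:)

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between paired score lists (pure stdlib)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paired scores: calibrated reference test vs. online test.
reference = [95, 110, 102, 128, 134, 99, 117, 121]
online    = [92, 114, 100, 131, 130, 104, 119, 118]
print(round(pearson(reference, online), 3))
```

A high correlation would suggest the online test measures roughly the same thing as the reference test, though the self-selected sample of people willing to take both is still not the random sample calibration requires.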
It’s a bit late now, but if you recommend a particular test that’s valid, short, and online, I can try that on the next survey.
Here’s one that closely imitates Raven’s Progressive Matrices and claims to have been calibrated with a sample of 250,000 people: http://www.iqtest.dk/
Here’s another one: http://sifter.org/iqtest/ . I can’t find any mention of where the questions came from or how it’s calibrated, but it’s shorter and doesn’t require Flash.
Neither one asks for an e-mail address or any identifying information. They might be too easy for some on LW, but harder ones tend to cost money. As Viliam_Bur pointed out, any free online test’s validity is questionable, but the first one is basically a direct copy of a “real” test, and neither one has any apparent ulterior motive. Anecdotally, they were both within 10 points of each other and my “real” score.
Incidentally, I keep a list for DNB purposes in http://www.gwern.net/DNB%20FAQ#available-tests focused on matrix-style tests. Doesn’t include that sifter.org one, though.
Wow. Wish I would’ve thought to google ‘iq site:gwern.net’.
Wouldn’t necessarily have helped—Google’s excerpt for the DNB FAQ doesn’t mention the list of tests. Kind of have to know it’s already there.
The first test gave me a score a few points below that on the Mensa site I did a few years ago, but I gave up early on a few questions (I had about 10 minutes left when I finished).
One weird thing about it is that so many questions were based on essentially the same idea, which makes me think it would be possible to have a test with not-much-worse accuracy but half as many questions (unless they intended to test ‘stamina’ as well, but I’d guess that varies more for the same person depending on how much they’ve slept recently than across people).
Some data points:
IQ (age 7, 14, 20) = ~145-150 S-B
SAT (age 16): 1590 = ~150 S-B
iqtest.dk (age 29) = 133 S-B
sifter.org/iqtest (age 29) = 139 S-B (159 euro scale)
I don’t use my spatial skills in my daily work the way I used to use them in my daily school work, and both online tests seem to measure only that.
I found the second test much more difficult—there wasn’t enough information to derive the exact missing item, so you had to choose the answer that could be explained with the fewest/simplest rules. There were some where I disagreed that the “correct” answer had the simpler rule-set. The problem style is also highly learnable, and I question the diagnostic value of “figuring out” that you’re looking at a 3x3 matrix where operations occur as you move around it, but various cells have been obscured to make the problem harder. Not including instructions makes it feel like there’s a secret handshake to get in.
I got 130 on the first one and 156/137 on the second.
Going with the lower result for the purpose of Yvain’s survey. I found the second result a little suspect because a lot of questions on the second test made little sense to me. I would often see 2-3 possible answers that made more or less equal (small) sense to me, and had to take a gut-feeling guess at which one the author meant.
Maybe I just got lucky. Or my gut is a better thinker than I suspected.
Got 135 on the first test. Got 139 on the Stanford-Binet/USA scale (stdev 16) in the second. This seems about right.
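(For reference, converting between a stdev-16 Stanford-Binet-style scale and the more common stdev-15 scale is just a rescaling of the deviation from 100. A sketch, not tied to either site’s actual scoring:)

```python
def convert_iq(score, sd_from, sd_to):
    """Rescale a deviation IQ from one standard-deviation scale to another."""
    return 100 + (score - 100) * sd_to / sd_from

# 139 on an sd-16 scale is about 136.6 on the usual sd-15 scale:
print(round(convert_iq(139, 16, 15), 1))
```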
But since the second one was polite enough to tell me which answers I got wrong, I have to call bullshit on it: some of the “correct” answers it claimed made no sense, and seemed more wrong and illogical than the ones I had placed.
I got 107 on the first test (which seems implausibly low), and 138 on the second (which seems reasonable).
I tried the second one after reading this and had similar results: 118 on the first one (implausibly low); 137 (stdev16) on the second one (sounds about right).
Though if I was taking this more seriously I’d probably have to weigh the facts that my kids were being more distracting when I took the first one, and I ate flaxseed shortly before taking the second one.
I took the first one under reasonably good conditions, and the second under about the same conditions a little while afterwards.
The first one seemed like a test of endurance as much as anything—it was as though my ability to focus was running out on the last ten questions or so, and possibly as though it would have been somewhat easier if I’d been in better physical condition.
General question about that sort of puzzle—how much can effort help with them? Can they be solved reliably given more time (and probably a chance to write down theories and guesses), or does inspiration have to strike fairly quickly?
Interesting question. On the first test, I went through many of them quickly—some of them obviously pattern-matched to the same kind of a puzzle—but also solved a number by staring at them for a few minutes, refusing to give in to my brain’s “I don’t see any patterns, this doesn’t make any frakking sense, can we do something else now?”. I’m certain given 10 or 20 more minutes I’d have done better. And come out with a headache, probably.
My eyes were hurting after the first test, and this continued (less intensely, I think) into the second, even though reading on the monitor isn’t generally a problem for me. There may also be sensory issues involved in scores—I was running into trouble anyway, but having to distinguish between very dark gray squares and black squares in one of the later puzzles didn’t help. If I had more of a different sort of intelligence, I would have thought of fiddling with my monitor settings.
I’m inclined to think that practice/information could help a lot with the puzzles—having a repertoire of possible patterns is going to make solutions easier than trying to find patterns cold.
Possibly as a result of not being entirely pleased at that 107 score, I’m doubting the whole premise of IQ testing—that it’s important to find out what can’t be improved about people’s minds.
Part of this is the arrogance problem (how complete is your knowledge of the possibility of improvement, anyway?), and the other part is wondering whether all those resources could be better put into learning how to improve what can be improved.
The other thing is that I’ve had some recent evidence that the ways the parts of the mind are interconnected aren’t completely obvious. I’ve been doing some psychological work on fading out self-hatred, and the results have been: being less frightened about what I post (I decided before taking the IQ tests to post my scores, but there was still a bit of a pang); easier and faster typing (not tested, but I do seem somewhat more apt to write at greater length, which seems to be the result of feeling less need to over-monitor so that typing can be a low-level habit); less akrasia (still pretty bad, but the desire to do things is happening more often); and the ability to walk downstairs more easily (I have some old knee injuries which can be ameliorated by better coordination, but I haven’t been working on coordination).
On this type of test, I can generally solve all but about four of the puzzles almost immediately, with a few seconds of thought. I skip those few, then return to them at the end, and in the minutes that remain I manage to make an educated guess on, say, two of them, while having to leave the other two to complete chance.
Interesting. Did you find the questions in the first test more difficult than the second? I did notice that the first test relies a lot on mental rotation.
I found the last third or so of the questions in the first test much more difficult than almost anything in the second.
There are two ways an IQ test can fail:
a) it can be miscalibrated;
b) it can measure something other than IQ.
If you only want to know your percentile within the LW population, (a) is not a problem, but (b) remains. What if the test does not measure the “general intelligence factor” but something else? It could correlate partly with IQ and partly with something else, e.g. mathematical or verbal skills.
Also, you have a preselection bias: some LWers will fill in the survey, others won’t.
Don’t forget those of us who aren’t native English speakers. I haven’t tried it again recently, but I used to have a 5-10 point difference between an IQ test in French (my native language) and one in English. Word-related questions are of course harder, but even for the rest, I’m not sure whether it’s because it took me longer to process the English (while the test is time-limited), or just that decoding a non-native language uses more brain power (leaving less for solving the problem). But anyway, I score better in my native language than in English, and I answered with my native-language score.
Yes—I’m quoting an IQ test I did as a kid which had a suspiciously high score, I’m pretty confident I’d get a much less spectacular score if I did one today.
Awesome. Definitely don’t do another one then. (Unless you need to diagnose something of course!)