I think it would be more informative to ask people to take one specific online test, now, and report their score. With everyone taking the same test, even if it’s miscalibrated, people could at least see how they compare to other LWers. Asking people to remember a score they were given years ago is just going to produce a ridiculous amount of bias.
Are there any free, non-spam-causing, online IQ tests that produce reasonable results (i.e., correlate strongly with standard IQ tests)?
Mensa organizes cheap standardized IQ testing worldwide with many available dates.
I don’t care for everything else they’re doing, but at least that is a very valuable service to the world.
No chance.
To calibrate a serious IQ test, you need to test (1) many (2) randomly selected people in a (3) controlled environment; and when the test is ready, you must test your subjects in the same environment.
Online calibration, or even online testing, fails condition (3). Conditions (1) and (2) make creating a test very expensive. This is why only a few serious IQ tests exist, and even those would not be considered valid when administered online.
And there is also a high prior probability that an online IQ test is a scam. So even if the authors provided some explanation of how they fulfilled conditions (1), (2), and (3), I still would not trust them.
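(For illustration of what calibration buys you: a deviation IQ is defined by where a raw score falls relative to the normative sample, rescaled to mean 100 and standard deviation 15. A minimal sketch, with an entirely made-up normative sample:)

```python
import statistics

def norm_iq(raw_score, normative_raw_scores, mean_iq=100, sd_iq=15):
    """Convert a raw test score to a deviation IQ using a normative sample.

    Assumes raw scores in the normative population are roughly normally
    distributed; the sample below is invented for illustration only.
    """
    mu = statistics.mean(normative_raw_scores)
    sigma = statistics.stdev(normative_raw_scores)
    z = (raw_score - mu) / sigma  # standard deviations above the sample mean
    return mean_iq + sd_iq * z

# Hypothetical normative sample of raw scores (e.g. items correct out of 40):
sample = [18, 22, 25, 19, 24, 21, 20, 23, 26, 22]
print(round(norm_iq(28, sample), 1))
```

The point of conditions (1)-(3) is that `sample` must actually be large, random, and tested under the same conditions as later test-takers, or the resulting scale means nothing.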
If you have a test thus calibrated, you can use it to evaluate tests that can’t be calibrated in the same way.
Will this evaluation include giving both tests to many randomly selected people and comparing the results?
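(In its simplest version, the comparison asked about above is just giving both tests to the same people and checking how strongly the two sets of scores correlate. A sketch with invented paired scores:)

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between paired score lists (pure stdlib)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paired scores: calibrated reference test vs. online test.
reference = [95, 110, 102, 128, 134, 99, 117, 121]
online    = [92, 114, 100, 131, 130, 104, 119, 118]
print(round(pearson(reference, online), 3))
```

A high correlation would suggest the online test measures roughly the same thing as the reference test, though the self-selected sample of people willing to take both is still not the random sample calibration requires.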
It’s a bit late now, but if you recommend a particular test that’s valid, short, and online, I can try that on the next survey.
Here’s one that closely imitates Raven’s Progressive Matrices and claims to have been calibrated with a sample of 250,000 people: http://www.iqtest.dk/
Here’s another one: http://sifter.org/iqtest/ . I can’t find any mention of where the questions came from or how it’s calibrated, but it’s shorter and doesn’t require Flash.
Neither one asks for an e-mail address or any identifying information. They might be too easy for some on LW, but harder ones tend to cost money. As Viliam_Bur pointed out, any free online test’s validity is questionable, but the first one is basically a direct copy of a “real” test, and neither one has any apparent ulterior motive. Anecdotally, they were both within 10 points of each other and my “real” score.
Incidentally, I keep a list for DNB purposes in http://www.gwern.net/DNB%20FAQ#available-tests focused on matrix-style tests. Doesn’t include that sifter.org one, though.
Wow. Wish I would’ve thought to google ‘iq site:gwern.net’.
Wouldn’t necessarily have helped—Google’s excerpt for the DNB FAQ doesn’t mention the list of tests. Kind of have to know it’s already there.
The first test gave me a score a few points below that on the Mensa site I did a few years ago, but I gave up early on a few questions (I had about 10 minutes left when I finished).
One weird thing about it is that so many questions were based on essentially the same idea, which makes me think it would be possible to have a test with not-much-worse accuracy but half as many questions (unless they intended to test ‘stamina’ as well, but I’d guess that varies more for the same person depending on how much they’ve slept recently than across people).
Some data points:
IQ (age 7, 14, 20) = ~145-150 S-B
SAT (age 16): 1590 = ~150 S-B
iqtest.dk (age 29) = 133 S-B
sifter.org/iqtest (age 29) = 139 S-B (159 euro scale)
I don’t use my spatial skills in my daily work the way I used to use them in my daily school work, and both online tests seem to measure only that.
I found the second test much more difficult—there wasn’t enough information to derive the exact missing item, so you had to choose the answer that could be explained with the fewest/simplest rules. There were some where I disagreed that the “correct” answer had the simpler rule-set. The problem style is also highly learnable, and I question the diagnostic value of “figuring out” that you’re looking at a 3x3 matrix where operations occur as you move around it, but various cells have been obscured to make the problem harder. Not including instructions makes it feel like there’s a secret handshake to get in.
I got 130 on the first one and 156/137 on the second.
Going with the lower result for the purpose of Yvain’s survey. I found the second result a little suspect because a lot of questions on the second test made little sense to me. I would often see 2-3 possible answers that made more or less equal (small) sense to me, and had to take a gut-feeling guess at which one the author meant.
Maybe I just got lucky. Or my gut is a better thinker than I suspected.
Got 135 on the first test. Got 139 on the Stanford-Binet/USA scale (stdev 16) in the second. This seems about right.
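(For reference, converting between a stdev-16 Stanford-Binet-style scale and the more common stdev-15 scale is just a rescaling of the deviation from 100. A sketch, not tied to either site’s actual scoring:)

```python
def convert_iq(score, sd_from, sd_to):
    """Rescale a deviation IQ from one standard-deviation scale to another."""
    return 100 + (score - 100) * sd_to / sd_from

# 139 on an sd-16 scale is about 136.6 on the usual sd-15 scale:
print(round(convert_iq(139, 16, 15), 1))
```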
But since the second one was polite enough to tell me which answers I got wrong, I have to call bullshit on it: some of the “correct” answers it claimed made no sense, and seemed more wrong and illogical than the ones I had placed.
I got 107 on the first test (which seems implausibly low), and 138 on the second (which seems reasonable).
I tried the second one after reading this and had similar results: 118 on the first one (implausibly low); 137 (stdev16) on the second one (sounds about right).
Though if I was taking this more seriously I’d probably have to weigh the facts that my kids were being more distracting when I took the first one, and I ate flaxseed shortly before taking the second one.
I took the first one under reasonably good conditions, and the second under about the same conditions a little while afterwards.
The first one seemed like a test of endurance as much as anything—it was as though my ability to focus was running out on the last ten questions or so, and possibly as though it would have been somewhat easier if I’d been in better physical condition.
General question about that sort of puzzle—how much can effort help with them? Can they be solved reliably given more time (and probably a chance to write down theories and guesses), or does inspiration have to strike fairly quickly?
Interesting question. On the first test, I went through many of them quickly—some of them obviously pattern-matched to the same kind of a puzzle—but also solved a number by staring at them for a few minutes, refusing to give in to my brain’s “I don’t see any patterns, this doesn’t make any frakking sense, can we do something else now?”. I’m certain given 10 or 20 more minutes I’d have done better. And come out with a headache, probably.
My eyes were hurting after the first test, and this continued (less intensely, I think) into the second, even though reading on the monitor isn’t generally a problem for me. There may also be sensory issues involved in scores—I was running into trouble anyway, but having to distinguish between very dark gray squares and black squares in one of the later puzzles didn’t help. If I had more of a different sort of intelligence, I would have thought of fiddling with my monitor settings.
I’m inclined to think that practice/information could help a lot with the puzzles—having a repertoire of possible patterns is going to make solutions easier than trying to find patterns cold.
Possibly as a result of not being entirely pleased at that 107 score, I’m doubting the whole premise of IQ testing—that it’s important to find out what can’t be improved about people’s minds.
Part of this is the arrogance problem (how complete is your knowledge of the possibility of improvement, anyway?), and the other part is wondering whether all those resources could be better put into learning how to improve what can be improved.
The other thing is that I’ve had some recent evidence that the ways the parts of the mind are interconnected aren’t completely obvious. I’ve been doing some psychological work on fading out self-hatred, and the results have been: being less frightened about what I post (I decided before taking the IQ tests to post my scores, but there was still a bit of a pang); easier and faster typing (not tested, but I do seem somewhat more apt to write at greater length, which seems to be the result of feeling less need to over-monitor so that typing can be a low-level habit); less akrasia (still pretty bad, but the desire to do things is happening more often); and the ability to walk downstairs more easily (I have some old knee injuries which can be ameliorated by better coordination, but I haven’t been working on coordination).
On this type of test, I can generally solve all but about four of the puzzles almost immediately, with a few seconds of thought. I skip those few, then return to them at the end, and in the minutes that remain I manage to make an educated guess on, say, two of them, while having to leave the other two to complete chance.
Interesting. Did you find the questions in the first test more difficult than the second? I did notice that the first test relies a lot on mental rotation.
I found the last third or so of the questions in the first test much more difficult than almost anything in the second.
There are two ways an IQ test can fail:
a) it can be miscalibrated;
b) it can measure something other than IQ.
If you only want to know your percentile within the LW population, (a) is not a problem, but (b) remains. What if the test does not measure the “general intelligence factor” but something else? It could correlate partly with IQ and partly with something else, e.g. mathematical or verbal skills.
Also, you have a preselection bias: some LWers will fill in the survey, others won’t.
Don’t forget those of us who aren’t native English speakers. I haven’t tried it again recently, but I used to have a 5-10 point difference between an IQ test in French (my native language) and one in English. Word-related questions are of course harder, but even for the rest, I’m not sure whether it’s because it took me longer to process the English (while the test is time-limited), or just that decoding a non-native language uses more brain power (leaving less for solving the problem). But anyway, I score better in my native language than in English, and I answered with my native-language score.
Yes—I’m quoting an IQ test I did as a kid which had a suspiciously high score, I’m pretty confident I’d get a much less spectacular score if I did one today.
Awesome. Definitely don’t do another one then. (Unless you need to diagnose something of course!)