I think something else is going on with the 2 4 6 experiment, as described. Many of the students are making an assumption about the set of potential rules. Specifically, the assumption is that most pairs of rules in this set have the following mutual relationship: most of the instances allowed by one rule are disallowed by the other rule. If that is the case, then the quickest way to test any hypothetical rule is to produce a variety of instances which conform with that rule, to see whether they conform with the hidden rule.
I’ll give you an example. Suppose that we are considering a family of rules, “the third number is an integer polynomial of the first two numbers”. The quickest way to disconfirm a hypothetical rule is to produce instances in accordance with it and test them. If the rule is wrong, then the chances are good that an instance will quickly be discovered that does not match the hidden rule. It is much less efficient to proceed by producing instances not in accordance with it.
I’ll give a specific example. Suppose the hidden rule is c = a + b, and the hypothesized rule being tested is c = a − b. Now pick just one random instance in accordance with the hypothesized rule. I will suppose a = 4, b = 6, so c = −2. So the instance is 4 6 −2. That instance does not match the hidden rule, so the hypothesized rule is immediately disconfirmed. Now try the following: instead of picking a random instance in accordance with the hypothesized rule, pick one not in accordance with it. I’ll pick 4 6 8. This also fails to match the hidden rule, so it fails to tell us whether our hypothesized rule is correct. We see that it was quicker to test an instance that agrees with the hypothetical rule.
Thus we can see that in a certain class of situations, the most efficient way to test a hypothesis is to come up with instances that conform with the hypothesis.
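To make the example concrete, here is a minimal Python sketch of it. The function names, the value range of -10 to 10, and the trial count are my own assumptions for illustration, not part of the original experiment. It estimates how often a single test disconfirms the hypothesis c = a − b when the hidden rule is c = a + b, first using instances that conform to the hypothesis and then using instances that do not.

```python
import random

def hidden_rule(a, b, c):
    # Hidden rule from the example: c = a + b
    return c == a + b

def hypothesis(a, b, c):
    # Hypothesized rule under test: c = a - b
    return c == a - b

def conforming_instance(rng):
    # A random triple that satisfies the hypothesis
    a, b = rng.randint(-10, 10), rng.randint(-10, 10)
    return a, b, a - b

def nonconforming_instance(rng):
    # A random triple that violates the hypothesis (rejection sampling)
    while True:
        a, b, c = (rng.randint(-10, 10) for _ in range(3))
        if not hypothesis(a, b, c):
            return a, b, c

def disconfirmation_rate(make_instance, trials=10_000, seed=0):
    # Fraction of single tests on which hypothesis and hidden rule disagree
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        triple = make_instance(rng)
        if hypothesis(*triple) != hidden_rule(*triple):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print("conforming:", disconfirmation_rate(conforming_instance))
    print("non-conforming:", disconfirmation_rate(nonconforming_instance))
```

Under these assumptions a conforming instance exposes the mismatch whenever b is not 0, which is 20 cases out of 21, while a non-conforming instance does so only in the rare case where it happens to satisfy c = a + b. That gap holds only for rule families like this one, where two rules rarely allow the same instance.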
Now you can fault people for having made this assumption. But if you do, then it is still a different error from the one described. If the assumption about the kind of problem faced had been correct, then the approach (testing instances that agree with the hypothesis) would have been a good one. The error, if any, lies not in the approach per se but in the assumption.
Finally, I do not think one can rightly fault people for making that assumption. For, it is inevitable that very large and completely untested assumptions must be made in order to come to a conclusion at all. For, infinitely many rules are consistent with the evidence no matter how many instances you test. The only way ever to whittle this infinity of rules consistent with all the evidence down to one concluded rule is to make very large assumptions. The assumption that I have described may simply be the assumption which they made (and they had to make some assumption).
Furthermore, whatever assumptions people make (and they must make some, because of the nature of the problem), a clever scientist can learn what assumptions people tend to make and then violate those assumptions. So no matter what people do, someone can come along, construct an experiment in which those assumptions are violated, and then say “gotcha” when the majority of his test subjects come to the wrong conclusions (because of the assumptions they were making which were violated by the experiment).
The problem is not that they are trying examples which confirm their hypothesis; it’s that they are trying only those examples which confirm it.
The article focuses on testing examples which don’t work because people don’t do this enough. Searching for positive examples is (as you argue) a necessary part of testing a hypothesis, and people seem to have no problem applying this. What people fail to do is to search for the negative as well.
Both positive and negative examples are, I’d say, equally important, but people’s focus is completely imbalanced.
Another serious problem is that the students must make the necessary assumption that the rule be simple. In the context of school, simple is generally “most trivial to figure out”.
This is a necessary assumption because there could be rules that would not be possible to determine by guessing. For example, you’d have to spend the lifetime of the universe guessing triplets to correctly identify that the rule is “Ascending integers except sequences containing the 22nd Busy Beaver number”, and then you still wouldn’t know if there’s some other rider.
If it was said, “It will require several more guesses to figure out the rule, but not more than a couple dozen, and the sequences you have don’t fully tell you what the rule is”, the exercise would be a lot more sane. At worst, the only mistake the students made was assuming that the exercise was supposed to be too simple. Which is like asking them to be mind readers: I’m thinking of a problem; on a scale of 1-10, please guess how difficult it is to solve.
In the situation you described, it would be necessary to test values that did and didn’t match the hypothesis, which ends up working an awful lot like adjusting away from an anchor.
Is there a way of solving the 2 4 6 problem without coming up with a hypothesis too early?
The problem is not that they come up with a hypothesis too early; it’s that they stop too early, without testing examples that are not supposed to work. In most cases people are given as many opportunities to test as they’d like, yet they are confident in their answer after only testing one or two cases (all of which came up positive).
The trick is that you should come up with one or more hypotheses as soon as you can (maybe without announcing them), but test both cases which do confirm them and cases which don’t, and be prepared to change your hypothesis if you are proven wrong.
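A small Python sketch of that advice, under assumptions of my own: the hidden rule is taken to be "strictly ascending", the first guess is "each number is 2 more than the previous one", and the test triples are picked by hand. The function names are hypothetical.

```python
def hidden_rule(triple):
    # Assumption for illustration: the experimenter's rule is "strictly ascending"
    a, b, c = triple
    return a < b < c

def hypothesis(triple):
    # A typical first guess: each number is 2 more than the previous one
    a, b, c = triple
    return b == a + 2 and c == b + 2

def disagreements(triples):
    # Triples on which the hypothesis and the hidden rule give different answers
    return [t for t in triples if hypothesis(t) != hidden_rule(t)]

# Cases the hypothesis says should work
positive_tests = [(2, 4, 6), (8, 10, 12), (1, 3, 5)]
# Cases the hypothesis says should NOT work
negative_tests = [(1, 2, 3), (6, 4, 2), (5, 5, 5)]

if __name__ == "__main__":
    print(disagreements(positive_tests))
    print(disagreements(negative_tests))
```

Every positive test comes back "yes", so they never reveal a problem. Only the negative test 1 2 3, which the hypothesis forbids but the hidden rule allows, shows that the guess is too narrow.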
If it requires a round-trip of human speech through a professor (and thus the requisition of the attention of the entire class) then you can hardly say they are given as many opportunities to test as they’d like. A person of functioning social intelligence certainly has no more than 20 such round-trips available consecutively, and less conservatively even 4 might be pushing it for many.
Give them a computer program to interact with and then you can say they have as many opportunities to test as they’d like.
Following what Constant has pointed out, I am wondering if there is, in fact, a way to solve the 2 4 6 problem without first guessing, and then adjusting your guess.
Come up with several hypotheses in parallel, perhaps?
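One way to picture the "several hypotheses in parallel" idea is the Python sketch below. Everything in it is an assumption made for illustration: the four candidate rules, the hidden rule ("strictly ascending"), the search range 1 to 7, and the function names. At each round it picks the triple on which the surviving candidates disagree most evenly, asks the hidden rule, and drops every candidate that predicted the wrong answer.

```python
from itertools import product

# Hypothetical candidate rules, chosen only for illustration
CANDIDATES = {
    "step of 2": lambda a, b, c: b - a == 2 and c - b == 2,
    "even numbers": lambda a, b, c: a % 2 == 0 and b % 2 == 0 and c % 2 == 0,
    "equal steps": lambda a, b, c: b - a == c - b,
    "ascending": lambda a, b, c: a < b < c,
}

def hidden_rule(a, b, c):
    # Assumption: the experimenter's rule is "strictly ascending"
    return a < b < c

def most_divisive_triple(candidates, values=range(1, 8)):
    # Pick the triple whose predicted answers split the candidates most evenly
    def imbalance(t):
        yes = sum(rule(*t) for rule in candidates.values())
        return abs(2 * yes - len(candidates))
    return min(product(values, repeat=3), key=imbalance)

def eliminate(candidates, max_rounds=20):
    # Repeatedly test a divisive triple and keep only candidates that got it right
    candidates = dict(candidates)
    for _ in range(max_rounds):
        if len(candidates) <= 1:
            break
        t = most_divisive_triple(candidates)
        answer = hidden_rule(*t)
        survivors = {n: r for n, r in candidates.items() if r(*t) == answer}
        if len(survivors) == len(candidates):
            break  # no triple in the search range separates the remaining rules
        candidates = survivors
    return sorted(candidates)

if __name__ == "__main__":
    print(eliminate(CANDIDATES))
```

Because the chosen triples are ones the candidates disagree about, some tests are expected to fail under each hypothesis, so negative cases get tried automatically. The sketch can only succeed if the true rule is among the candidates, which is Constant's point about unavoidable assumptions.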
Sooo many double posts! This new interface is buggy as @#$!