Note that Google result counts on the first page of a search are approximate, not exact figures. On smaller result sets the actual count (as obtained by getting to the last page of the search results) can be close to the estimate, or half of it, or even (that I’ve seen) a hundredth of it. I wouldn’t conclude much of anything from the ratio of estimates with such large error bars.
Those aren’t errors. If you repeat both searches with duplicates included, and go to the last page of results, you will find that Google is returning exactly 1000 for both. This is because Google never returns more than 1000, regardless of how many hits there are.
Comparing the estimates is the correct operation.
Do you have the empirical data to back up your unqualified assertions?
Try comparing Google’s estimates to actual hit counts (as reported by going to the last page), with and without “similar results” included, for searches returning fewer than 1000 hits.
Here is one experimental result: estimated count 585, actual with similar results excluded 177, actual with similar results included 224.
I gave some: Google never returns more than 1000 hits. Therefore estimates orders of magnitude above 1000 (as in the case at hand) cannot be tested by looking at the actual number of hits returned: the two numbers have nothing to do with each other.
I do not know how accurate the estimates are, but a factor of several seems to be about right, as in the test you just made. I have also seen anomalies such as a search for X giving an estimate lower than for a search of X and Y, but never by orders of magnitude, that I’ve noticed.
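For what it's worth, here is the arithmetic behind "a factor of several" for the one test quoted above, as a small Python sketch. The function and variable names are my own, and a single data point obviously says nothing about how the estimates behave in general.

```python
def overestimate_factor(estimated: int, actual: int) -> float:
    """Return how many times larger the front-page estimate is than the last-page count."""
    if actual <= 0:
        raise ValueError("actual count must be positive")
    return estimated / actual


# Figures from the single experiment reported in this thread.
estimated = 585
actual_excluding_similar = 177
actual_including_similar = 224

factor_excluding = overestimate_factor(estimated, actual_excluding_similar)
factor_including = overestimate_factor(estimated, actual_including_similar)

print(f"similar results excluded: {factor_excluding:.2f}x")  # 3.31x
print(f"similar results included: {factor_including:.2f}x")  # 2.61x
```

So in that one case the estimate is high by roughly a factor of 2.6 to 3.3, depending on whether similar results are included.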
That’s worth knowing. Is there a source for non-obvious things about Google searches?
Interesting.
How about the totals according to the last page, excluding “similar results”? That gives 899 for Manipulate Men and 893 for Manipulate Women. That ratio is pretty close to 1:1.
And the totals were way off from the front page estimates, by orders of magnitude. Maybe this reflects a lot of excluded similar results?
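A quick sketch of the last-page comparison, using the 899 and 893 totals quoted above. The 1000-result cap is taken from the claim made earlier in this thread, not from anything I have verified, and the names are made up for illustration.

```python
# Cap asserted earlier in the thread; treated here as an assumption.
RESULT_CAP = 1000


def is_below_cap(count: int, cap: int = RESULT_CAP) -> bool:
    """A last-page total only tells us something if it did not hit the cap."""
    return count < cap


# Last-page totals with "similar results" excluded, as quoted above.
manipulate_men = 899
manipulate_women = 893

ratio = manipulate_men / manipulate_women
both_below_cap = is_below_cap(manipulate_men) and is_below_cap(manipulate_women)

print(f"ratio: {ratio:.3f}")              # 1.007
print(f"both below cap: {both_below_cap}")  # True
```

Both totals come in under the cap, so they are not simply the 1000-result ceiling, though that does not by itself say how many similar results were filtered out.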