I assume they’re referring to the top right quadrant of the graph being totally empty while the top left quadrant has two events. But those two events are a pretty slender reed to rest their analysis on.
What looks more interesting to me is an apparent downward trend in the biggest crater size from Chicxulub onwards. I tried downloading the Earth Impact Database data referenced in the paper so I could zoom in on these more recent impacts, but the dataset’s only available as machine-unreadable HTML tables with various ad hoc notations. (This leads me to wonder how Ćirković, Sandberg & Bostrom turned these clunky tables of numbers into an unambiguous scatterplot, and makes me even more nervous about those two data points on which their analysis hinges.)
Ask and ye shall receive (JSON)
Whee! Thanks. I won’t ask how the sausage was made; I’ll just plunge right into the graphs.
[N.B.: this is the third set of plots I’ve made, having overhauled them twice in response to errors pointed out by the child comments. The second round of plots, which erroneously used log-log scales, are at this link and this link.]
Here’s my version of the Ćirković, Sandberg & Bostrom plot.
There appears to be even less evidence of anthropic shadow than in their original graph; there’s now an impact in the upper right quadrant, and if anything more mid-sized craters are recorded more recently. But the latter could be a reporting bias because there are better records for newer impacts. So let’s zoom in on that denser period:
Still no sign of anthropic shadowing. What about zooming in on the post-Chicxulub period I originally wondered about?
I think there’s actually mild evidence of anthropic shadowing at this scale, although visually that’s mostly suggested by the three biggest craters. Even ignoring those, though, there does seem to be a sort of downward wedge in crater sizes.
And lastly, a link to a bonus plot of the most recent period, the last 3 million years, during which Homo was evolving. I think there’s no visible evidence of anthropic shadowing on that plot, which doesn’t surprise me much because there’re so few opportunities for anthropic shadowing to appear on such a short time scale.
I think these corrected plots bring me back to where I was when I first read ĆS&B’s paper: open to the idea of anthropic shadowing, but seeing only a faint sign of it in the impact crater data.
(The R code [for the old plots].)
The log scale creates that downward-sloping pattern as an artifact—it appears even if the craters are purely random (uniformly distributed across time).
Simplified example: suppose that we treat crater diameter as binary, with a 10km or more crater counting as a “big crater” and anything smaller getting ignored. If we get one “big crater” every 2 Myr, on average, then we’d expect the right half of the x-axis to be blank; the rightmost datapoint would be around the 1 Myr mark. Between 1-10 Myr we’d expect to see a few dots (4-5 big impacts). To the left of the 10 Myr mark, the dots would get denser and denser; there would be hundreds of them between 100 & 1000 Myr.
If we instead chose a smaller cutoff for what counts as a “big crater” (say, a size of crater that turns up once every 0.2 Myr, which is perhaps a 2km diameter), then the pattern would look the same, but shifted over to the right (by one tick mark, in that case).
In the two-dimensional log-log graph, that pattern (of increasing density to the left of the graph, petering out at different x-values depending on what size crater you’re looking for) translates into the downward slope that we see here.
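To make the artifact concrete, here is a minimal Python sketch of the simplified example above. Everything in it is an assumption for illustration: the function names, the 1000 Myr span, the decade bin edges, and the use of a fixed impact count in place of a proper Poisson process. It counts uniformly random impact ages per decade of the log time axis, for the two rates used in the example (one per 2 Myr and one per 0.2 Myr).

```python
import random

# Decade bin edges on the (log) age axis, in Myr. Hypothetical choice.
DECADE_EDGES = [0.1, 1.0, 10.0, 100.0, 1000.0]

def expected_counts(rate_per_myr):
    """Expected number of impacts in each decade if ages are uniform in time."""
    return [rate_per_myr * (hi - lo)
            for lo, hi in zip(DECADE_EDGES, DECADE_EDGES[1:])]

def simulated_counts(rate_per_myr, span_myr=1000.0, seed=0):
    """Draw uniformly random ages and count how many land in each decade."""
    rng = random.Random(seed)
    n = int(rate_per_myr * span_myr)  # fixed count, a simplification
    counts = [0] * (len(DECADE_EDGES) - 1)
    for _ in range(n):
        age = rng.uniform(0.0, span_myr)
        for i, (lo, hi) in enumerate(zip(DECADE_EDGES, DECADE_EDGES[1:])):
            if lo <= age < hi:
                counts[i] += 1
                break
    return counts

if __name__ == "__main__":
    for rate in (0.5, 5.0):  # one per 2 Myr, one per 0.2 Myr
        print(rate, expected_counts(rate), simulated_counts(rate))
```

Each decade is ten times wider in real time than the one to its right, so it holds about ten times as many dots even though nothing about the impact rate has changed. Raising the rate tenfold just slides the same pattern one decade toward the present.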
Good point. Come to think of it, that’s probably why ĆS&B used a linear scale for the time axis in the first place.
I think the 10^7 Myr one is an error, seeing as the Earth is less than 10^4 Myr old.
You’re absolutely right, can’t believe I missed that. That datum’s the Dhala crater in India, which has its age listed as “> 1700 < 2100” Ma. Dropping the angle brackets and the spaces gave “17002100”. The three other data points over 2400 Ma (the “Jebel Waqf as Suwwan”, “Tunnunik (Prince Albert)”, and “Amelia Creek” craters) got misdated for similar reasons. Better fix the data file and replot. One sec....
Edit: fixed those four points, but I now notice there’s a Santa Fe crater that’s mis-sized at 613 km instead of “6-13”.
Edit 2: OK, think that’s rectified as well. If anyone’s wondering how I handled these ranges, I did the lazy thing and took the arithmetic mean of the upper & lower bounds.
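For anyone who wants to see the failure mode and the lazy fix side by side, here is a small Python sketch. It is a reconstruction under assumptions, not the cleaning step actually used: `naive_clean` and `range_midpoint` are hypothetical names, and the regular expressions are just one way to reproduce the behaviour described above.

```python
import re

def naive_clean(raw):
    """Failure mode: strip everything except digits and dots, then parse.

    A bounded age or a size range collapses into one huge number.
    """
    return float(re.sub(r"[^0-9.]", "", raw))

def range_midpoint(raw):
    """Lazy fix: take the arithmetic mean of the numbers in the string."""
    nums = [float(x) for x in re.findall(r"\d+(?:\.\d+)?", raw)]
    if not nums:
        raise ValueError(f"no number found in {raw!r}")
    return sum(nums) / len(nums)

if __name__ == "__main__":
    for raw in ("> 1700 < 2100", "6-13"):
        print(repr(raw), naive_clean(raw), range_midpoint(raw))
```

The midpoint treats a bound pair and a hyphenated range the same way, which is crude but at least keeps the point on the right part of the plot.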
The whole point of HTML is to be machine-readable. I find that if I copy a table from Apple’s web browser and paste it into Apple’s spreadsheet, it works fine. Maybe you need better machines.
Upvoted for the impressive feat of cramming Apple fanboyism and subtly flawed linguistic pedantry into three dozen words of double-barrelled flamebait. Downvoted for posting double-barrelled flamebait comprising Apple fanboyism and subtly flawed linguistic pedantry.
That HTML’s meant to be machine-readable doesn’t mean it is. (Both webpages and browsers can fail to meet HTML standards.) But that is itself a counter-nitpick. The bigger problem with your nitpick is that you’re reading “machine-readable” in a blinkered way. As Wikipedia says, “machine-readable” data can refer to “human-readable data that is marked up so that it can also be read by machines (examples; microformats, RDFa) or data file formats intended principally for machines (RDF, XML, JSON)”. You seem to have only the first meaning in mind while I was thinking of the second. I wanted the data in a format that a computer could immediately interpret and turn into a scatterplot, such as a text file of two tab-delimited columns of numbers. Now, you do make the point that it’s fairly straightforward to get the data as two columns of numbers...
...but telling me this is unhelpful. For one thing, I already have two spreadsheet programs on my computer that can do the same thing, Gnumeric and OpenOffice.org Calc. For another, why should I have to change my usual workflow (paste data into vim, clean it, save as plain text, load into R, make graphs) when the data could’ve been made available in a simple format in the first place? Lastly, once you’ve got the numbers into your spreadsheet, what happens when you try plotting them? Do you still “find that [...] it works fine”? I suspect not, because the age numbers include values like “< 0.001”, “0.004 ± 0.001”, “0.0054± 0.0015”, “~ 0.0066”, “> 0.05”, “>5, <36”, and “3-95”. Being able to circumvent the clunkiness of presenting data in HTML tables doesn’t eliminate the problem of the ad hoc human-readable-but-machine-unreadable notations.
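To show what dealing with those notations involves, here is a rough Python sketch that turns each of the age strings quoted above into a number plus a label saying what kind of value it is. The function name `parse_age`, the label strings, and the choice to use the midpoint for ranges are all assumptions made for illustration; none of it comes from the database or from the paper.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"

def parse_age(raw):
    """Return (age, kind) for one ad hoc age string.

    kind is one of: 'measured', 'range', 'upper_bound',
    'lower_bound', 'approx', 'exact'. These labels are hypothetical.
    """
    s = raw.strip()
    nums = [float(x) for x in re.findall(NUMBER, s)]
    if not nums or len(nums) > 2:
        raise ValueError(f"unrecognised age notation: {raw!r}")
    if "±" in s:
        # central value ± uncertainty: keep the central value
        return nums[0], "measured"
    if len(nums) == 2:
        # '3-95' or '>5, <36': use the midpoint of the two bounds
        return (nums[0] + nums[1]) / 2, "range"
    if s.startswith("<"):
        return nums[0], "upper_bound"
    if s.startswith(">"):
        return nums[0], "lower_bound"
    if s.startswith("~"):
        return nums[0], "approx"
    return nums[0], "exact"

if __name__ == "__main__":
    samples = ["< 0.001", "0.004 ± 0.001", "0.0054± 0.0015",
               "~ 0.0066", "> 0.05", ">5, <36", "3-95"]
    for raw in samples:
        print(repr(raw), parse_age(raw))
```

Keeping the label alongside the number means a one-sided bound can be plotted differently from a measured age, or dropped, instead of being silently treated as exact.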
By that standard, an Excel spreadsheet is “machine-unreadable.”
Debatable.
This should be one of the LW Rationality Quotes for next month.
Against the rules.
Interpret it as a wish that it could be, then?
ISTR we once had a rationality quotes thread with the reverse rule, but I can’t find it now!
This will teach me to skim next time. Thanks.