Upvoted for the impressive feat of cramming Apple fanboyism and subtly flawed linguistic pedantry into three dozen words of double-barrelled flamebait. Downvoted for posting double-barrelled flamebait comprising Apple fanboyism and subtly flawed linguistic pedantry.
machine-unreadable HTML tables
The whole point of HTML is to be machine-readable.
That HTML’s meant to be machine-readable doesn’t mean it is. (Both webpages and browsers can fail to meet HTML standards.) But that is itself a counter-nitpick. The bigger problem with your nitpick is that you’re reading “machine-readable” in a blinkered way. As Wikipedia says, “machine-readable” data can refer to “human-readable data that is marked up so that it can also be read by machines (examples; microformats, RDFa) or data file formats intended principally for machines (RDF, XML, JSON)”. You seem to have only the first meaning in mind while I was thinking of the second. I wanted the data in a format that a computer could immediately interpret and turn into a scatterplot, such as a text file of two tab-delimited columns of numbers. Now, you do make the point that it’s fairly straightforward to get the data as two columns of numbers...
I find that if I copy a table from Apple’s web browser and paste it into Apple’s spreadsheet, it works fine. Maybe you need better machines.
...but telling me this is unhelpful. For one thing, I already have two spreadsheet programs on my computer that can do the same thing, Gnumeric and OpenOffice.org Calc. For another, why should I have to change my usual workflow (paste data into vim, clean it, save as plain text, load into R, make graphs) when the data could’ve been made available in a simple format in the first place? Lastly, once you’ve got the numbers into your spreadsheet, what happens when you try plotting them? Do you still “find that [...] it works fine”? I suspect not, because the age numbers include values like “< 0.001”, “0.004 ± 0.001”, “0.0054± 0.0015”, “~ 0.0066”, “> 0.05”, “>5, <36”, and “3-95”. Being able to circumvent the clunkiness of presenting data in HTML tables doesn’t eliminate the problem of the ad hoc human-readable-but-machine-unreadable notations.
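To make that last point concrete, here is a rough Python sketch of the cleanup those notations force on anyone who wants to plot them. The function name `parse_age` and the interpretation rules are my own assumptions, not anything the table's authors specify: “±” is read as symmetric bounds, “~” as a point value, and a lone “<” or “>” as a one-sided bound with the other side left as `None`.

```python
import re

# Unsigned decimal number, e.g. "5" or "0.0054".
NUM = r"\d+(?:\.\d+)?"

def parse_age(cell):
    """Turn one hand-written age cell into (low, high) bounds.

    None means that side is unbounded. The rules are assumptions
    made for illustration, not the table authors' definitions.
    """
    # Drop all whitespace so "0.0054± 0.0015" and "0.004 ± 0.001" look alike.
    s = re.sub(r"\s+", "", cell)

    # "0.004±0.001": central value with a symmetric error.
    m = re.fullmatch(rf"({NUM})±({NUM})", s)
    if m:
        mid, err = float(m.group(1)), float(m.group(2))
        return (mid - err, mid + err)

    # "3-95": explicit range.
    m = re.fullmatch(rf"({NUM})-({NUM})", s)
    if m:
        return (float(m.group(1)), float(m.group(2)))

    # "<0.001", ">0.05", ">5,<36": one or two one-sided bounds.
    if re.fullmatch(rf"[<>]{NUM}(?:,[<>]{NUM})?", s):
        low = high = None
        for op, num in re.findall(rf"([<>])({NUM})", s):
            if op == ">":
                low = float(num)
            else:
                high = float(num)
        return (low, high)

    # "~0.0066" or a plain number: treated as a point value.
    m = re.fullmatch(rf"~?({NUM})", s)
    if m:
        value = float(m.group(1))
        return (value, value)

    raise ValueError(f"unrecognised age notation: {cell!r}")
```

Even this handles only the seven patterns quoted above, and every branch bakes in a guess about what the notation was meant to convey, which is exactly the work a plain two-column file would have spared.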
By that standard, an Excel spreadsheet is “machine-unreadable.”
Debatable.