Here is a kernel density estimate of the “true” distribution, with bootstrapped pointwise 95% confidence bands from 999 resamples:
It looks plausibly bimodal, though one might want a formal hypothesis test of unimodality versus multimodality. Unfortunately, as you noted, we cannot distinguish the hypothesis that the bimodality is due to rounding (at 500 M) from the hypothesis that it is due to ambiguity between Europe and the EU; this holds even if a hypothesis test rejects a unimodal model. If anyone is still interested in testing for unimodality, I suggest Efron and Tibshirani’s bootstrap approach.
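For anyone curious what that bootstrap test looks like in practice, here is a rough Python sketch of the idea (Silverman's critical-bandwidth test, which is the version Efron and Tibshirani describe). It is a simplified version — in particular it skips the variance-rescaling refinement of the smoothed bootstrap — so treat it as an illustration, not a reference implementation:

```python
import numpy as np
from scipy.stats import gaussian_kde

def count_modes(data, bw, grid_size=512):
    """Count local maxima of a Gaussian KDE with bandwidth factor `bw`."""
    pad = 3 * bw * data.std()
    grid = np.linspace(data.min() - pad, data.max() + pad, grid_size)
    dens = gaussian_kde(data, bw_method=bw)(grid)
    # a mode is a grid point strictly higher than both neighbours
    return int(np.sum((dens[1:-1] > dens[:-2]) & (dens[1:-1] > dens[2:])))

def critical_bandwidth(data, lo=1e-3, hi=2.0, tol=1e-3):
    """Smallest bandwidth factor at which the KDE becomes unimodal (bisection);
    more smoothing can only merge modes, never create them."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid) <= 1:
            hi = mid
        else:
            lo = mid
    return hi

def unimodality_pvalue(data, n_boot=200, seed=0):
    """Silverman-style test: bootstrap from the data smoothed at the critical
    bandwidth, and see how often the resamples still look multimodal there."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    h = critical_bandwidth(data)
    sigma = data.std()
    exceed = 0
    for _ in range(n_boot):
        boot = rng.choice(data, size=data.size, replace=True)
        smooth = boot + h * sigma * rng.standard_normal(data.size)
        if count_modes(smooth, h) > 1:
            exceed += 1
    return exceed / n_boot
```

A large critical bandwidth (relative to what bootstrap resamples need) is evidence against unimodality; the p-value is the fraction of resamples that are still multimodal at that bandwidth.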
Edit: Updated the plot. I switched from adaptive bandwidth to fixed bandwidth (because it seems to achieve higher efficiency), so parts of what I wrote below are no longer relevant—I’ve put these parts in square brackets.
Plot notes: [The adaptive bandwidth was achieved with Mathematica’s built-in “Adaptive” option for SmoothKernelDistribution, which is horribly documented; I think it uses the same algorithm as ‘akj’ in R’s quantreg package.] A Gaussian kernel was used with the bandwidth set according to Silverman’s rule-of-thumb [and the sensitivity (‘alpha’ in akj’s documentation) set to 0.5]. The bootstrap confidence intervals are “biased and unaccelerated” because I don’t (yet) understand how bias-corrected and accelerated bootstrap confidence intervals work. Tick marks on the x-axis represent the actual data with a slight jitter added to each point.
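In case the band construction itself is of interest: it is just a pointwise percentile bootstrap over KDE resamples. A minimal Python sketch (using scipy's Gaussian KDE with Silverman's rule-of-thumb bandwidth, which matches the fixed-bandwidth setup above):

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_band(data, n_boot=999, level=0.95, grid_size=256, seed=0):
    """Pointwise percentile-bootstrap confidence band for a Gaussian KDE
    with Silverman's rule-of-thumb bandwidth."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    grid = np.linspace(data.min(), data.max(), grid_size)
    estimate = gaussian_kde(data, bw_method="silverman")(grid)
    boot = np.empty((n_boot, grid_size))
    for i in range(n_boot):
        resample = rng.choice(data, size=data.size, replace=True)
        boot[i] = gaussian_kde(resample, bw_method="silverman")(grid)
    alpha = 1.0 - level
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)],
                           axis=0)
    return grid, estimate, lo, hi
```

These are the plain (uncorrected) percentile intervals, i.e. exactly the “biased and unaccelerated” bands described above; BCa would adjust the percentile cutoffs rather than the resampling itself.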
This sounds plausible, but from looking at the data, I don’t think this is happening in our sample. In particular, if this were the case, then we would expect the SAT scores of those who did not submit IQ data to differ from the scores of those who did. I ran a two-sample Anderson–Darling test on each of the following pairs of distributions:
SAT out of 2400 for those who submitted IQ data (n = 89) vs SAT out of 2400 for those who did not submit IQ data (n = 230)
SAT out of 1600 for those who submitted IQ data (n = 155) vs SAT out of 1600 for those who did not submit IQ data (n = 217)
The p-values came out as 0.477 and 0.436 respectively, so the Anderson–Darling test finds no evidence of a difference between the two distributions in either pair at any conventional significance level (though, of course, failure to reject is not proof that the distributions are identical).
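For anyone who wants to rerun this kind of comparison, scipy ships a k-sample Anderson–Darling test. The data below are hypothetical stand-ins for the two SAT subsamples, not the actual survey data; note also that scipy floors its returned p-value at 0.1% and caps it at 25%, so a large p-value is simply reported as 0.25:

```python
import numpy as np
from scipy.stats import anderson_ksamp

rng = np.random.default_rng(0)
# hypothetical stand-ins for the two subsamples (same underlying distribution)
submitted = rng.normal(2100, 150, 89)       # SAT of those who submitted IQ data
not_submitted = rng.normal(2100, 150, 230)  # SAT of those who did not

result = anderson_ksamp([submitted, not_submitted])
print(result.statistic, result.significance_level)
```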
As I did for my last plot, I’ve once again computed for each distribution a kernel density estimate with bootstrapped confidence bands from 999 resamples. From visual inspection, I tend to agree that there is no clear difference between the distributions. The plots should be self-explanatory:
(More details about these plots are available in my previous comment.)
Edit: Updated plots. The kernel density estimates are now fixed-bandwidth using the Sheather–Jones method for bandwidth selection. The density near the right edge is bias-corrected using an ad hoc fix described by whuber on stats.SE.
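The edge-bias fix is essentially reflection about the boundary. A generic sketch of that idea in Python — not necessarily identical in detail to whuber’s version, and with Silverman’s bandwidth standing in for Sheather–Jones, which scipy does not provide:

```python
import numpy as np
from scipy.stats import gaussian_kde

def reflected_kde(data, upper, lower=None, grid_size=256):
    """KDE on (-inf, upper] with reflection about the upper boundary,
    which removes the leading-order bias at the edge."""
    data = np.asarray(data, dtype=float)
    if lower is None:
        lower = data.min() - 3 * data.std()
    # mirror the data about `upper`, fit a KDE to the augmented sample
    augmented = np.concatenate([data, 2 * upper - data])
    kde = gaussian_kde(augmented, bw_method="silverman")
    grid = np.linspace(lower, upper, grid_size)
    # mirroring doubled the mass, so double the density back on the
    # original side and truncate at the boundary
    return grid, 2 * kde(grid)
```

Without the reflection, a plain KDE spills mass past the maximum attainable score and so underestimates the density just inside the boundary; the mirrored copy puts that mass back.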