As for ‘beyond a reasonable doubt’: a couple of years ago I found a study of court cases suggesting that, in general, this standard of proof is met when people are 75% confident. That is, on average, if a jury thinks there’s at least a 25% chance someone is innocent, they don’t convict.
This is depressing.
So assuming that juries accurately estimate the probability of guilt, up to a quarter of convicts could be innocent?
I’m quite sure that’s not what was intended by the original writers of the phrase.
I wrote a short article here, which suggests that this level of belief may have resulted from societal evolution towards an optimal justice system, regardless of the surface reasoning given for that level.
societal evolution towards an optimal justice system
I don’t see anything significant that would push society towards an optimal justice system, and the justice system does not seem as optimal as it would be if there were such a force.
Why do you think society is evolving towards an optimal justice system?
Perhaps I should have said ‘local-optimal’, and it was more of a retro-diction about societal evolution than a prediction about future justice systems. (Although such a prediction could still be made on this line of reasoning.)
If it’s possible to convict and imprison a person on too little evidence—say, the sworn testimony of a single person—then there will be a significant number of innocent people who end up in prison, which removes them from the pool of productive workers, which reduces the overall economy of the society, which reduces the ability of that economy to support its military, which reduces the odds that that society can defend itself in wars, which increases the odds that that society will be replaced, by force, by a different society.
If a conviction requires more evidence than is ‘optimal’, then a larger number of criminals will remain in society performing criminal acts, which reduces the overall economy, with similar overall effects.
As mentioned in the article I linked to, with the standard for conviction set at ‘beyond a reasonable doubt’, which in practice works out to 75% certainty, Laplace’s Sunrise Formula implies that a conviction reflects at least 50% confidence that the convict would commit a criminal act in the future. Convicting at that level of confidence could thus be considered ‘optimal’ for preventing future criminal acts, thereby maximizing the economy, and thereby maximizing the odds that the society will be able to defend itself and continue existing.
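The arithmetic behind that claim can be sketched as follows; this is my reading of it, treating the single alleged offense as one observed trial under the rule of succession:

```python
# Laplace's rule of succession: after s successes in n observed
# trials, the estimated probability of success on the next trial
# is (s + 1) / (n + 2).
def rule_of_succession(s, n):
    return (s + 1) / (n + 2)

# One offense observed in one "trial" gives a 2/3 chance of another.
p_repeat_if_guilty = rule_of_succession(1, 1)  # 2/3

# A jury convicting at 75% certainty therefore implies, from the
# guilty branch alone, at least 0.75 * 2/3 = 0.5 confidence that
# the convict would commit a future criminal act.
p_guilty = 0.75
p_future_crime = p_guilty * p_repeat_if_guilty
print(p_future_crime)
```

The innocent branch can only push the estimate higher, which is why the 50% figure reads as a lower bound.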
The surface reasoning for how a justice system works may not have much to do with why that system continues to exist, any more than kissing being ‘romantic’ has much to do with why people started kissing at all.
From what I understand, the study you’re referencing says that a given juror must on average be at least 25% certain of innocence for a defendant not to be declared guilty. However, sending someone to jail requires that the entire jury agree that they’re guilty. Some of the jurors will need higher certainty, and there will be variations in how they interpret the evidence. As such, it sounds like someone would actually need a lot more evidence to be declared guilty.
They can be tried repeatedly until they get all guilty or all innocent, but I don’t think that actually happens much. The courts are overtaxed as it is.
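That unanimity effect can be illustrated with a toy simulation (entirely my own construction, not from the study): give each of twelve jurors an independently noisy read on the same evidence, and require all of them to pass the 75% threshold.

```python
import random

random.seed(0)

# Toy model: each juror's certainty is the true evidence strength
# plus independent Gaussian noise, clipped to [0, 1]; conviction
# requires every juror to exceed the 75% threshold.
def conviction_rate(evidence, jurors=12, noise=0.1,
                    threshold=0.75, trials=10_000):
    hits = 0
    for _ in range(trials):
        if all(min(max(evidence + random.gauss(0, noise), 0.0), 1.0)
               > threshold for _ in range(jurors)):
            hits += 1
    return hits / trials

# Evidence sitting exactly at the 75% level almost never convicts
# under unanimity; conviction only becomes likely once the evidence
# is well above the threshold.
print(conviction_rate(0.75))
print(conviction_rate(0.95))
```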
For one—IIRC, the 25% figure was based on the jury as a whole, rather than individual jurors.
And for another—I would actually be more surprised if my ‘evolution of society’ reasoning turned out to be true than if I were mistaken. It’s mainly an attempted backfill, trying to figure out how the jury system could end up using ‘beyond a reasonable doubt’ and get as close as possible to convicting ‘people who are more likely than not to break the law in the future’.
(I can’t seem to find the study in question with some quick Googling. I have a memory that the 75% certainty figure was for low-level crimes, such as theft, and that when the sentence was higher, such as for capital crimes, then juries tend to need to have a higher level of confidence before they’ll issue a guilty verdict; but that might have been a different study.)
That’s a misleading way to put it. If they’re right 75% of the time, they’ll think their probability of being right is over 90%. Trying to convince people that it’s 75% in practice will make them uncomfortable, so it might be better to choose examples where the outcomes aren’t important.
1:1 is 50%. If I’m 50% certain of something, then finding out whether or not it’s true would give me one bit of information. 0 bits is no information. That would be finding out something you were 100% certain of, with 1:0 odds.
I wrote the chart some time ago, so I don’t have the reference handy anymore; but IIRC, the translation was that a 15⁄16 (93.75%) probability corresponded to 4 bits of information; a 7⁄8 (87.5%) probability to 3 bits; 3⁄4 (75%) to 2 bits; and 1⁄2 (50%) to 1 bit.
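For what it’s worth, one mapping that reproduces those numbers (my own reconstruction, since I no longer have the reference) is to read the bits as the surprisal of the less likely outcome, -log2(1 - p):

```python
import math

# Assumed reconstruction of the chart: a probability p maps to the
# surprisal of the less likely outcome, -log2(1 - p) bits.
for p in [15/16, 7/8, 3/4, 1/2]:
    print(f"p = {p:.4f} -> {-math.log2(1 - p):.0f} bits")
# p = 0.9375 -> 4 bits
# p = 0.8750 -> 3 bits
# p = 0.7500 -> 2 bits
# p = 0.5000 -> 1 bits
```

Note that this reading gives infinity rather than 0 bits at 100% certainty, so it may not be exactly what the chart intended.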
I might argue that that’s not what “Beyond reasonable doubt” means, but more concretely: Shouldn’t 1:1 be 0 bits?
Maybe I’m misunderstanding what the bits map to here.
You’re saying that it evolves towards a local optimum because the nations that do it badly fall?
I think No Evolutions for Corporations or Nanodevices applies here.
Could you offer any more specific thoughts on what those examples might be?
Ah, that makes sense. You’ll want to make that clear. :-)
I can do that. :)