I’m in the process of writing up an Internet Draft for a file format, part of which involves assigning logarithmic confidence values measured in decibans. There’s still a good ways to go before it’ll be in good enough shape to even have a chance at being considered for an RFC; part of that is describing “decibans”, and how to use them, to people who’ve never heard of the things before. I’d like to get a good introductory text for that nailed down before submitting the next revision; and since LW is where I first learned of decibans, I’d like to solicit as much constructive criticism as I can get here.
Here’s my current draft of the text to replace the relevant section of the current revision:
3.1. Parameter: CONFIDENCE
Namespace:
Parameter name: CONFIDENCE
Purpose: To specify the confidence of the authority that the information of the given parameter is accurate, measured in decibans.
Value type: A single number, usually an integer.
Description:
A CONFIDENCE value of 0 decibans indicates odds of 1:1 (i.e., 50%) that the information is correct. A change of 10 decibans changes the odds by a factor of 10: 10 decibans means 10:1 odds in favor (~91%), 20 decibans means 100:1 odds in favor (~99%), and −10 decibans means 10:1 odds against (~9%). A change of 1 deciban is roughly equivalent to changing the odds by a factor of 5:4.
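As a rough illustration of the conversion described above, here is a minimal Python sketch. The function names are hypothetical and not part of the draft; it only assumes the usual definition of a deciban as ten times the base-10 logarithm of the odds in favor.

```python
import math

def decibans_to_probability(decibans: float) -> float:
    """Convert a CONFIDENCE value in decibans to a probability."""
    odds = 10 ** (decibans / 10)  # odds in favor, expressed as a single ratio
    return odds / (1 + odds)

def probability_to_decibans(probability: float) -> float:
    """Convert a probability strictly between 0 and 1 to decibans."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return 10 * math.log10(probability / (1 - probability))

print(round(decibans_to_probability(10), 3))   # 10:1 odds in favor -> 0.909
print(round(probability_to_decibans(0.5), 3))  # 1:1 odds -> 0.0
```

Probabilities of exactly 0 or 1 have no finite deciban value, which is why the sketch rejects them rather than returning infinity.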
Here is a table covering enough integer deciban values to allow for easy reference.
Decibans / Level of belief / Rough Odds / notes
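The numeric columns of a reference table like this one can be generated mechanically. The following is only a sketch under that assumption: the helper name `deciban_row` and the choice of sample values are mine, and the "Level of belief" and "notes" columns are left out because they are judgment calls rather than arithmetic.

```python
def deciban_row(decibans: int) -> str:
    """Format one table row: decibans, rough odds, and probability."""
    odds = 10 ** (decibans / 10)       # odds in favor as a single ratio
    probability = odds / (1 + odds)
    if odds >= 1:
        rough_odds = f"{odds:.3g}:1"       # e.g. 100:1 in favor
    else:
        rough_odds = f"1:{1 / odds:.3g}"   # e.g. 1:10, i.e. 10:1 against
    return f"{decibans:>8}  {rough_odds:>10}  {probability * 100:.4f}%"

print(f"{'Decibans':>8}  {'Rough odds':>10}  Probability")
for decibans in (-10, -1, 0, 1, 3, 10, 20, 30, 50, 100):
    print(deciban_row(decibans))
```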
Given human factors, hand-typed data can rarely justify a CONFIDENCE of more than 50 decibans (odds of 100,000:1) that every single bit is accurate. Without getting into the details of recursion, and given that at least one out of roughly ten billion people is thoroughly disconnected from reality, it’s very difficult for a human to have more than 100 decibans of confidence (odds of ten billion to one) in anything, even that H2O is a useful description of water or that the subjective reality they are experiencing is connected to the same subjective reality experienced by other humans.
If a user wishes to manually generate signed vCards but does not have much experience with mathematics, one option for getting rough estimates of appropriate CONFIDENCE values is Laplace’s Sunrise Formula, also known as the Rule of Succession. This takes two pieces of input: the number of trials, that is, the number of times something might have gone one way or the other; and the number of successes, that is, the number of times it went one particular way. For example, it might be used with the number of times an email has been received from a particular address, and the number of times that email has been from the owner of that address instead of viral spam. The formula produces an estimate of the probability that the next trial will go the same way, by calculating:
(successes + 1) / (trials + 2)
For the example, if one has received 1,000 emails from an address, out of which 1 was spam, then the formula says that the future probability will be on the order of (999+1) / (1,000+2) = 1,000⁄1,002. This corresponds to odds of 1,000:2, or 500:1, so a CONFIDENCE value based on this data would be about 27 decibans, that is, on the order of 30. Barring other forms of evidence, it would take around 10,000 such emails, with almost none of them spam, before a claim of 40 decibans of confidence would be warranted.
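The worked example can be checked with a short Python sketch. The function names are hypothetical; the only assumption is the Rule of Succession in its usual form, (successes + 1) / (trials + 2), converted to decibans via the odds in favor.

```python
import math

def rule_of_succession(successes: int, trials: int) -> float:
    """Estimated probability that the next trial is a success."""
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return (successes + 1) / (trials + 2)

def succession_decibans(successes: int, trials: int) -> float:
    """CONFIDENCE in decibans implied by the Rule of Succession."""
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    failures = trials - successes
    # Odds in favor are (successes + 1) : (failures + 1); computing them
    # directly avoids the rounding error of going through 1 - p.
    return 10 * math.log10((successes + 1) / (failures + 1))

print(round(succession_decibans(999, 1000), 1))      # 1 spam in 1,000 emails -> 27.0
print(round(succession_decibans(10000, 10000), 1))   # no spam in 10,000 emails -> 40.0
```

Note how sensitive the result is to failures: a single spam message among 10,000 emails already pulls the estimate down to about 37 decibans.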
Note that this is an extremely simple formula, and there are many better ones that can provide more accurate results and take into account more kinds of evidence. Any user who knows of a better method to generate CONFIDENCE values should use it; the Sunrise Formula is provided as a basis for users who have nothing else to create estimates with.
More sophisticated Bayesian analyses can be used to create ad-hoc certificate authority systems. This would involve one vCard in which an authority describes itself and signs the card; another vCard in which that authority describes a second entity and its key, using the CONFIDENCE parameter to give the level of belief it derived from its Bayesian analysis; and a third card in which that second entity describes a third, offering its own CONFIDENCE level. A user with access to all the vCards could then determine, based on the user’s own level of trust in the root authority, how much to trust the other entities. This trust in the root authority could be generated either with the Sunrise Formula or with an analysis of web-of-trust data.