Ah, I see what you’re saying. Given a non-uniform password generation algorithm (like 99.9% “password”, 0.1% random 1000-bit ASCII string), the average (Shannon) entropy of the scheme is misleading: it works out to roughly 1 bit, nearly all of it contributed by the rare random string, but 99.9% of the time the password is “password”, whose surprisal is only about 0.0014 bits. In this case the worst-case figure is more useful than the average.
I think you’re right, but this doesn’t seem very useful, since any password generation scheme where this is relevant is a bad idea: if you switched to a uniform distribution, you could have either stronger passwords at the same length or equally strong passwords at a shorter length.
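To check figures for a mixture like the 99.9% / 0.1% one, here is a minimal sketch that computes both the Shannon entropy and the min-entropy (the surprisal of the single most likely password). The function name `mixture_entropies` and the closed-form treatment of the uniform part are my own assumptions for illustration, not something from the post.

```python
import math

def mixture_entropies(p_common: float, random_bits: int) -> tuple[float, float]:
    """Return (shannon_bits, min_entropy_bits) for a two-part scheme:
    one fixed password with probability p_common (0 < p_common < 1),
    otherwise a uniformly random string of random_bits bits."""
    p_rare = 1.0 - p_common
    # Each random string has probability p_rare / 2**random_bits, so its
    # surprisal is random_bits - log2(p_rare). Working in log space avoids
    # underflow for large random_bits.
    rare_surprisal = random_bits - math.log2(p_rare)
    common_surprisal = -math.log2(p_common)
    # Shannon entropy is the probability-weighted average surprisal.
    shannon = p_common * common_surprisal + p_rare * rare_surprisal
    # Min-entropy is the surprisal of the most likely single password.
    min_entropy = min(common_surprisal, rare_surprisal)
    return shannon, min_entropy

shannon, min_entropy = mixture_entropies(0.999, 1000)
print(f"Shannon entropy: {shannon:.4f} bits")
print(f"Min-entropy:     {min_entropy:.4f} bits")
```

Changing `p_common` shows how quickly the two measures diverge once one password dominates the distribution.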
> any password generation scheme where this is relevant is a bad idea
I disagree; as the post mentions, sometimes considerations such as memorability come into play. One example might be choosing random English sentences as passwords. You might do that by choosing a random parse tree of a certain size. But some English sentences have ambiguous parses, i.e. there are multiple ways to generate them. You *could* try to sample in a way that avoids this problem, but doing that carefully gets pretty tricky. If you instead find the “most ambiguous sentence” in your set, you can get a lower bound on the safety of your scheme.
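As a rough sketch of that lower bound: if parse trees are sampled uniformly, a sentence's probability is its number of derivations divided by the total, and the most ambiguous sentence sets the min-entropy. The toy derivation list and the name `sentence_min_entropy` are hypothetical; a real scheme would enumerate or bound the derivations of an actual grammar.

```python
import math
from collections import Counter

def sentence_min_entropy(derivations: list[tuple[str, str]]) -> float:
    """derivations: (parse_tree_label, resulting_sentence) pairs,
    assumed to be sampled uniformly at random."""
    counts = Counter(sentence for _tree, sentence in derivations)
    # Derivation count of the most ambiguous sentence.
    worst = max(counts.values())
    return -math.log2(worst / len(derivations))

# Hypothetical toy set: four parse trees, two of which yield the same sentence.
toy = [
    ("tree-1", "I saw the man with the telescope"),  # PP attaches to the verb
    ("tree-2", "I saw the man with the telescope"),  # PP attaches to the noun
    ("tree-3", "the dog chased the cat"),
    ("tree-4", "a bird sang"),
]

naive_bits = math.log2(len(toy))       # counts trees, ignores ambiguity
safe_bits = sentence_min_entropy(toy)  # lower bound from the worst sentence
print(f"naive: {naive_bits} bits, lower bound: {safe_bits} bits")
```

The gap between the two numbers is the cost of not deduplicating ambiguous sentences when sampling.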
One feature of good password schemes is that you have some way to recover lost passwords.
Let’s say I chose as my password: “Tithptacsp,aiwwitcwwcaelp”. That password has plenty of entropy if you just look at it.
Then I might write down in some note “Entropy isn’t sufficient to measure − 3-1”. This lets me go back to this post, look up the third paragraph, and take its first sentence. There I find the sentence “This is typically how people think about choosing strong passwords, and it works well in the case where we’re choosing among equally likely passwords” and can reconstruct “Tithptacsp,aiwwitcwwcaelp”. Sentences are also generally good mnemonics.
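The reconstruction step can be sketched in a few lines. The rule used here (first letter of each word, commas kept, everything else dropped) is inferred from the example password, and the helper name `acronym_password` is made up for illustration.

```python
import re

def acronym_password(sentence: str) -> str:
    """First letter of each word, keeping commas and dropping
    spaces and other punctuation."""
    # A token is either a word (letters plus straight or curly apostrophes)
    # or a single comma.
    tokens = re.findall(r"[A-Za-z'’]+|,", sentence)
    return "".join(token[0] for token in tokens)

sentence = (
    "This is typically how people think about choosing strong passwords, "
    "and it works well in the case where we’re choosing among equally "
    "likely passwords"
)
print(acronym_password(sentence))
```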
However, if someone knew that I’m using LessWrong as my source for passwords this way, they could just go through all the sentences in LessWrong posts, which radically reduces the entropy.