An adversary would need to try about 2H(W)−1 passwords in order to guess yours, on average.
This statement is wrong. If we take your example of choosing the password “password” with probability 19/20 and a uniform random 1000-character lowercase with probability 1/20 (which represents about 24700 different passwords), then the average number of tries needed to guess your password is about 12024700−1, indeed it will take the attacker 24700−1 tries on average in the unlucky situation where you have not chosen the obvious password. This is extremely far from 2235−1.
Um, huh? There are 2^1000 1000-character passwords, not 2^4700. Where is the 4700 coming from?
(added after realizing the above was super wrong): Whoops, that’s what I get for looking at comments first thing in the morning. log2(26^1000) = 4700 Still, the following bit stands:
I’d also like to register that, in my opinion, if it turns out that your comment is wrong and not my original statement, it’s really bad manners to have said it so confidently.
(I’m now not sure if you made an error or if I did, though)
Update: I think you’re actually totally right. The entropy gives a lower bound for the average, not the average itself. I’ll update the post shortly.
As for why I was confident, well, this was a clear (in my eye at least) example of Jensen’s inequality: we are comparing the mean of the log to the log of the mean. And, if you see this inequality come up often, you know that the inequality is always strict, except for a constant distribution. That’s how I knew.
As a last note, I must praise the fact that you left your original comments (while editing them of course) instead of removing them. I respect you a lot for that.
This statement is wrong. If we take your example of choosing the password “password” with probability 19/20 and a uniform random 1000-character lowercase with probability 1/20 (which represents about 24700 different passwords), then the average number of tries needed to guess your password is about 12024700−1, indeed it will take the attacker 24700−1 tries on average in the unlucky situation where you have not chosen the obvious password. This is extremely far from 2235−1.
Um, huh? There are 2^1000 1000-character passwords, not 2^4700. Where is the 4700 coming from?(added after realizing the above was super wrong): Whoops, that’s what I get for looking at comments first thing in the morning. log2(26^1000) = 4700 Still, the following bit stands:
I’d also like to register that, in my opinion, if it turns out that your comment is wrong and not my original statement, it’s really bad manners to have said it so confidently.
(I’m now not sure if you made an error or if I did, though)
Update: I think you’re actually totally right. The entropy gives a lower bound for the average, not the average itself. I’ll update the post shortly.
I apologize, my wording was indeed rude.
As for why I was confident, well, this was a clear (in my eye at least) example of Jensen’s inequality: we are comparing the mean of the log to the log of the mean. And, if you see this inequality come up often, you know that the inequality is always strict, except for a constant distribution. That’s how I knew.
As a last note, I must praise the fact that you left your original comments (while editing them of course) instead of removing them. I respect you a lot for that.