Vigeneres are potentially crackable, but they are very hard.
Unless your text is very short or your key very long, they are much easier than you apparently think. I have written programs that do a pretty good job of it.
(As MrMind says, if your key is as long as your text then this is a one-time pad, which indeed is uncrackable provided you have a reliable channel for sharing the one-time pad and don’t make operational mistakes like reusing the key. But that’s not a very common situation.)
I finally did get round to taking a look at this, and it isn’t yielding to the usual methods. In fact, it’s sufficiently unyielding that I am wondering whether it’s really the result of Vigenere-ciphering something that resembles ordinary English text with a key of reasonable length. I find the following:
The text is 146 characters long. I would expect that, with “ordinary” plaintext, to be enough to identify a key of at most about 7 characters without too much pain. I actually tried lengths up to 10.
For no key length k does the distribution of { (a-b) mod 26 : a,b are characters at the same position mod k } much resemble the distribution of differences mod 26 of randomly chosen letters with typical English letter frequencies.
For no key length k does the result of the following procedure produce something that looks as if most of the shifts are correct:
For each offset, count letter frequencies, and choose a shift that maximizes the sum of log(typical_prob) where typical_prob are those typical English letter frequencies again.
JRMayne, if you happen to be reading this and happen to have either a record or a memory of the message and how you encrypted it: can you confirm that (1) this really is the output of a Vigenere cipher, (2) the plaintext isn’t (statistically) outrageously unlike ordinary English, and (3) the key length is at most, let’s say, 10?
(If the key is much longer than that, there isn’t enough information for these simple methods to work with. Perhaps, e.g., doing a more expensive search and looking at digraph frequencies might be worth considering, but I’m too lazy.)
Gah. I don’t remember the solution or the key. And I just last week had a computer crash (replaced it), so, I’ve got lots of nothing. Sorry.
I am sure of (1) and (2). I don’t remember (3), and it’s very possible it’s longer than 10 (though probably not much longer.) But I don’t remember clearly. That’s the best I can do.
Offsets 10, 8, 25, 6, 6, 14, 21, 11, 12, 13, 11, 12, 15, 21, 1, 7 (that’s a key length of 16) yield the following decrypted plaintext: “failurearrivedquicklyatgjmshandsastheirprogramdefeatedmedepressinglyquicklyimustconcedethepointiwasoverconfidentandunderinformedandgjmfreakingwins”.
The actual process involved manual intervention, though I think a smarter program could have done it automatically. It went like this:
Consider key lengths up to 20.
Instead of picking the single “best” shift at each offset within the key, consider all shifts that come within 1 unit of best. (Meaning a factor of e in estimated probability.)
Display all the decryptions resulting from these options, unless there are more than 300 in which case display 50 random ones.
This produced a big long list of candidates. None of them was right, but some of them looked more credible than others. Keylength 16 seemed to let us do the best, so I focused on that and picked one of the better-looking candidates, which looked like this:
which has lots of plausible bits in it. It looks like it should maybe start with “failure”, a plausible enough word in this context, so I tweaked the fifth offset (yes! lots of other things got better too), and continued tweaking one by one as likely words and word-fragments jumped out at me. After a few tweaks I had the decryption above.
A program clever enough to do this by itself would need to have some inkling about digraphs or trigraphs or something, and then it would need some kind of iterative search procedure rather than just picking best offsets independently for each letter. Not terribly difficult, but it seemed easier to do something simpler-minded.
[Edited to add: I think the sequence of giveaway words for the tweaks was: “failure”, “quickly” #2, “overconfident”, done.]
I won’t actually try to crack this since writing the program would take more time than I’m inclined to spend here, but the principle is simple: Do frequency analysis based on assuming the key is a particular length. If taking every 5th character gives you a letter frequency that resembles English, but taking every 4th or 6th character does not, then the key is 5 letters long. Proceed from there.
Yup. (One handy trick is to take message—shift(message,k) mod 26 for candidate values of k, and see which has a histogram that looks most like that of (English text minus other English text). But it’s probably actually better to find good candidate shifts for each evenly-spaced subset of the text and look for ones that match up.)
JRMayne’s latest sample is still pretty short. It might require human cleverness to solve. If I have time tomorrow (and unless someone else has done it first) I may have a go.
Unless your text is very short or your key very long, they are much easier than you apparently think. I have written programs that do a pretty good job of it.
(As MrMind says, if your key is as long as your text then this is a one-time pad, which indeed is uncrackable provided you have a reliable channel for sharing the one-time pad and don’t make operational mistakes like reusing the key. But that’s not a very common situation.)
I agree that I made my key too long so it’s a one-time pad. You’re right.
“Much easier”? With or without word lengths?OK, no obligation, but I didn’t make this too brutal:
Vsjfodjpfexjpipnyulfsmyvxzhvlsclqkubyuwefbvflrcxvwbnyprtrrefpxrbdymskgnryynwxzrmsgowypjivrectssbmst ipqwrcauwojmmqfeohpjgwauccrdwqfeadykgsnzwylvbdk.
(Again, no obligation to play, and no inference should be taken against gjm’s hypothesis if they decline.)
I finally did get round to taking a look at this, and it isn’t yielding to the usual methods. In fact, it’s sufficiently unyielding that I am wondering whether it’s really the result of Vigenere-ciphering something that resembles ordinary English text with a key of reasonable length. I find the following:
The text is 146 characters long. I would expect that, with “ordinary” plaintext, to be enough to identify a key of at most about 7 characters without too much pain. I actually tried lengths up to 10.
For no key length k does the distribution of { (a-b) mod 26 : a,b are characters at the same position mod k } much resemble the distribution of differences mod 26 of randomly chosen letters with typical English letter frequencies.
For no key length k does the result of the following procedure produce something that looks as if most of the shifts are correct:
For each offset, count letter frequencies, and choose a shift that maximizes the sum of log(typical_prob) where typical_prob are those typical English letter frequencies again.
JRMayne, if you happen to be reading this and happen to have either a record or a memory of the message and how you encrypted it: can you confirm that (1) this really is the output of a Vigenere cipher, (2) the plaintext isn’t (statistically) outrageously unlike ordinary English, and (3) the key length is at most, let’s say, 10?
(If the key is much longer than that, there isn’t enough information for these simple methods to work with. Perhaps, e.g., doing a more expensive search and looking at digraph frequencies might be worth considering, but I’m too lazy.)
Gah. I don’t remember the solution or the key. And I just last week had a computer crash (replaced it), so, I’ve got lots of nothing. Sorry.
I am sure of (1) and (2). I don’t remember (3), and it’s very possible it’s longer than 10 (though probably not much longer.) But I don’t remember clearly. That’s the best I can do.
Drat.
Offsets 10, 8, 25, 6, 6, 14, 21, 11, 12, 13, 11, 12, 15, 21, 1, 7 (that’s a key length of 16) yield the following decrypted plaintext: “failurearrivedquicklyatgjmshandsastheirprogramdefeatedmedepressinglyquicklyimustconcedethepointiwasoverconfidentandunderinformedandgjmfreakingwins”.
The actual process involved manual intervention, though I think a smarter program could have done it automatically. It went like this:
Consider key lengths up to 20.
Instead of picking the single “best” shift at each offset within the key, consider all shifts that come within 1 unit of best. (Meaning a factor of e in estimated probability.)
Display all the decryptions resulting from these options, unless there are more than 300 in which case display 50 random ones.
This produced a big long list of candidates. None of them was right, but some of them looked more credible than others. Keylength 16 seemed to let us do the best, so I focused on that and picked one of the better-looking candidates, which looked like this:
which has lots of plausible bits in it. It looks like it should maybe start with “failure”, a plausible enough word in this context, so I tweaked the fifth offset (yes! lots of other things got better too), and continued tweaking one by one as likely words and word-fragments jumped out at me. After a few tweaks I had the decryption above.
A program clever enough to do this by itself would need to have some inkling about digraphs or trigraphs or something, and then it would need some kind of iterative search procedure rather than just picking best offsets independently for each letter. Not terribly difficult, but it seemed easier to do something simpler-minded.
[Edited to add: I think the sequence of giveaway words for the tweaks was: “failure”, “quickly” #2, “overconfident”, done.]
Fantastic!
Well done, sir.
So I think the conclusion is that solving Vigenere ciphers is easier than you thought it was but harder than you thought I thought it was :-).
I won’t actually try to crack this since writing the program would take more time than I’m inclined to spend here, but the principle is simple: Do frequency analysis based on assuming the key is a particular length. If taking every 5th character gives you a letter frequency that resembles English, but taking every 4th or 6th character does not, then the key is 5 letters long. Proceed from there.
Yup. (One handy trick is to take message—shift(message,k) mod 26 for candidate values of k, and see which has a histogram that looks most like that of (English text minus other English text). But it’s probably actually better to find good candidate shifts for each evenly-spaced subset of the text and look for ones that match up.)
JRMayne’s latest sample is still pretty short. It might require human cleverness to solve. If I have time tomorrow (and unless someone else has done it first) I may have a go.
(No, didn’t have time. Sorry. Probably no time on Monday either...)