Cryptanalysis as Epistemology? (paging cryptonerds)
Short version: Why can’t cryptanalysis methods be carried over to science, which looks like a trivial problem by comparison, since nature doesn’t intelligently remove patterns from our observations? Or are these methods already carried over?
Long version: Okay, I was going to spell this all out with a lot of text, but it started ballooning, so I’m just going to put it in chart form.
Here is what I see as the mapping from cryptography to science (or epistemology in general). I want to know what goes in the “???” spot, and why it hasn’t been used for any natural phenomenon less complex than the most complex broken cipher. (Sorry, couldn’t figure out how to center it.)
EDIT: Removed “(cipher known)” requirement on 2nd- and 3rd-to-last rows because the scientific analog can be searching for either natural laws or constants.
Could one possible answer be that we still don’t understand the plaintext behind the encryption? In other words, attempting various decryption seems to rely on the fact that you know what you’re looking for underneath. You know what the end result should be.
The end result should be bits that are ordered and make up 1) some sort of filesystem and 2) files on that filesystem that are coherent enough to be interpreted by something as intelligible data (e.g., something that a text editor, image viewer, or audio player could then interpret for you into words, sights, or sounds).
But… what if I did the following:
create one partition, /dev/sda1
cryptsetup -c aes-xts-plain -y -s 512 luksFormat /dev/sda1
open and mount the encrypted partition
dd if=/dev/urandom of=/mnt/random.bin bs=1M (i.e., fill the mounted filesystem with random data)
unmount and shutdown
Now you use whatever tools you want to “decrypt” my drive. Imagine you are able to identify my algorithm and even my actual password. Groovy; but now you go about studying the data and begin vigorously banging your head against a wall. Why doesn’t it make sense even though it’s decrypted??
Perhaps bad analogy. My suspicion is that unless we both successfully decrypt and know what we were looking for, “decryption” wouldn’t help make the underlying information any more intelligible, even though nature doesn’t destroy its patterns.
After a re-read just to make sure no “stupid-alarms” went off, maybe I don’t understand the “???”, either. I guess the other examples struck me as cases where we do know what we’re looking for—some correlation, a difference between a control and experimental group while varying just one thing between them, watching objects to see what they do and how to describe their behavior using set values.
I took the last box to be “how do we run advanced decryption on the remaining unknowns,” and I’m saying that knowing what we would be looking for, were we to successfully decrypt the information, might be a prerequisite for those techniques being useful. And this seems to be the case in the rest of the examples.
I wonder if it is even meaningful to ask what the ‘plain text’ might look like.
Suppose we somehow managed to decrypt the universe and obtain its Theory of Everything. Then we notice that there’s a pattern in the Theory which can be easily compressed (maybe a repeated constant bitstring). Wouldn’t we then appeal to Kolmogorov and say the ‘real’ Theory of Everything is the shorter program which generates our larger Theory with its repeating constants?
And so on, for all the patterns we find, until we wind up with a Theory which looks like a substring of Chaitin’s Omega—all completely random bits? At which point, why do we think we actually decrypted the random observations into this random Theory-string?
But maybe I’m just saying something someone else has said here or is implied by the general line of thought?
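As a minimal illustration of the compression step (my own sketch; zlib is only a crude, computable stand-in for Kolmogorov complexity, which is uncomputable): a “Theory” padded with a repeated constant compresses to a small fraction of its length, while an equal-length random string barely compresses at all.

```python
import os
import zlib

def compressed_fraction(data: bytes) -> float:
    """Compressed size over original size; zlib is a crude, computable
    stand-in for Kolmogorov complexity (which is uncomputable)."""
    return len(zlib.compress(data, 9)) / len(data)

# A "Theory" containing a repeated constant bitstring...
theory_with_pattern = b"G=6.674e-11;" * 1000
# ...versus a same-length string with no exploitable pattern.
incompressible = os.urandom(len(theory_with_pattern))

print(compressed_fraction(theory_with_pattern))  # tiny: the repetition compresses away
print(compressed_fraction(incompressible))       # ~1.0: already looks like Omega-style noise
```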
I said more about this in the longer version of the article, but regarding your analogy, that’s related to the distinction between the known-cipher and unknown-cipher case. Your analogy, to the extent that only you understand what remains after applying the standardized decryption, actually has one more cipher at the end that the attackers don’t know, which puts it into the “unknown cipher” case. (EDIT: Yikes! That previously said “unknown cipher-text” and has since been corrected.)
And this indeed is a harder kind of cryptanalysis, and I expect cryptanalytic methods to be less productive in the corresponding areas of science—to the extent that the attacker must first learn the cipher, they have a lot more work to do. But if it’s a simple cipher that, say, falls prey to simple frequency analysis, then even that can be broken.
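To make “simple frequency analysis” concrete, here is a minimal sketch (my own toy example, not part of the excerpt below): recover a Caesar shift by scoring each candidate shift against typical English letter frequencies.

```python
from collections import Counter
import string

# Approximate relative frequencies of letters in typical English text.
ENGLISH_FREQ = dict(zip(string.ascii_lowercase, [
    .082, .015, .028, .043, .127, .022, .020, .061, .070, .002, .008, .040,
    .024, .067, .075, .019, .001, .060, .063, .091, .028, .010, .024, .002,
    .020, .001]))

def caesar_encrypt(text: str, shift: int) -> str:
    return "".join(chr((ord(c) - 97 + shift) % 26 + 97) if c.islower() else c
                   for c in text.lower())

def break_caesar(ciphertext: str) -> int:
    """Score every shift by how English-like the decryption's letter
    frequencies are, and return the best-scoring shift."""
    counts = Counter(c for c in ciphertext if c.islower())
    total = sum(counts.values())
    def score(shift):
        return sum(counts.get(chr((ord(c) - 97 + shift) % 26 + 97), 0) / total
                   * ENGLISH_FREQ[c] for c in string.ascii_lowercase)
    return max(range(26), key=score)

msg = "the patterns left in the ciphertext give the key away to anyone who counts letters carefully"
print(break_caesar(caesar_encrypt(msg, 7)))  # 7: the key falls right out of the letter statistics
```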
For a further explanation, here’s an excerpt of what I had in the longer-longer version:
Now, there are two kinds of cryptanalysis I’ll focus on, and for which I’ll give the mapping to a problem in science:
Known-cipher cryptanalysis: Cryptographers usually focus on this kind of attack, because it’s usually assumed that “the enemy knows the cipher” (but not the key), or at least, they must make the system safe even in the case where the enemy does know the cipher. This is very much like the problem of parameter estimation. …
Unknown-cipher cryptanalysis: This is, of course, tougher than the known-cipher kind, although (for reasons I won’t go into here), cryptographers advise against choosing a cipher based on “the enemy doesn’t know of it!” However, over the history of codes and codebreaking, more general methods were developed that allow you to find regularities in the ciphertext from which you can infer the plaintext, even if you didn’t originally know what cipher was being used.
[end excerpt] With that in mind, I don’t think your analogy carries over as an explanation for why cryptanalysis would fail on scientific problems. Remember, the plaintext being sought is in the form of observations. To the extent that we can infer the plaintext, where the plaintext is yet-to-be-observed data, then we have cryptanalytically solved a scientific problem because we can predict the data, even if we can’t ascribe any deeper meaning to it.
In short:
1) The plaintext refers to the meaningful plaintext, and so your example implicitly adds one more unknown cipher, a case generally ignored by cryptographers because of Kerckhoffs’s principle.
2) In science, once you can consistently predict future data, you have the (analog of the) plaintext.
Maybe. But I don’t know that at that point it’s a question about what cipher was used… just how to make the resultant data useful for, as you said, predicting future data.
You seem to be specifically asking about how to “crack the code”—I’m saying that even if you “cracked the code”, you would still need to understand what’s going on with the underlying plaintext.
Maybe put it this way… assume that the underlying text is a data dump of (x,y) coordinates and to “predict future data”, you need the equation of the best fit line. We very well might be saying the same thing, but here’s how I see things:
Me: You arrive at this record encrypted with a cipher. You decrypt it. But then you still don’t know how to make the data useful because mathematics is not currently advanced enough to produce a curve for the best fit line.
You: You arrive at this record encrypted with a cipher. You decrypt it. The fact that you can’t make it useful means it’s still encrypted.
Does that seem like an accurate representation?
If so, then perhaps it’s not exactly encryption, but “writing an interpretation program that can take what we don’t yet see as a useful data-stream and make it useful.”
Thus, we could look at whatever it is in nature you want to look at, run some program on it, and have an increase in prediction abilities.
Put one last way, imagine that MS Word has not been invented yet but “nature” writes her secrets in a .doc file and then encrypts it. I think we’re disagreeing primarily on whether, once it’s decrypted, the resultant file is still “encrypted” or not.
I get the sense that you’re saying it is, and I’m saying it’s not. According to nature, it’s right there for the taking. According to you, we still need one more level of “decryption”—inventing MS Word.
Yes, I think that’s a fair representation of my position, and let me say a few words in defense of it. Messages (and, in the realm of security, assets in general) always exist within the context of a larger system that gives them significance. For example, to evaluate the effectiveness of a lock on a shed, you must be mindful of the value of the assets in the shed and the cost of breaking the lock.
Likewise with messages: if you’ve “decrypted” a message, but don’t understand its relationship to the real-world assets the message is predicated on, your decryption is not complete. Hence the problem of steganography: you may be able to plainly read that a postcard says “the dolls are on hold because of a problem in the north, but the scarves are in production”, but it’s still employing a species of encryption to the extent that you don’t recognize that “dolls are on hold … in the north” means that the San Francisco harbor lost production capacity.
Likewise with foreign-language plaintexts. If your guys don’t know Japanese, and don’t know how to translate, the Japanese plaintext is still effectively encrypted. And whether or not you want to still claim that it’s not “really” encrypted, your team will have to do the exact same things to extract its meaning that you would do if it were “really encrypted”—i.e., discern the relationship between “ciphertexts” (the Japanese language) and the desired form of plaintext (English).
Btw, some followups: I’ve corrected an error in the previous post that fortunately didn’t cause a misunderstanding.
Also, I’ve argued the same thing you have—that you can improve encryption by adding an additional layer of plaintext uselessness. (One idea I presented in another forum was to talk in l33tsp34k so that you waste a human’s time deciphering the message even if they can get to the plaintext. It bothered me that my idea was dismissed without explanation.)
One of my goals in learning crypto is to know if and when this actually would improve security, and as best as I can tell, it doesn’t—as ciphergoth suggested to me, it makes your cryptosystem weaker where it’s already the weakest and stronger where it’s already strong enough. And it’s the weakest link that matters.
I think you’ve done well at that, and it makes sense. As a result… I’m just not sure I know at present how one might apply these principles to eliminate any forms of encryption and make anything in nature useful for improving prediction!
Could you explain if this was sort of an “aside continuation” re. the leetspeak comment or about encryption in general? In other words, are you saying that it doesn’t help to add an “underlying” layer of additional encryption, or that encryption, in general, doesn’t do anyone much good?
First, just to make clear, those were separate events, far removed in space and time. Ciphergoth’s remark was not directed at my leetspeak idea, but the principle applies just the same.
Encryption does accomplish the goal of raising the cost of accessing your messages to the point of infeasibility, and I wasn’t trying to deny that. To expand on the point about the underlying encryption: generally, the published cipher you use to encrypt your data is far stronger than it needs to be to keep your data safe; any successful method of attacking it would go after the implementation of it, and in particular the people that need to make it work. Hence this comic.
So let’s look at your idea. Another relevant weak point of a cryptosystem is how hard it is for the user not to mess it up, and this critically relies on you being able to access your (full) key. If you have to simultaneously do the work of remembering this secret language, this can mentally exhaust you and increase your rate of error in implementing the protocol—so it adds a weakness to the weakest part of the system (the human) just to strengthen the part that’s already the strongest (the encryption itself).
Does that make sense?
Yup.
Also have seen that comic. So, you were basically saying:
encryption helps, and the actual encryption is the strongest point
the weak points are the people (prone to messing up) and (filling in my own list) things like leaving your computer on and unattended, only locking your screen vs. shutting down, having non-encrypted swap/var/etc, and the like.
further, trying to strengthen the weak points is generally accomplished via a method that further increases susceptibility to human error, whereas the encryption itself was fine as it is.
Sound good?
That’s correct, except your last bullet (to be clearer, if this is what you meant) should say that you can strengthen the weak points by making the system less susceptible to human error.
I was about to say no, but realize that I did write that incorrectly. To re-write, it would have redundantly said:
Does that help? I did have it wrong, thinking that implementing some form of hand-done encryption was an attempt to strengthen the weak points… but such activity would actually be in the “already-strong” category.
You have it correct now. :)
Where did you get this? In modern cryptanalysis, it’s counted as a win for the cryptanalyst if he or she can find any pattern in the ciphertext (i.e., give an algorithm that can distinguish a ciphertext from a random string with probability non-negligibly greater than .5). But even when this is true, it’s usually not the case that one can then go on to infer the full key or plaintext.
I oversimplified there [1], but even with your corrected phrasing of the situation, that suggests some room for cryptanalysis-based improvement in science, because finding any pattern distinguishable from randomness is a win for science as well.
Given that nature does not deliberately or intelligently add the entropy that halfway-decent ciphers add, the patterns in the weaker, lower-entropy set of observable data that nature is limited to should be a much easier problem for cryptanalysts than the ciphers they normally try to attack.
In other words, when it comes to injecting entropy into ciphertexts, nature couldn’t hold a candle to even the easily-broken ciphers. Right?
[1] To be precise, it would have to be something more tautologous like, “For a given cryptanalytic goal, the best cryptanalytic methods can meet that goal for a cipher with the level of complexity in the most complex cipher broken for that goal and all ciphers of lower complexity.”
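To illustrate what “distinguishing a ciphertext from a random string” can look like in the crudest case, here is a minimal sketch (my own toy, nothing like a real attack on a modern cipher): a chi-squared statistic on byte counts separates a weakly “encrypted” English text from uniformly random bytes.

```python
import os

def uniformity_statistic(data: bytes) -> float:
    """Chi-squared statistic of byte counts against the uniform distribution.
    For truly random bytes it hovers around 255; large values mean 'pattern'."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    expected = len(data) / 256
    return sum((c - expected) ** 2 / expected for c in counts)

plaintext = b"nature does not flatten the statistics of what we get to observe " * 64
weak_ciphertext = bytes(b ^ 0x5A for b in plaintext)  # a toy, trivially weak "cipher"
random_bytes = os.urandom(len(plaintext))

print(uniformity_statistic(weak_ciphertext))  # enormous: the lopsided byte counts survive
print(uniformity_statistic(random_bytes))     # roughly 255: indistinguishable from noise
```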
I’d say that science and cryptanalysis both share the ideal of (or can be viewed as) trying to approximate Bayesian updating on a Solomonoff prior. In cryptanalysis, you have the disadvantage that your opponent is intelligent and malicious (from your perspective), but you have the advantage that the encryption process doesn’t have much computing power (since ciphers have to be fast to minimize computing costs). In science, nature is not malicious, but it’s not limited in computing power either. Yes, they are basically similar, but each is specialized to its domain of inquiry, so I doubt science can learn much from cryptanalysis.
I just remembered EY’s post “That Alien Message”, and I don’t think I’m making too dissimilar an argument: In EY’s story, the aliens aren’t as smart as humans, so we can infer the patterns in their messages, predict new ones, and inject new code.
As aliens were to those humans, so is nature to us.
Nature’s also not very intelligent, as elaborated on here and here.
I explained one historical instance of scientists going through the same process as cryptanalysis in my reply to Nancy.
That’s a good distinction.
Cryptanalysis is absolutely dependent on the forward direction being tractable. That is, given a candidate cipher, key and plaintext, we can quickly compute what the resulting ciphertext would be.
Nature does not possess this property. For example, to compute the shape of a single protein molecule from quantum electrodynamics would take longer than the estimated lifespan of the universe.
To deal with this, we have to use an arsenal of tricks: shortcuts and approximations, intermediate observations, partial understanding gathered in simple cases that are more tractable etc. But these tricks depend on nature being impartial. To put it bluntly, there’s no point in approaching it as a problem of cryptanalysis, because if a cryptographic adversary has exponentially more computing power than you have, it’s game over no matter what you do; conversely, most techniques of science don’t work in cryptanalysis because ciphers are designed to make sure they don’t work.
I don’t understand—if you want to compute anything, even “tractable” problems, by modeling their implementation down to the QED level, it takes too long—even the Caesar cipher would (because I’d either have to model my neurons carrying out the operation, and their subatomic interactions, or do the same for a semiconductor).
Yes, nature has exponentially more computing power, but that doesn’t make her resistant to e.g. differential cryptanalysis on at least some problems. (An experiment in which you gradually vary one parameter is just a chosen-plaintext attack using differential analysis).
What matters is whether that computing power is directed in a way that “covers up its tracks”, and thus, to what extent it has obscured the relationship between the input and the output. Just as evolution, being restricted to using small local improvements, cannot refactor a system the way an engineer of human intelligence could, nature cannot direct her resources to design better one-way functions the way human cryptographers can.
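Here is a toy version of the “vary one parameter” point (my own sketch; the hidden function is made up and stands in for nature): chosen inputs that differ in a single coordinate expose the hidden structure through output differences, much like a differential chosen-plaintext attack.

```python
# A hidden "law" we pretend we cannot read; the attacker may only query it.
def hidden_process(pressure: float, temperature: float) -> float:
    return 3.0 * pressure + 0.5 * temperature ** 2

def differential_probe(f, base, index, delta=1e-3):
    """Chosen-'plaintext' attack on nature: bump one input by delta, hold the
    rest fixed, and read structure off the output difference."""
    bumped = list(base)
    bumped[index] += delta
    return (f(*bumped) - f(*base)) / delta

base = (2.0, 10.0)
print(differential_probe(hidden_process, base, 0))  # ~3.0: output is linear in pressure
print(differential_probe(hidden_process, base, 1))  # ~10.0: slope grows with T, so quadratic in T
```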
Yes, what happened there is this: to solve a scientific problem, scientists use some method to detect patterns that is isomorphic to a cryptanalytic attack. Because later ciphers were designed, by human intelligences, to be resistant to known mathematical techniques, the cipher designers, as you mention, destroyed this pattern.
But how does this prove that the problems of science and cryptanalysis are unrelated? It proves just the opposite: the only reason scientific pattern-finding heuristics can’t be carried over to ciphers (contemporary with that science) is because an intelligence designs the ciphers that way. There is no intelligent designer working for nature that can prevent it from working in the opposite direction.
Question to anyone who knows something about cryptanalysis:
I write a message in Klingon to a friend. You intercept it. You’ve never heard of the Klingon language before and have no information about it whatsoever. Is it possible to “decrypt” the message and produce an English translation, assuming that the message is long enough?
In other words, is it possible “in principle” for the Imperial Japanese to have deciphered the messages sent by the Navajo code talkers without having access to a Navajo speaker?
The answer to this is far from clear. Part of the question is more linguistic and philosophical than cryptanalytic. There are historical examples of languages that have been successfully deciphered. The two most famous are probably Egyptian hieroglyphics and Linear B. While for the first we had additional texts (especially the Rosetta Stone) that allowed us to connect it to an existing language, in the case of Linear B there was no similar linking text. Linear B turned out to be a form of Greek, but this wasn’t used at all in the decipherment until the very last stages, when it had become very apparent (as I understand it, most of the researchers at the time thought that it was not a form of Greek). But there would probably be many words in Linear B that we would not be able to translate today if not for the fact that they have recognizable Greek cognates and counterparts.
But the case of Linear B is very different than the hypothetical case of “Klingon”. Actual Klingon is very similar to a variety of human languages. It isn’t obvious that a language belonging to a genuinely different species with absolutely no cultural or physical context would be decipherable in any useful way. Human languages generally share some basic aspects and it isn’t obvious that those basics would be shared by a language from another species.
If one had more than just language one might be able to decipher things based on the connections to physical reality. So for example, one might be able to recognize a version of the periodic table even if it were arranged in a very different fashion (humans have made a large number of different forms, some three-dimensional). And if the text contained material designed to assist in understanding, then the situation might be easier even if the language is radically different from anything humans have encountered before. For example, it might have something like “1 . 2 .. 3 ... 4 .... 5 .....” up to some large number, and then something like “Primes 2 .. 3 ... 5 ..... 7 .......” going out to some distance. Note that here I’ve assumed that primes are a concept another species would even find interesting enough to consider. But many simple sequences would be reasonable starters, such as squares or powers of 2. This doesn’t address the situation you cared about, though, which was the cryptanalytic analogy between language and the universe. Presumably the universe isn’t trying to be deciphered. And the statement of your remark seems to imply that the message isn’t intended to be deciphered.
You mention a direct historical example which suggests that cryptanalysis of unknown languages can be very tough. In World War II, the United States employed so-called “code talkers”, who used obscure, generally Native American, languages as secret codes. The use of Navajo in this fashion is the most famous, although some other languages were used as well. In this case, even when the Japanese knew towards the end of the war what languages were being used, they were unable to crack the codes. However, by the end of the war the codes were not just simple spoken Navajo but had (somewhat weak) cryptography added in, and the combination seems to have been what really created the trouble. Note also that in this case the Japanese had a large amount of physical context for the messages, since they knew that they were military messages and even knew which messages corresponded (very roughly) to which events.
So the bottom line is that there’s evidence both ways, and it might depend a lot on how different an alien language would be from humans and whether or not the text is intended to be deciphered.
By “Klingon” I was literally referring to the artificial language Klingon as invented by humans, but I really meant it as a stand-in for “any natural language you both don’t know and don’t have any reference texts for.”
Oh. Well that renders most of my response irrelevant. Then the answer is “probably yes”. Getting the basics of the grammar won’t take much effort. So one can tentatively identify which words are verbs, nouns, adjectives, etc. Assuming that you aren’t going out of your way to be terse with your speech, there will be a fair bit of redundancy that should, in a large enough text, become obvious. For example, if we’ve identified how one says “and” in the language, or at least the version for nouns, then we might be able to identify the plural form for verbs, or something close to that. Moreover, if we see a word that frequently shows up near the word for “and”, we could tentatively guess that it was a word for two as a cardinal number. Similarly, one might be able to get three as a cardinal number. This gets a handle on your number system.
In the direct context of Navajo which you used as your other example, one also has a correspondence with physical events which if one had that data could potentially help a lot.
So if a time-traveling mischief maker gave the NSA a copy of “The Klingon Hamlet” in 1980 (minus all the English text), would they have been able to “decrypt” it?
That’s a difficult case. It would depend on the resources thrown into it. If it were formatted as a play it wouldn’t take long for someone to notice that, and the five acts would to an English speaker suggest Shakespeare. (This isn’t a hypothesis someone would immediately hit upon but it is the sort of thing that someone would eventually think of.) At that point things will be much easier since we have an effective Rosetta stone. However, note that even with the Rosetta stone, the deciphering of Egyptian hieroglyphs took a very long time even after the main breakthroughs.
If the text were not the text of Hamlet but were a similar random text of the same length, then it is almost certainly not long enough to be decipherable in any reasonable span of time. I don’t however have any idea how much longer the text would need to be.
Seconded, I’ve wondered the same thing myself and voiced similar concerns here.
It has always perplexed me how WWII US cryptographers managed to get anything done, when the plaintext still looks like gibberish—further complicated by a novel encoding between a non-western script and EM signals.
My partial answer is this: you would not be able to accomplish anything unless you know something about the real-world referents of the code. So you’ll have to do something like a known-plaintext attack. For example:
Send 3 planes to Island A, listen to enemy’s chatter.
Send 5 planes to Island A, listen to enemy’s chatter.
Send 3 planes to island B, listen … .
Send 3 boats to island B, listen…
From the differences between the chatter at those times, you can figure out the symbols for island A, island B, plane, boat, 3, and 5.
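A toy version of that differencing step might look like this (the code-word symbols are invented; only the set arithmetic matters):

```python
# Intercepted "chatter" for each controlled action, as sets of unknown symbols.
chatter = {
    ("3 planes", "island A"): {"X7", "Q2", "P9", "A1"},
    ("5 planes", "island A"): {"X7", "Q5", "P9", "A1"},
    ("3 planes", "island B"): {"X7", "Q2", "P9", "A8"},
    ("3 boats",  "island B"): {"X7", "Q2", "B4", "A8"},
}

# Symbols present in every intercept are boilerplate, not content.
boilerplate = set.intersection(*chatter.values())

# The symbol for "3": common to all the "3 ..." intercepts, absent from "5 planes".
three = (set.intersection(*(v for k, v in chatter.items() if k[0].startswith("3")))
         - chatter[("5 planes", "island A")] - boilerplate)
print(three)  # {'Q2'}

# The symbol for "island B": common to the island-B intercepts, absent from island A.
island_b = ((chatter[("3 planes", "island B")] & chatter[("3 boats", "island B")])
            - chatter[("3 planes", "island A")] - boilerplate)
print(island_b)  # {'A8'}
```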
(Incidentally, I remember you comparing foreign language learning to English lit in that everything’s arbitrary and not model based. But it helps to think of learning a foreign language as cryptanalysis through a “known plaintext” attack and differential analysis. Given a bunch of phrases and their foreign translations, find the differences and infer how the language works.)
I believe the Germans had a policy of starting every message with some standard boilerplate, so Allied cryptographers were usually able to perform known-plaintext attacks with only passive monitoring as long as they observed any one message while it was still unencrypted.
Also the British cryptographers made a practice of “gardening”; before a German expedition was to depart, they’d mine an area so that they’d have known plaintext to work with. I imagine that helped a lot too.
Do you have any scientific questions in mind which you think would be especially susceptible to cryptanalysis?
I’m embarrassed to admit this, but I didn’t. I guess it could go with whatever intractable scientific problems currently require large computational brute-force approaches [1]: fluid dynamics (including combined free/forced convective heat transfer), protein folding, weather, electron wave functions, … .
The general approach would be:
1) Base a cipher on the difficult aspect of some scientific problem.
2) Publish the cipher.
3) Wait until someone finds a shortcut to decrypting the cipher without the private key.
4) Simplest explanation that fits the data!
Remaining problem: It’s not enough for a scientific problem to be difficult, it must have a trapdoor aspect to it: something that, for the entire problem class, makes it easy to solve if you know it. I’d have to think about how to get that to work for the one domain I’m most familiar with (fluid dynamics).
[1] I guess they’re not technically brute force, because they don’t try a bunch of solutions in parallel, but you have to do a lot of grind to find the solution that fits all the constraints.
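To spell out the “trapdoor aspect” above, here is a minimal toy sketch (not a secure construction, just the shape of the idea): multiplying two primes is easy, recovering them from the product takes brute-force work, but knowing either prime collapses the problem.

```python
from math import isqrt

p, q = 104723, 104729          # the secret; multiplying them is the easy direction
n = p * q                      # the published "hard problem"

def recover_without_trapdoor(n: int) -> int:
    """Brute-force trial division: the work grows roughly like sqrt(n), which is
    the whole point. Feasible here only because the toy numbers are tiny."""
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return d
    raise ValueError("no factor found")

def recover_with_trapdoor(n: int, p: int) -> int:
    return n // p              # knowing one factor (the trapdoor) makes it trivial

print(recover_with_trapdoor(n, p))   # 104729, instantly
print(recover_without_trapdoor(n))   # 104723, after ~10^5 trial divisions
```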
The one which nagged at me was whether cryptographic methods would have been useful in deducing the periodic table. I have no idea whether they would have helped, but it’s a relatively simple pattern underlying a lot of data.
More generally, are there past problems where cryptographic methods would have helped?
Arguably, the entire history of classical astronomy is one big case of frequency analysis: people noticed the repeating patterns in the observable cosmos (ciphertext) to infer future positions of celestial bodies (the plaintext) and the relative period lengths to determine the relative positions of them all and our position/view direction within it (private key).
First, they noticed that light and dark cycle, and called those days. They noticed that moon phases and seasons cycle and came up with years and moon charts. They noticed “wanderers” like Mars and Venus, and the cycling of those observations led them to postulate future appearances and the kind of cosmos that would lead to our observations. And so on.
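In code, that kind of “frequency analysis of the heavens” might look like the following minimal sketch (my own construction, with made-up periods standing in for lunar and solar cycles): pull the hidden periods out of a noisy observation stream via the power spectrum.

```python
import numpy as np

# Nature's "ciphertext": a noisy brightness record that is secretly the sum of
# two cycles (toy periods of 29.5 and 365 samples) plus observation noise.
rng = np.random.default_rng(0)
t = np.arange(4000)
observations = (np.sin(2 * np.pi * t / 29.5)
                + 0.7 * np.sin(2 * np.pi * t / 365.0)
                + 0.5 * rng.normal(size=t.size))

# Frequency analysis in the most literal sense: the biggest peaks in the power
# spectrum sit at the hidden frequencies.
power = np.abs(np.fft.rfft(observations - observations.mean())) ** 2
freqs = np.fft.rfftfreq(t.size, d=1.0)
strongest = freqs[np.argsort(power)[-2:]]
print(sorted(1.0 / f for f in strongest))  # roughly [29.4, 364]: the two hidden periods
```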
I think the answers are different for reversible physics versus irreversible physics. Reversible stuff like Newton’s laws applied to astronomy should be easy enough over short enough time periods, and there would be some long-solved particle physics that is similar.
In larger systems a model using the reversible physics would be intractable to simulate or record in complete detail.
Now that I think about it, there are two types of irreversibility of models I was thinking of: The ones where information is lost over time, for example differences in temperature being lost once they have evened out, and the ones where apparently random “information” is gained, for example when immeasurably small initial conditions in a chaotic system have a measurably large effect later, or when something outside the modelled system perturbs it.
This basically corresponds to deleting bits or randomising bits.
Another building block for crypto is mixing information up together without losing or gaining bits—this doesn’t require irreversibility.
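A toy way to see those three operations side by side (my own sketch on an 8-bit “state”): deleting bits, randomising bits, and reversibly mixing them.

```python
import random

state = 0b10110011              # an 8-bit toy "physical state"

# 1. Losing information: like temperature differences evening out, masking the
#    low bits maps many states to one and cannot be undone.
lossy = state & 0b11110000

# 2. Gaining random "information": like unmeasurably small perturbations blowing
#    up later, XOR with unknown random bits.
noisy = state ^ random.getrandbits(8)

# 3. Mixing without losing or gaining bits: a fixed permutation of the bit
#    positions, which is perfectly reversible.
PERM = [3, 7, 1, 5, 0, 6, 2, 4]               # where each bit position is sent

def permute_bits(x: int, perm) -> int:
    return sum(((x >> i) & 1) << p for i, p in enumerate(perm))

mixed = permute_bits(state, PERM)
INVERSE = [PERM.index(i) for i in range(8)]
assert permute_bits(mixed, INVERSE) == state  # no irreversibility required
```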
In cryptography, you need to leave enough information in the message so the receiver can decode it with the private key, and in such a way that it can be done efficiently. This constraint doesn’t apply to nature.
One thing I wonder about is whether there’s a natural equivalent to cryptographic hashes—a class of system where two people who know the initial state will end up with the same final state, but you can’t easily compute the initial state from a final state, or find two different initial states that end in the same final state.
Perhaps it would be instructive to look at both science and cryptanalysis as special cases + approximations of Bayesian updating. Then compare the assumptions that make the special cases work well in each domain and see if they share lots of important assumptions. If they do, that suggests there might be crossover lessons; if not, then whatever the big differing assumptions are will help you understand how such crossovers might fail to exist.
Well, trying to understand Mother Nature is like ‘cryptanalysis’ in some metaphorical sense, but do the methods of real cryptanalysis carry over well to, say, physics? Except for ‘rubber-hose cryptanalysis’, which is better known as ‘experimentation’ :)
What can be done is the kind of advanced statistical analysis where you take all your measurements, tabulate them, and calculate the correlations between them. Current statistical tools may not be too good at the kinds of relations that exist there; that may be an area for research... You take your AI capable of doing that, feed it the LHC data, and a new quantum theory rolls out...
“as many patterns in the ciphertext as you can” is too vague to be useful.
If it were specific, I doubt we have any such methods. What makes you think that we do?
I was basically saying (tautologically) that we can break any cipher except for the ones where we can’t. So if you use any of those broken ciphers, an attacker can infer the cipher, key, or plaintext, to the extent that it’s possible.
My non-tautological inference, then, is that nature isn’t able to intelligently design ciphers, and so her patterns should be much easier for a cryptographer to discern than those that exist in many human-designed ciphers. A good cipher destroys the patterns that would otherwise clue in the attacker on the key or plaintext, and nature should be a lot worse at this, and a lot more limited in her “cipher design”. (For example, unlike with AES, she can’t layer a cipher so that it first destroys linearity, then permutes the whole thing to resist differential analysis, etc.)
Furthermore, cryptographers warn against using a secret cipher you designed, and this is partly because you won’t be able to find all the possible attacks on your own. In other words, for the average person, even if you intelligently design a cipher, an attacker can infer the cipher (by finding its patterns) and plaintext, even if they didn’t know the cipher to begin with. Since nature’s “designs” will be even less intelligent, they probably aren’t resistant to the cryptanalytic methods used in the unknown-cipher case.
Also see my reply to Nancy where I argue that discoveries in astronomy followed the same pattern as frequency analysis.
An interesting thought. Do you think that code-breakers are likely to have anything to teach scientists? I am having visions of taking code-breaking software, inputting scientific data, and unraveling the secrets of the universe.
So am I (albeit maybe in a more limited sense). I created this topic to find out if any crypto experts here noticed any of the parallels I did. Seeing as they are skeptical about the possibility that code-breakers have anything to teach scientists, I think I’ll have to develop this idea more before I can be more justified in believing they do or don’t.
In particular, I’ll want to make the mappings between the plaintext and ciphertext to their science analogs more explicit. Also, I’ll want to design ciphers based on physical laws and see how code-breakers would infer the cipher (both in cases scientists have solved and those they haven’t).
I first thought that this would require (the very difficult task of) basing a trapdoor one-way function on a physical law, but now I don’t think so, because I needn’t make it a public key algorithm—a pure private key cryptosystem (on the assumption Alice and Bob have securely shared the cipher and key) would work as well, as cryptanalysts can break many of these kinds of system. And those kinds (like the Caesar cipher) don’t involve a trapdoor one-way function.
I don’t think this is true for ciphers that are anywhere near as complicated as the physical world is. For inferring models of limited complexity on limited numbers of variables in constrained form, however, there’re some pretty good algorithms in Pearl’s book Causality, based on conditional correlation and independence.
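For flavor, here is a minimal sketch (my own construction, not taken from Pearl) of the conditional-correlation building block those algorithms rest on: X and Y look correlated, but the correlation vanishes once their common cause Z is conditioned on.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Common-cause structure: Z -> X and Z -> Y, with no direct X-Y link.
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)
y = -1.5 * z + rng.normal(size=n)

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing c out of both."""
    resid_a = a - np.polyval(np.polyfit(c, a, 1), c)
    resid_b = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(resid_a, resid_b)[0, 1]

print(np.corrcoef(x, y)[0, 1])  # about -0.74: X and Y look strongly dependent
print(partial_corr(x, y, z))    # about 0: conditionally independent given Z
```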
Right—I tried to make clear that my concern is only about those phenomena that are less complex than the most complex broken cipher. For unknown-physical-law cases (i.e. we don’t even know the dynamics of the phenomenon), the comparison is to ciphers that can be broken even if you don’t know which cipher or public key is being used; for unknown-constants cases (where we know the form of the equations), that also includes ciphers requiring knowledge of the cipher and public key to break.
True, but there are broken ciphers that are resistant to a cryptanalytic version of this, which suggests that Pearl isn’t covering all the ways to find regularity in data.
Indeed. I haven’t finished Pearl yet, but from what I’ve read so far, it doesn’t look like his models can handle iteration, vector-typed variables, model priors other than one particular interpretation of Occam’s razor, or hidden variables more complicated than a two-variable correlation. So there’s a lot of theory left to build, and cryptanalysis may have some lessons for that theory.