Someone as clever, powerful, and rich as yourself can likely find a collision if you get to choose both source texts (which is easier than finding a collision with one of the two inputs determined by someone else).
To increase our confidence, I suggest you post hashes of the same prediction text made by several different algorithms, e.g. SHA2-512 and each of the SHA3 finalists. I also suggest you commit to the hashed prediction text beginning with the words, “I, Quirinus Quirrel (on LW) predict that: ”—so you can’t choose your entire source text.
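For what it's worth, the commitment scheme suggested above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name `commit` is made up, `sha3_256` stands in for "a SHA3 finalist" (the thread predates the final standard), and the text is assumed to be hashed as UTF-8 bytes with no trailing newline.

```python
import hashlib

# Prefix taken from the suggestion above; the committer may not vary it.
REQUIRED_PREFIX = "I, Quirinus Quirrel (on LW) predict that: "

# Assumption: sha3_256 is used as a stand-in for the SHA3 finalists.
ALGORITHMS = ("sha1", "md5", "sha384", "sha512", "sha3_256")


def commit(prediction: str) -> dict:
    """Return hex digests of the prediction under several hash functions."""
    if not prediction.startswith(REQUIRED_PREFIX):
        raise ValueError("prediction must start with the agreed prefix")
    data = prediction.encode("utf-8")  # assumed encoding
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}
```

Posting all of the digests means a cheater would need a single pair of texts that collides under every listed function at once, and the fixed prefix removes some of the freedom in choosing the text.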
I made a prediction with sha1sum 0000000000000000000000000000000000000000. It’s the prediction that sha1sum will be broken. I’ll only reveal the exact formulation once I know whether it was true or false.
Someone as clever, powerful, and rich as yourself can likely find a collision if you get to choose both source texts (which is easier than finding a collision with one of the two inputs determined by someone else).
This is actually much harder than you’d think. A hash function is considered broken if any collision is found, but a mere collision is not sufficient; to be useful, a collision must have chosen properties. In the case of md5sum, it is possible to generate collisions between files which differ in a 128-byte aligned block, with the same prefix and suffix. This works well for any file format that is scriptable or de-facto scriptable—wrap the colliding block in a comparison statement, and behave differently depending on its result. However, even for md5sum, it is still impossible to generate a collision between plain-text files with two separate chosen texts; nor is it possible to generate collisions between files that have no random-seeming sections, or that have random sections that are too small, not block-aligned, or are drawn from a constrained alphabet. (Snowyowl’s joke would require a preimage attack, which is harder still, and which won’t be available at first even if sha1sum is broken, so he will not be able to fulfill his promise to reveal a message with that sha1sum.)
Anyways, since you asked, here are a few more hashes of the same thing. I didn’t bother with the SHA3 finalists, since they don’t seem to have made convenient command-line utilities yet and I don’t want to force people to fiddle too much to verify my hashes.
By "impossible" I mean not currently publicly known to be possible. MD5 gets more broken all the time, so I wouldn't want to be very confident about what is impossible.
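To make the "same prefix and suffix, one differing aligned block" shape concrete, here is a rough sketch of a checker for it. It does not find collisions. It only tests whether two byte strings have the layout described above, and the function names are invented for the example.

```python
BLOCK = 128  # block size in bytes, taken from the description above


def differing_blocks(a: bytes, b: bytes, block: int = BLOCK) -> list:
    """Return the indices of the aligned blocks in which a and b differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return [
        i // block
        for i in range(0, len(a), block)
        if a[i:i + block] != b[i:i + block]
    ]


def has_single_block_difference(a: bytes, b: bytes) -> bool:
    """True if a and b have equal length and differ in exactly one aligned block."""
    return len(a) == len(b) and len(differing_blocks(a, b)) == 1
```

A pair of short plain-text predictions that say different things would normally fail this check, which is the point: the known attacks need a random-looking block to work with.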
Anyways, since you asked, here are a few more hashes of the same thing. I didn’t bother with the SHA3 finalists, since they don’t seem to have made convenient command-line utilities yet and I don’t want to force people to fiddle too much to verify my hashes.
There’s another reason not to do so. No one has thought strongly about how the different hashes would interact together. It wouldn’t surprise me if there were some way given the various hashes to extract information that would not otherwise be extractable given any single hash scheme. This is all the more plausible given that you’ve given the hash for the fairly weak md5. The multiple hashes you have given make it implausible that you could have multiple texts that lead to the same result; adding more hash types has more of an effect now of making it conceivable that a sufficiently interested individual could identify your text.
It’s extremely unlikely that a useful collision exists. The number of short paragraphs in English which make sense and describe a prediction is much, much smaller than the number of SHA1 values. However, if Quirrel’s prediction turns out to be very wordy, or comes in a form other than plain text, your suspicion will be confirmed.
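The counting argument can be put into rough numbers. The sketch below assumes, loudly, that sensible English carries about 1 bit of entropy per character (estimates vary), that SHA1 behaves like an ideal 160-bit hash, and that the number of pairs among K texts is about K²/2. The function name is made up.

```python
def log2_expected_colliding_pairs(
    n_chars: int,
    bits_per_char: float = 1.0,  # assumed entropy of English text
    hash_bits: int = 160,        # SHA1 output size
) -> float:
    """log2 of the expected number of colliding pairs among all sensible texts.

    With K = 2**(n_chars * bits_per_char) candidate texts there are about
    K**2 / 2 pairs, and each pair collides with probability 2**-hash_bits.
    """
    log2_k = n_chars * bits_per_char
    return 2 * log2_k - 1 - hash_bits


for n in (60, 100):
    print(n, log2_expected_colliding_pairs(n))
```

On these assumptions a 60-character prediction gives about 2^-41 expected colliding pairs, so a meaningful collision almost certainly does not exist, while at 100 characters the figure is about 2^39. That is in line with the caveat about a very wordy prediction, though a collision existing is not the same as anyone being able to find it.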
sha1 seems likely to be broken sooner rather than later:
http://en.wikipedia.org/wiki/SHA-1#SHA-1
http://code.google.com/p/hashclash/
sha512sum: 85cf46426d025843d6b0f11e3232380c6fac6cae88b66310ee8fbcd3f81722d08b2154c6388ecb1ee9cebc528e0f56e3be7a057cd67531cfda442febe0132418
sha384sum: 400d47bf97b6a3ccd662e0eb1268820c57d10e2a623c3a007b297cc697ed560862dda19b74638f92a3550fbbfe14d485
md5sum: 8fec2109c85f622580e1a78c9cabdab4
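Once the text is revealed, anyone can check it against these digests without the command-line tools. A minimal sketch, assuming the revealed text is available as the exact bytes that were hashed (the helper name `verify` is made up):

```python
import hashlib

# Digests copied from the post above.
PUBLISHED = {
    "sha512": "85cf46426d025843d6b0f11e3232380c6fac6cae88b66310ee8fbcd3f81722d0"
              "8b2154c6388ecb1ee9cebc528e0f56e3be7a057cd67531cfda442febe0132418",
    "sha384": "400d47bf97b6a3ccd662e0eb1268820c57d10e2a623c3a007b297cc697ed5608"
              "62dda19b74638f92a3550fbbfe14d485",
    "md5": "8fec2109c85f622580e1a78c9cabdab4",
}


def verify(revealed: bytes, published: dict = PUBLISHED) -> dict:
    """Map each algorithm name to whether the revealed bytes match its digest."""
    return {
        name: hashlib.new(name, revealed).hexdigest() == digest.lower()
        for name, digest in published.items()
    }
```

The tools hash the exact bytes of the file, so a trailing newline or a different encoding changes every digest. If a reveal fails to verify, that is worth ruling out first.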
I don’t think you get notified when people reply to you top-level, so I’ll ask here in case you forgot—any update on this?