I don’t think that attack is practical, as long as Decoy leaves the metadata alone and works only on the image data. You’d need to reproduce the inputs to a particular implementation of the image encoding exactly, which is impossible unless you’re snooping the raw data—my phone camera produces images in JPEG format (high quality, but it’s still lossy compression) and does the conversion before the raw image data even leaves RAM.
If you’re dealing with images originating off the device, things get both easier and more difficult. Easier because there will typically be unchanged images in the wild to compare against; more difficult because there will typically be several different copies of an image floating around, and I don’t think it’s practical to reconstruct every possible chain of encodings. Many popular image-hosting sites, for example, reencode everything they get their grubby little paws on. Send an image as a text, that’s another reencoding. And so forth.
As I’ve mentioned elsewhere, though, decoy images may be statistically distinguishable from an untouched JPEG even if you can’t conclusively match it to an origin or e.g. validate against its EXIF tags—though I could be proven wrong here with the right analysis, and I’d like to be.
Your first paragraph nails it. Unless your phone is both jailbroken and seriously compromised, there is no means of viewing the “original” version of either picture. Also, re: the second paragraph: the app forces you to take the “Decoy” picture with your device; it will not allow you to use an off-device image. (You CAN use an off-device image as the hidden picture.)
As for the statistical analysis, it’s mostly irrelevant. The encoding algorithm is both reversible and published. So you can extract “Decoy data” from ANY picture that you find, Decoy or no. The only thing that will confirm it one way or the other is a successful decryption. The best you could do is say, “Based on certain telltales, there’s a 10% chance this image is a Decoy” or whatever the odds may be.
Such an attack has little to no value. If you are an attacker with a specific target, isolating which pictures are decoys removes a trivial amount of entropy from the equation, especially compared to the work of trying to brute-force an AES-encrypted ciphertext.
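To make the “you can extract Decoy data from any picture” point concrete, here is a minimal sketch in Python (using Pillow and NumPy). The 2-bits-per-pixel-value layout and the bit-packing order are my assumptions based on the description given later in this thread, not the app’s documented format:

```python
import numpy as np
from PIL import Image

def extract_low_bits(path):
    """Pull the last 2 bits of every 8-bit pixel value and pack them
    into bytes. ASSUMPTION: the payload lives in the low 2 bits of
    each RGB channel value, most significant pair first; the real
    app may pack its bits differently."""
    pixels = np.asarray(Image.open(path).convert("RGB"), dtype=np.uint8)
    symbols = (pixels & 0b11).ravel()                     # values 0..3
    symbols = symbols[: symbols.size // 4 * 4].reshape(-1, 4)
    packed = (symbols[:, 0] << 6) | (symbols[:, 1] << 4) \
             | (symbols[:, 2] << 2) | symbols[:, 3]
    return packed.astype(np.uint8).tobytes()
```

This always yields some byte string, Decoy or not; as Nanashi says, only a successful decryption settles the question.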
I understand that, and I understand that it should be impractical to decrypt the hidden image without its key given that strong attacks on AES have not yet been publicly found (key exchange difficulties, which are always considerable, aside). But I think you’re being far too nonchalant about detection here. The fact that you can extract “decoy data” from any image is wholly irrelevant; it’s the statistical properties of those bits that I’m interested in, and with maybe a million bits of data to play with, the bias per bit does not have to be very high for an attacker to be very confident that some kind of steganography’s going on.
That does not, of course, prove that it’s being used to hide anything interesting from an attacker’s point of view; but that was never the point of this objection.
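To put a rough number on the bias claim above: with n independent bits, the standard error of the observed fraction of ones is about 1/(2√n), so at n ≈ 10⁶ even a per-bit bias of a fifth of a percent stands well clear of the noise. A sketch (the 0.002 bias is purely illustrative, not a measured property of any encoder):

```python
import math

def bias_z_score(ones, n):
    """Z-score of the observed fraction of one-bits against the
    fair-coin null hypothesis p = 0.5."""
    std_err = 0.5 / math.sqrt(n)        # sqrt(p*(1-p)/n) at p = 0.5
    return (ones / n - 0.5) / std_err

# With a million low bits, a per-bit bias of only 0.002 gives
# z = 0.002 / 0.0005 = 4: roughly a 1-in-16,000 false-positive
# rate on a two-sided test.
n = 1_000_000
print(bias_z_score(int(n * 0.502), n))  # ~4.0
```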
Well, my point has never been that it’s impossible for an attacker to be confident that you’re using steganography. Rather, it’s that an attacker cannot prove it with certainty.
The “decoy picture” aspect of the protocol is intended to provide social protection and ensure that plausible deniability can be maintained. It is not intended as cryptographic protection; that is what the AES is for.
“Confidence” is only useful to an attacker when it comes to selecting a target. But an attacker already has to be confident in order to run such a test in the first place, which means you’ve already been selected as a target. Furthermore, they would have to compromise enough of your security to access your image data. If that happens, the benefit of gaining further confidence is marginal at best.
Incidentally, regarding the specific details of such a detection method:
We (and the attacker) already know that the distribution of characters in a base64-encoded AES ciphertext is approximately uniform and follows no discernible pattern. We also know that the ciphertext is encoded into the last 2 bits of each 8-bit pixel value. So we can, with X amount of confidence, show that an image is not a Decoy if we extract the last 2 bits of each pixel value and discover that the resulting data is non-randomly distributed.
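As a concrete version of that first check (my construction, not anything taken from the app’s code), one could extract the 2-bit symbols and run a chi-square test against the uniform distribution on {0, 1, 2, 3}; a small p-value means the low bits are visibly structured, which is evidence against the image being a Decoy:

```python
import numpy as np
from PIL import Image
from scipy.stats import chisquare

def low_bit_uniformity(path):
    """Chi-square test of the 2-bit pixel symbols against uniformity.
    Returns (statistic, p-value); chisquare defaults to a uniform
    expected distribution."""
    pixels = np.asarray(Image.open(path).convert("RGB"), dtype=np.uint8)
    counts = np.bincount((pixels & 0b11).ravel(), minlength=4)
    return chisquare(counts)
```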
However, because it is possible for normal, non-Decoy, compressed JPEGs to exhibit a random distribution of the data in the last 2 bits of each pixel, the presence of randomness does not confirm that an image is a Decoy.
The only viable attack here would be to pull images that are “visually similar” (a trivial task using Google image search), reduce them to the same size, compress them heavily, and then examine the last 2 bits of each of their pixels. If there is a significant difference between the randomness of the control images and the randomness of the suspected image, you could then suggest with X% confidence that the suspected image has been tampered with.
However, because it is possible for an image to be tampered with and yet NOT be a Decoy image, even then you could still not, with any legitimate amount of confidence, use such a test to state that an image is a Decoy.
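For what it’s worth, the control-image comparison described two paragraphs up might look something like the following sketch. The filenames are hypothetical, and Shannon entropy of the 2-bit plane is just one of several statistics one could compare; a real analysis would also need to match the whole processing chain, as discussed earlier:

```python
import numpy as np
from PIL import Image

def low_bit_entropy(path, size=(512, 512)):
    """Shannon entropy (bits/symbol, 2.0 = perfectly random) of the
    2-bit plane, after normalizing the image to a common size."""
    img = Image.open(path).convert("RGB").resize(size)
    freqs = np.bincount((np.asarray(img, dtype=np.uint8) & 0b11).ravel(),
                        minlength=4) / (size[0] * size[1] * 3)
    nz = freqs[freqs > 0]
    return float(-(nz * np.log2(nz)).sum())

# Hypothetical usage: suspect vs. visually similar controls.
suspect = low_bit_entropy("suspect.jpg")
controls = [low_bit_entropy(p) for p in ("control1.jpg", "control2.jpg")]
print(suspect, sum(controls) / len(controls))
```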
If you would put a probability on it, how likely would you expect a proper security audit to prove you wrong?
.01%
How much money are you willing to bet on that?
If the amount is less than $50,000, I suggest you just offer it all as a prize to whoever proves you wrong. At your stated 0.01% odds, the expected cost of the prize is only about $5, which the value to your reputation will more than cover, and due to transaction costs people are unlikely to bet with you directly with less than $5 to gain.
I’d be willing to bet 50% of the market value of a feasible distinguishing attack against AES, under the condition that whoever proves me wrong discloses their method to me and only me.
In other words: a shitload. Such an attack would be far more valuable than any sum I’d possibly be able to offer.
Wrong on what count? I intended that sentence to refer only to the last paragraph of my post, and I’d expect that to be very implementation-dependent. Generally speaking, the higher the compression ratio the more perfectly random I’d expect the low bits to be—but even at low ratios I’d expect them to be pretty noisy. I’m fairly confident that some JPEG implementations would leave distinguishable patterns when fed some inputs, but I don’t have any good way of knowing how many or how easily distinguishable. To take a shot in the dark, I’m guessing there’s maybe a 30% chance that an arbitrarily chosen implementation with arbitrarily chosen parameters would be easily checked in this way? That’s mostly model uncertainty, though, so my error bars are pretty wide.
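That guess is at least cheap to probe empirically: re-encode a test image across JPEG quality settings and watch the entropy of its low 2 bits. This sketch uses Pillow’s encoder, so it says nothing about any particular phone’s implementation, and the quality levels are arbitrary:

```python
import io
import numpy as np
from PIL import Image

def entropy_vs_quality(path):
    """Re-encode one image at several JPEG quality settings and print
    the entropy of its 2-bit plane at each (2.0 = perfectly random)."""
    original = Image.open(path).convert("RGB")
    for quality in (95, 75, 50, 25, 10):
        buf = io.BytesIO()
        original.save(buf, format="JPEG", quality=quality)
        buf.seek(0)
        symbols = (np.asarray(Image.open(buf), dtype=np.uint8) & 0b11).ravel()
        freqs = np.bincount(symbols, minlength=4) / symbols.size
        nz = freqs[freqs > 0]
        print(quality, float(-(nz * np.log2(nz)).sum()))
```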
If we exclude that sort of statistical analysis, I’d estimate on the order of a 10 or 20% chance that Decoy images are distinguishable as such by examining metadata or other non-image traces—but that comes almost entirely from the fact that I haven’t read Nanashi’s code, I’m not a JPEG expert, and security is hard. A properly done implementation should not be vulnerable to such an attack; I just don’t know if this is properly done.