anonymousaisafety comments on Eavesdropping on Aliens: A Data Decoding Challenge

anonymousaisafety 27 Jul 2022 4:20 UTC
3 points
0
(1) The first thing I did when approaching this was think about how the message is actually transmitted. Things like the preamble at the start of the transmission to synchronize clocks, the headers for source & destination, or the parity bits after each byte, or even things like using an inversed parity on the header so that it is possible to distinguish a true header from bytes within a message that look like a header, and even optional checksum calculations.
(2) I then thought about how I would actually represent the data so it wasn’t just traditional 8-bit bytes—I created encoders & decoders for 36/24/12/6 bit unsigned and signed ints, and 30 / 60 bit non-traditional floating point, etc.
Finally, I created a mock telemetry stream that consisted of a bunch of time-series data from many different sensors, with all of the sensor values packed into a single frame with all of the data types from (2), and repeatedly transmitted that frame over the varying time series, using (1), until I had >1 MB.
And then I didn’t submit that, and instead swapped to a single message using the transmission protocol that I designed first, and shoved an image into that message instead of the telemetry stream.
- To avoid the flaw where the message is “just” 1-byte RGB, I viewed each pixel in the filter as being measured by a 24-bit ADC. That way someone decoding it has to consider byte-order when forming the 24-bit values.
- Then, I added only a few LSB of noise because I was thinking about the type of noise you see on ADC channels prior to more extensive filtering. I consider it a bug that I only added noise in some interval [0, +N], when I should have allowed the noise to be positive or negative. I am less convinced that the uniform distribution is incorrect. In my experience, ADC noise is almost always uniform (and only present in a few LSB), unless there’s a problem with the HW design, in which case you’ll get dramatic non-uniform “spikes”. I was assuming that the alien HW is not so poorly designed that they are railing their ADC channels with noise of that magnitude.
- I wanted the color data to be more complicated than just RGB, so I used a Bayer filter, that way people decoding it would need to demosiac the color channels. This further increased the size of the image.
- The original, full resolution image produced a file much larger than 1 MB when it was put through the above process (3 8-bit RGB → 4 24-bit Bayer), so I cut the resolution on the source image until the output was more reasonably sized. I wasn’t thinking about how that would impact the image analysis, because I was still thinking about the data types (byte order, number of bits, bit ordering) more so than the actual image content.
- “Was the source image actually a JPEG?” I didn’t check for JPEG artifacts at all, or analyze the image beyond trying to find a nice picture of bismuth with the full color of the rainbow present so that all of the color channels would be used. I just now did a search for “bismuth png” on Google, got a few hits, opened one, and it was actually a JPG. I remember scrolling through a bunch of Google results before I found an image that I liked, and then I just remember pulling & saving it as a BMP. Even if I had downloaded a source PNG as I intended, I definitely didn’t check that the PNG itself wasn’t just a resaved JPEG.