I have something in the pipeline, but it’ll take a while… if it’s trying to be “actually” alien, it’s kinda important that it’s internally consistent. “Add some arbitrary bytes to [...represent] metadata” is exactly what you don’t want to do. Because if you do, sure, it’ll be hard, it’ll (probably) be eventually solvable, but it’ll be… somewhat dissatisfying. Same for using stuff like NTSC, it’s just… why would they come up with exactly that? It just doesn’t make any sense!
So, in case anyone else wants to also make a good challenge in this style, here’s some guidelines. The first and most important one:
Ensure that every. single. layer. has some clearly recognizable artifact / regularity. Blocks + checksums, pilot tones, carrier waves, whatever. It can occasionally be rather obscure / hard to identify, but never completely absent. If you have to get through ≈10 layers of stuff to solve the whole thing, getting stuck for a few hours (or even days) with no indication of whether you’re even heading in the right direction is very unsexy and will likely kill the mood. (It’s also internally consistent w.r.t. the back story of an alien message: For an intentional transmission, they want you to understand what they’re sending. Even an accidental or amateur message would likely use (their) standard protocols / file formats, which will likely have some regularity – headers, blocks, trailers, … – because that makes it easier to debug problems.)
The rest falls into the “do some world building” and “ensure internal consistency” buckets:
Weird senses are cool, not everyone needs to have a focus on vision, especially not on exactly 3 types of receptors focusing on exactly the same 400–700 nm range as we do – the electromagnetic spectrum is way larger. (Even animals already have pretty crazy stuff going on!) Vision doesn’t have to be the primary sense (with low-resolution “eyes”, other senses may be preferable.) Having a two-dimensional arrangement of identical building blocks (a.k.a. “images” made of “pixels”) isn’t necessarily ideal for all senses. Different senses likely also result in other things being obvious, considered likely or unlikely, or being missed entirely when trying to find message formats that are most widely understood. (Beings whose senses revolve entirely around sound and movements may not have static things like books or writing at all, a cave-dwelling culture near a very aggressive star may consider all EM / “light” exposure as dangerous or deadly (compare: the popular image of radioactivity for us) and strongly avoid developments in that direction.)
Try not to copy “historical accidents” blindly. Most existing media formats are harshly optimized towards being barely good enough to fool human senses. (Especially lossy compression generally exploits shortcomings of human senses to remove stuff that we would be unable to sense in the presence of stronger surrounding stimuli, but even simple “lossless” formats are generally human-centric. Historically, memory/storage was not cheap at all, and you really wanted to get the most out of your bits.) An alien message will likely try to avoid optimizing in the wrong way, but may well fail at that in interesting ways.
RGB bakes material properties and lighting into just three numbers and ignores all other things, even though we are able to see e.g. metamerism effects in the real world. Stereo audio bakes sound waves into two channels, even though we can detect much more finely where sounds are coming from in the real world. (Your very personal ear shape influences how things sound coming from different directions, so while you can shove approximate directions and distances into two channels, it sounds quite different from what you’re used to in the real world.) Different priorities will result in different trade-offs.
Even for computers, binary isn’t required. (There were decimal computers, that’s just way more complex than 2 states and so eventually that won out. If the prevailing number base is 3 or 4 instead of 10, that’s much more doable.) Same goes for 8-bit groups as the next building block. (16 is “more square” and behaves nicer in some contexts. But we also had what’s basically 9-bit computers, that were using a 36-bit word size, so the next level of grouping doesn’t have to cleanly divide at all. It also provides space for e.g. a parity bit if you do want to stick to “base-powered” block sizes.)
Look at how things would actually transmit – you don’t get a convenient blob of bits. (That would already lock in a binary base, potentially bytes as the next level too, making it much harder to do actual fun stuff with weird senses.) Actually, you’d get some noisy recording (likely with burst errors blotting out information here and there) from which you have to recover the frequency at which symbols are coded, then recover that sequence and try to recognize and fix errors, and so on… beyond parity bits, there’s full error correction codes, interleaving, and all the rest of coding theory. You may want to repeat the same message in a loop, or have a sequence of messages to make it easier to decode the main message (e.g. stylized geometric shapes, simple but anisotropic patterns, or chirps to help fix directions and scales before the “messy” recorded data like images or sound.)
...and so on like that, but I hope that’s enough to send anyone considering to make such a challenge at the “actual” alien message level in approximately the right direction. Eagerly anticipating silly alien selfies! ;)
I have something in the pipeline, but it’ll take a while… if it’s trying to be “actually” alien, it’s kinda important that it’s internally consistent. “Add some arbitrary bytes to [...represent] metadata” is exactly what you don’t want to do. Because if you do, sure, it’ll be hard, it’ll (probably) be eventually solvable, but it’ll be… somewhat dissatisfying. Same for using stuff like NTSC, it’s just… why would they come up with exactly that? It just doesn’t make any sense!
So, in case anyone else wants to also make a good challenge in this style, here’s some guidelines. The first and most important one:
Ensure that every. single. layer. has some clearly recognizable artifact / regularity. Blocks + checksums, pilot tones, carrier waves, whatever. It can occasionally be rather obscure / hard to identify, but never completely absent. If you have to get through ≈10 layers of stuff to solve the whole thing, getting stuck for a few hours (or even days) with no indication of whether you’re even heading in the right direction is very unsexy and will likely kill the mood. (It’s also internally consistent w.r.t. the back story of an alien message: For an intentional transmission, they want you to understand what they’re sending. Even an accidental or amateur message would likely use (their) standard protocols / file formats, which will likely have some regularity – headers, blocks, trailers, … – because that makes it easier to debug problems.)
The rest falls into the “do some world building” and “ensure internal consistency” buckets:
Weird senses are cool, not everyone needs to have a focus on vision, especially not on exactly 3 types of receptors focusing on exactly the same 400–700 nm range as we do – the electromagnetic spectrum is way larger. (Even animals already have pretty crazy stuff going on!) Vision doesn’t have to be the primary sense (with low-resolution “eyes”, other senses may be preferable.) Having a two-dimensional arrangement of identical building blocks (a.k.a. “images” made of “pixels”) isn’t necessarily ideal for all senses. Different senses likely also result in other things being obvious, considered likely or unlikely, or being missed entirely when trying to find message formats that are most widely understood. (Beings whose senses revolve entirely around sound and movements may not have static things like books or writing at all, a cave-dwelling culture near a very aggressive star may consider all EM / “light” exposure as dangerous or deadly (compare: the popular image of radioactivity for us) and strongly avoid developments in that direction.)
Try not to copy “historical accidents” blindly. Most existing media formats are harshly optimized towards being barely good enough to fool human senses. (Especially lossy compression generally exploits shortcomings of human senses to remove stuff that we would be unable to sense in the presence of stronger surrounding stimuli, but even simple “lossless” formats are generally human-centric. Historically, memory/storage was not cheap at all, and you really wanted to get the most out of your bits.) An alien message will likely try to avoid optimizing in the wrong way, but may well fail at that in interesting ways.
RGB bakes material properties and lighting into just three numbers and ignores all other things, even though we are able to see e.g. metamerism effects in the real world. Stereo audio bakes sound waves into two channels, even though we can detect much more finely where sounds are coming from in the real world. (Your very personal ear shape influences how things sound coming from different directions, so while you can shove approximate directions and distances into two channels, it sounds quite different from what you’re used to in the real world.) Different priorities will result in different trade-offs.
Even for computers, binary isn’t required. (There were decimal computers, that’s just way more complex than 2 states and so eventually that won out. If the prevailing number base is 3 or 4 instead of 10, that’s much more doable.) Same goes for 8-bit groups as the next building block. (16 is “more square” and behaves nicer in some contexts. But we also had what’s basically 9-bit computers, that were using a 36-bit word size, so the next level of grouping doesn’t have to cleanly divide at all. It also provides space for e.g. a parity bit if you do want to stick to “base-powered” block sizes.)
Look at how things would actually transmit – you don’t get a convenient blob of bits. (That would already lock in a binary base, potentially bytes as the next level too, making it much harder to do actual fun stuff with weird senses.) Actually, you’d get some noisy recording (likely with burst errors blotting out information here and there) from which you have to recover the frequency at which symbols are coded, then recover that sequence and try to recognize and fix errors, and so on… beyond parity bits, there’s full error correction codes, interleaving, and all the rest of coding theory. You may want to repeat the same message in a loop, or have a sequence of messages to make it easier to decode the main message (e.g. stylized geometric shapes, simple but anisotropic patterns, or chirps to help fix directions and scales before the “messy” recorded data like images or sound.)
...and so on like that, but I hope that’s enough to send anyone considering to make such a challenge at the “actual” alien message level in approximately the right direction. Eagerly anticipating silly alien selfies! ;)