I think the standard technical term for what you’re talking about is “unsupervised machine translation”. Here’s a paper on that, for example, although it’s not using the LLM approach you propose. (I have no opinion about whether the LLM approach you propose would work or not.)
Interesting reference! So an unsupervised approach from 2017/2018, presumably somewhat primitive by today’s standards, already works quite well for English/French translation. This provides some evidence that the (more advanced?) LLM approach, or something similar, would actually work for English/Alienese.
Of course, English and French are historically related, and arose on the same planet while being used by the same type of organism. So they are necessarily quite similar in terms of the concepts they encode. English and Alienese would differ far more, and would be correspondingly harder to translate.
But if it worked, it would mean that sufficiently long messages, with enough effort, basically translate themselves. A spiritual successor to the Pioneer plaque and the Arecibo message, instead of being some galaxy-brained, hopefully-universally-readable message, would simply consist of several terabytes of human-written text. Smart aliens could use the text to train a self-supervised Earthling/Alienese translation model, and then use this model to translate our text.