Without regard to anything specific to LLMs… Math works the same for all conceivable beings. Sufficiently advanced beings living in our universe will almost certainly know about hydrogen and the other elements, and about fundamental constants like the Planck length. So commonalities will exist, and you can build everything else on top of them. If need be, you could describe how things looked by giving 2D pixel-grid pictures, or describe an apple by starting from elements, molecules, DNA, and so on. (See Contact and That Alien Message for explorations of this type of problem.)
It’s unlikely that any LLM resembling today’s would translate the word for an alien fruit into a description of the aliens’ DNA-equivalent and their entire biosphere… But maybe a sufficiently good LLM would have that knowledge inside it, and repeated querying could draw it out.
I guess my question would then be whether the translation would work if neither language contained any information about microphysics or advanced math. Would the model be able to translate, e.g., “z;0FK(JjjWCxN” into “fruit”?
The chances of the LLM being able to do this depend heavily on how similar the subjects discussed in the alien language are to those humans discuss. Removing the areas where similarity is most likely would reduce the chance that the LLM finds matching patterns in both. Indeed, that we’re imagining aliens for the example probably already greatly increases the difficulty.
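The "matching patterns" idea can be made concrete with a toy sketch of unsupervised lexicon matching (in the spirit of unsupervised translation work, not anything an LLM literally does). The assumption here is that corresponding words in two languages have similar distributional structure; every corpus, word, and alien token below is invented for illustration. Each word gets a label-free "signature" (its sorted similarities to the other words of its own language), and words are matched across languages by comparing signatures:

```python
from itertools import combinations
from math import sqrt

def cooc_vectors(sentences):
    """Symmetric within-sentence co-occurrence counts for each word."""
    vocab = sorted({w for s in sentences for w in s.split()})
    idx = {w: i for i, w in enumerate(vocab)}
    vecs = {w: [0.0] * len(vocab) for w in vocab}
    for s in sentences:
        for a, b in combinations(s.split(), 2):
            vecs[a][idx[b]] += 1
            vecs[b][idx[a]] += 1
    return vecs

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

def signatures(vecs):
    """Label-free fingerprint: each word's cosine similarities to all
    words of its own language, sorted descending. Sorting removes any
    dependence on the language's own vocabulary labels."""
    return {w: sorted((cosine(u, v) for v in vecs.values()), reverse=True)
            for w, u in vecs.items()}

def translate(word, sig_src, sig_tgt):
    """Match a source word to the target word with the closest signature."""
    s = sig_src[word]
    return min(sig_tgt,
               key=lambda w: sum((a - b) ** 2 for a, b in zip(s, sig_tgt[w])))

# Toy data: the "alien" corpus is a word-for-word relabeling of the
# human one, so matching distributional signatures can recover the mapping.
human = ["i eat fruit", "you eat fruit", "i see you", "fruit grows"]
mapping = {"i": "qa", "eat": "zb", "fruit": "xc",
           "you": "vd", "see": "we", "grows": "rf"}
alien = [" ".join(mapping[w] for w in s.split()) for s in human]

sig_h = signatures(cooc_vectors(human))
sig_a = signatures(cooc_vectors(alien))
print(translate("xc", sig_a, sig_h))  # → fruit
```

The sketch also shows why removing shared subject matter hurts: the moment the alien corpus is not a relabeling of anything humans talk about, the signatures stop lining up, and words with symmetric roles (here, "i" and "you") already tie even in the ideal case.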