It works for AIs very easily. Just feed the patents from AI 1 into AI 2. No need for special engineering of the two AIs.
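To make "feed the latents from AI 1 into AI 2" concrete, here is a minimal sketch with off-the-shelf models (the model names and the assumption of a shared hidden size are mine, not part of the original claim; in general you'd want a small learned adapter between the two):

```python
# Sketch: take one model's hidden states ("latents") and feed them straight
# into a second model as input embeddings. Assumes both models share a
# hidden size; a learned linear adapter would be needed otherwise.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model_a = AutoModel.from_pretrained("gpt2")   # "AI 1"
model_b = AutoModel.from_pretrained("gpt2")   # "AI 2" (stand-in)

inputs = tok("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    # Last-layer hidden states of AI 1: shape (1, seq_len, hidden_size).
    latents = model_a(**inputs, output_hidden_states=True).hidden_states[-1]
    # Hand them to AI 2 in place of its own token embeddings.
    out = model_b(inputs_embeds=latents, attention_mask=inputs["attention_mask"])

print(out.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```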
It also works for humans, at least somewhat. E.g., the Eyeronman vests I mentioned translate 3-D scene representations into vibrations. After enough time with one, people can pick up a sense of what the environment around them is like through the vibrations from the vest.
Translating LLM patents into visual input wouldn’t look like normal diagrams. It would look like a random-seeming mishmash of colors and shapes which encode the LLM’s latents. A person would then be shown many pairs of text and the encoded latents the model generated for the text. In time, I expect the person would gain a “text sense” where they can infer the meaning of the text from just the visual encoding of the model’s latents.
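For illustration only, one arbitrary way such an encoding could be produced (this is my toy construction, not a proposal from the comment above) is to fold a latent vector into a small RGB grid and show it next to the text it came from:

```python
# Toy "mishmash" encoding: normalise a 1-D latent vector to [0, 1] and
# tile it into a side x side x 3 colour grid for a human to look at.
import numpy as np
import matplotlib.pyplot as plt

def latents_to_image(latent_vec, side=16):
    v = np.asarray(latent_vec, dtype=np.float32)
    v = (v - v.min()) / (v.max() - v.min() + 1e-8)  # scale to [0, 1]
    v = np.resize(v, side * side * 3)               # repeat/truncate to fit the grid
    return v.reshape(side, side, 3)

# Stand-in 768-d latent; in practice this would come from the model,
# e.g. latents[0, -1].numpy() in the sketch above.
img = latents_to_image(np.random.default_rng(0).normal(size=768))
plt.imshow(img)
plt.title("encoded latents for: 'The cat sat on the mat.'")
plt.show()
```

The person would then see many such text/image pairs and, with enough exposure, hopefully start reading meaning off the colour patterns directly.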
I think I’m lacking some jargon here. What’s a latent/patent in the context of a large language model? “patent” is ungoogleable if you’re not talking about intellectual property law.
The Eyeronman link didn’t seem very informative. No explanation of how it works. I already knew sensory substitution was a thing, but is this different somehow? Is there some neural net pre-digesting its outputs? Is it similarly a random-seeming mishmash? Are there any other examples of this kind of thing working for humans? Visually?
Would the mishmash from a smaller text model be any easier/faster for the human to learn?
My money’s on: typo.