I apologize. After seeing this post, A—approached me and said almost word for word your initial comment. Seeing as the topic of whether in-context learning counts as learning isn’t even very related to the post, and this being your first comment on the site, I was pretty suspicious. But it seems it was just a coincidence.
If physics were deterministic, we’d do the same thing every time we started from the same state. Does that mean we’re not intelligent? Presumably not, because in this case the cause of the intelligent behavior clearly lives in the state, which is highly structured, and not in the time evolution rule, which seems blind and mechanistic. With GPT, the time evolution rule is clearly responsible for proportionally more, and does have the capacity to deploy intelligent-appearing but static memories. I don’t think this means there’s no intelligence/learning happening at runtime. Others in this thread have given various reasons, so I’ll just respond to a particular part of your comment that I find interesting, about the RNG.
I think the RNG is actually an important component for actualizing simulacra that aren’t mere recordings in a will. Stochastic sampling enables symmetry breaking at runtime: the generation of gratuitously specific but still meaningful paths. A stochastic generator can encode only general symmetries that are much less specific than individual generations. If you run GPT at temperature 1 for a few words, the probability of the whole sequence will usually be astronomically low, but it may still be intricately meaningful: a unique and unrepeatable (without the random seed) “thought”.
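To make the point about sequence probability concrete, here is a rough sketch in Python. It does not use GPT: `toy_next_token_probs` is a hypothetical stand-in for a model (a fixed, deterministic function from context to a next-token distribution), and `VOCAB_SIZE`, the logit scale, and the seeds are arbitrary assumptions. The only thing it is meant to show is that the "model" is the same every time, while the sampler's seed picks out one specific path whose total log-probability shrinks with every token.

```python
import math
import random

VOCAB_SIZE = 50  # hypothetical toy vocabulary, not GPT's


def toy_next_token_probs(context, temperature=1.0):
    """Hypothetical stand-in for a language model.

    A pure function of the context: the same context always yields the
    same next-token distribution (the "time evolution rule" is fixed).
    """
    # Seed a private RNG from the context so the pseudo-logits are
    # deterministic; string seeds are hashed reproducibly by `random`.
    model_rng = random.Random(repr(tuple(context)))
    logits = [model_rng.gauss(0.0, 2.0) for _ in range(VOCAB_SIZE)]
    scaled = [logit / temperature for logit in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(length, seed, temperature=1.0):
    """Sample one path and return it with its total log-probability."""
    sampler_rng = random.Random(seed)  # the RNG that breaks the symmetry
    context = []
    log_prob = 0.0
    for _ in range(length):
        probs = toy_next_token_probs(context, temperature)
        token = sampler_rng.choices(range(VOCAB_SIZE), weights=probs)[0]
        log_prob += math.log(probs[token])
        context.append(token)
    return context, log_prob


seq_a, lp_a = sample_sequence(30, seed=0)
seq_b, lp_b = sample_sequence(30, seed=0)  # same seed: same path
seq_c, lp_c = sample_sequence(30, seed=1)  # different seed: its own path

print("same seed reproduces the path:", seq_a == seq_b)
print("log-probability of path A:", lp_a)
print("log-probability of path C:", lp_c)
```

With a real model the numbers would differ, but the shape of the argument is the same: the generator only holds the distribution, and the seed is what selects one particular sequence out of it.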