I think that in the limit of infinite, truthful training data, with sufficient abstraction, it would not necessarily be different. We too form our beliefs from “training data”, after all; we’re just highly multimodal and smart enough to know the difference between a science textbook and a fantasy novel. An LLM may not have that distinction perfectly clear, though it does grasp it to some extent.
There’s no evidence that we do so based solely on token prediction, so that’s irrelevant.
I just don’t really understand in what way “token prediction” is anything less than “literally any possible function from a domain of all possible observations to a domain of all possible actions”, at least if your “tokens” cover enough of the space of things you might want to do or say.
I think a significant part of the problem is not the LLM’s trouble distinguishing truth from fiction; rather, it’s convincing it through your prompt that the output you want is the former and not the latter.