Nassim Taleb: If a chatbot writes a complete essay from a short prompt, the entropy of the essay must be exactly that of the initial prompt, no matter the length of the final product.
If the entropy of the output > prompt, you have no control over your essay.
In the Shannon sense, with a temperature of 0, you send the prompt and the receivers will recreate the exact same message. In a broader sense, with all BS being the same ornament, receivers get the same message in different but equivalent wrapping.
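A minimal toy sketch of that "temperature of 0" claim (random made-up weights, not any real model or API): with greedy, temperature-0 decoding the output is a deterministic function of (prompt, weights), so two receivers holding the same weights recreate the exact same message from the same prompt.

```python
import numpy as np

# Toy stand-in for a language model: next-token logits come from a fixed
# random matrix W. Nothing here is a real model; the only point is that
# temperature-0 (greedy) decoding has no randomness left in it.
rng = np.random.default_rng(0)
VOCAB = 50
W = rng.normal(size=(VOCAB, VOCAB))  # fixed, shared "model weights"

def next_token(context, temperature):
    logits = W[context[-1]]
    if temperature == 0:
        return int(np.argmax(logits))       # greedy: fully deterministic
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return int(rng.choice(VOCAB, p=probs))  # temperature > 0: sampling

def generate(prompt, n_tokens=20, temperature=0.0):
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(next_token(out, temperature))
    return out

prompt = [7, 42]
# Same prompt, same weights, temperature 0 -> exactly the same "essay".
assert generate(prompt) == generate(prompt)
```

At any temperature above 0 the two calls would generally diverge, which is the sense in which the extra entropy is not controlled by the prompt.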
So...how would this not be a fully general claim that answering questions is impossible? Wouldn’t it apply equally well to a dictionary, encyclopedia, or search engine? Or heck, asking an expert in person?
It’s wrong for the same reason in all of these cases: the thing being given the prompt is not at maximum entropy. It contains information correlated with what is in the prompt but not present in the prompt-generator, and it returns a response based both on the prompt and on what it already contained. Whether it understands what it contains is irrelevant.
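To put that rebuttal in Shannon's own terms (a sketch in my notation, not anything from the thread): write P for the prompt, K for whatever the responder already contains (book, expert, model), and R for the response. If the response is determined by prompt plus stored knowledge, R = f(P, K), then

```latex
H(R \mid P, K) = 0
\qquad\text{and yet}\qquad
H(R \mid P) = I(R; K \mid P) ,
```

so the prompter can gain a positive number of bits (exactly the part of K that the response reveals), and that number is zero only when the responder holds nothing relevant beyond what was in the prompt. The claim that the essay's entropy equals the prompt's only follows once you also condition on K.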
Concretely: If I pull out the old World Book encyclopedias in my parents’ basement and look up “Elephant,” I get back a whole essay of information. Some of it I probably knew, some I probably didn’t, and some is probably wrong or obsolete. Sure, in some sense the word “elephant” may be said to “contain” all that information as part of its definition, but so what? My elephant concept didn’t previously contain that essay, and I’m the one who generated the prompt and read the output. I learned, therefore the system generated output I did not input.
Or I guess that’s the point? In the case of an encyclopedia I don’t want to control every aspect of the output, but I do want everything in an LLM-generated essay to be correct in a way I can verify, and reasoned according to logical principles? Well, sure, that’s fine. I just have to accept a small amount of checking facts I didn’t know and confirming the validity of the reasoning, in exchange for saving time writing and researching. How is this different from checking multiple sources when writing an essay myself, or from how my boss reads what I write and asks clarifying questions instead of just passing it off verbatim as his own knowledge?
As Zvi points out, what counts as “new information” depends on the person reading the message.
Taking all the knowledge and writing and tendencies of the entire human race and all properties of the physical universe as a given, sure, this is correct. The response corresponds to the prompt; all the uniqueness has to be there.
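Spelling that concession out in the same sketch notation (W here standing for the model weights plus everything the training process baked into them, a framing of mine rather than the commenter's): at temperature 0 the essay is a deterministic function R = f(P, W), so

```latex
H(R \mid W) \;\le\; H(P \mid W) \;\le\; H(P) ,
```

i.e. once W is taken as given, whatever uncertainty remains in the essay really does trace back to the prompt, even though the unconditional H(R) can be far larger than H(P), because W itself carries information.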
Presumably when you look something up in an encyclopedia, it is because there is something that other people (the authors of the encyclopedia) know that you don’t know. When you write an essay, the situation should be exactly the opposite: there is something you know that the reader of the essay doesn’t know. In this case the possibilities are:
1. The information I am trying to convey in the essay is strictly contained in the prompt (in which case GPT is just adding extra unnecessary verbiage)
2. GPT is adding information known to me but unknown to the reader of the essay (say I am writing a recipe for baking cookies and steps 1-9 are the normal way of making cookies but step 10 is something novel like “add cinnamon”)
3. GPT is hallucinating facts unknown to the writer of the essay (theoretically these could still be true facts, but the writer would have to verify this somehow)
Nassim Taleb’s point is that cases 1 and 3 are bad. Zvi’s point is that case 2 is frequently quite useful.
Thanks, that’s a really useful summing up!
I would add that case 1 is also useful, as long as the prompter understands what they’re doing: rewriting the same information in many styles for different uses, contexts, and audiences with minimal work is valuable.
Hallucinating facts is bad, but if they’re the kind of thing you can easily identify and fact-check, it may not be too bad in practice. And the possibility of GPT inserting true facts is actually also useful, again as long as they’re things you can identify and check. Where we get into trouble (at current and near-future capability levels) is when people and companies stop editing and checking output before using it.