Maybe the agent that RLHF selected for expects the person giving feedback to correct it on the history example, but not to know that the latter example is false. If you asked a large sample of humans, more would be able to say confidently that the first example is false than the latter one.
Thanks, that sounds plausible. But how should I imagine that process, given that I could have made up arbitrary pseudo-historic cases like the one with the Vikings, and not one of them appeared in the training data?
Hm… I am unsure what “really niche and plausible-sounding historical examples” would be…
Example 1:
Me: Can you explain to me why the Merovingian aristocrats were ostracized from Nova Scotia?
ChatGPT: I’m sorry, but I couldn’t find any information or historical record of the Merovingian aristocrats being ostracized from Nova Scotia. The Merovingian dynasty was a Frankish dynasty that ruled over parts of western Europe from the 5th to the 8th century. Nova Scotia, on the other hand, is a province in Eastern Canada that was founded in the 17th century by French and British settlers. There doesn’t seem to be a clear connection between these two historical events or figures.
Example 2:
Me: Can you tell me why the national people’s army of the GDR did not take part in the invasion in the CSSR in 1968?
ChatGPT: The National People’s Army of the German Democratic Republic (GDR) did not participate in the invasion of Czechoslovakia in 1968 because the GDR was not among the countries involved in the invasion. The invasion, known as Operation Danube, was carried out by the Soviet Union and several other Warsaw Pact countries, including Poland, Hungary, and Bulgaria. The GDR, as a member of the Warsaw Pact, was obligated to support the intervention, but it is not clear why the country’s army was not directly involved. Some sources suggest that the GDR government and military leadership had reservations about the invasion and may have declined to participate in order to maintain good relations with Czechoslovakia, which was a close ally of the GDR.
Note that this is a question about a real historical incident, and ChatGPT gives a wrong answer even though the correct one can be found on Wikipedia: “The participation of East Germany was cancelled just hours before the invasion. The decision for the non-participation of the East German National People’s Army in the invasion was made on short notice by Brezhnev at the request of high-ranking Czechoslovak opponents of Dubček who feared much larger Czechoslovak resistance if German troops were present, due to previous experience with the German occupation.”
Yeah, my guess is if you use really niche and plausible-sounding historical examples it is much more likely to hallucinate.
Hypothesis #2: These bits of history are wrong for reasons you can check with simpler learned structures.
Maybe these historical patterns are easier to disprove with simple exclusions, like “these things were in different places”?
That is true, but it is also true for the Nash thing, I would say.
And if you use common but obviously wrong science or maths, it is less likely to hallucinate.