Judea Pearl proposes a mini-Turing test for checking whether a machine understands causality.
The test is "mini" because the causal relations are encoded in advance, before the conversation begins, perhaps through fine-tuning or through prompts. You then ask associational, causal, and counterfactual questions to see whether the LLM gives the right answers. The PaLM authors claim causal knowledge, but the examples they provide are hardly rigorous:
https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLmQjS3gOQ2x7ru3xovYjVw-Yr2fKDCqhDHByQZitD92Yu4L-v2BBa5f_VMfpWM4D0930Dmk35EY1TqGrYUtMQqJO41hkLqXuu51eOpXZ3PvYPSjf5stfEJNJn2idWnRYCCEgBiJuLDTXX5Fgt-Mk13kCKdO12JShGvDO_cArtLKv8U8obJaHiL5ASQg/s1320/Big%20Bench%20Sped%20Up%20Cropped.gif
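To make the three rungs of questions concrete, here is a minimal sketch of the ground truth a grader could use, based on Pearl's firing-squad story (court order → captain → riflemen A and B → prisoner's death). The variable names and the `a_override` intervention hook are illustrative assumptions, not Pearl's notation; an LLM's answers would be compared against these values.

```python
# A toy structural causal model for Pearl's firing-squad example.
# Each assignment below is one structural equation; a_override simulates
# the intervention do(A = ...), cutting A off from its usual cause.

def model(court_order, a_override=None):
    captain = court_order                              # captain relays the order
    a = captain if a_override is None else a_override  # rifleman A (interventable)
    b = captain                                        # rifleman B
    dead = a or b                                      # either shot suffices
    return {"captain": captain, "A": a, "B": b, "dead": dead}

# Rung 1 (association): in the observed world with a court order,
# A fires and the prisoner dies.
observed = model(court_order=True)

# Rung 2 (intervention): do(A = False). B still fires, so the
# prisoner still dies.
intervened = model(court_order=True, a_override=False)

# Rung 3 (counterfactual): given we observed the death, would the
# prisoner be dead had A not fired? Here this coincides with the
# intervention because the exogenous background (the court order)
# is held fixed at its observed value.
counterfactual = model(court_order=True, a_override=False)

ground_truth = {
    "Did A fire?": observed["A"],
    "If A had not fired, would the prisoner be dead?": counterfactual["dead"],
}
```

An LLM that has absorbed the causal graph should answer "yes" to both questions; a purely associational model tends to stumble on the second, since "A did not fire" is correlated with "no court order" in the observed data.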