An Opinionated Look at Inference Rules
If you ask around about the typical ways to infer information, most people will answer: deduction, induction, and abduction. Of course, there are more ways than that, but there is no unified approach to classifying them. I want to challenge that.
I am unhappy with the status quo because it does not take advantage of the expressive capabilities of Large Language Models such as ChatGPT. There have been many attempts to understand what kinds of inferences can be correctly stated by LLMs (example here) - however, I believe that a more thorough classification of the inference rules would empower future models and provide advanced ways of extracting coherent and useful information from them.
Disclaimer: this post is highly opinionated, but I am confident that it will provide the reader with interesting insights about what it means to “argue” rationally. Even if you disagree with me, you will be exposed to stimulating ideas (in my opinion).
Inference Rules
Let’s start with a definition: what is a “rule of inference”[1]? It is a “discursive” computational process[2] denoted by the symbol ↪ having the following signature.
It takes as input:
a context, aka “knowledge” (in the form of assertions in natural language) from which we can freely extract any number of known facts we need. The context can, in principle, be infinite and it mostly serves a theoretical purpose, as a pool of information content that we trust to a certain degree[3]. It is usually represented by some LLM, by some human expert, or by the corpus of knowledge of some well-established discipline (e.g. the context of biology)
a purpose, aka a goal that we’d like to achieve. It has a form similar to: “Deduce that Socrates is mortal”. It usually consists of a prompt for some LLM, if we choose an LLM as our context.
It returns as output:
some premises, aka a finite and consistent subset of the context, whose assertions are acting as the preliminary assumptions for giving some conclusion
a conclusion, aka a statement that achieves the given purpose
a proof, aka an argument that makes the achievement believable. It doesn’t need to be a mathematical proof: it only needs to appear believable.
In symbols:
Context, Purpose ↪ Premises, Conclusion, Proof
The conclusion of an inference is not necessarily true, but it can be used to generate true knowledge once its validity is tested by some other means (e.g. empirically).
This approach differs significantly from the typical definitions: I never focus on the argumentation method[4] of the rule, only on its purpose. The method is only a practical tool that can be used to improve the quality of the inference; moreover, most methods can be applied to any kind of rule, regardless of whether it is a deduction, an induction, an abduction, or something else.
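To make the signature above concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the names `Inference`, `InferenceRule`, and `toy_deduction` are hypothetical, the context is modelled as a finite set of strings (while in principle it can be infinite), and the rule is hard-coded for the Socrates example rather than backed by an LLM.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

Assertion = str  # an assertion in natural language


@dataclass(frozen=True)
class Inference:
    """The output side of the signature: Premises, Conclusion, Proof."""
    premises: FrozenSet[Assertion]  # finite subset of the context
    conclusion: Assertion           # statement that achieves the purpose
    proof: str                      # argument that makes it believable


# Context, Purpose -> Premises, Conclusion, Proof
InferenceRule = Callable[[FrozenSet[Assertion], str], Inference]


def toy_deduction(context: FrozenSet[Assertion], purpose: str) -> Inference:
    """Hypothetical rule, hard-coded for the Socrates example (purpose is ignored)."""
    premises = frozenset({"All men are mortal", "Socrates is a man"})
    if not premises <= context:
        raise ValueError("the context does not contain the needed premises")
    return Inference(
        premises=premises,
        conclusion="Socrates is mortal",
        proof="All men are mortal and Socrates is a man; "
              "therefore, Socrates must be mortal.",
    )


context = frozenset({"All men are mortal", "Socrates is a man", "Water is wet"})
result = toy_deduction(context, "Deduce that Socrates is mortal")
print(result.conclusion)
```

Note that the premises are a subset of the context, and that the proof is just a string: nothing here checks that it is believable.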
Basic Inference Rules
I will now present the four simplest inference rules:
Deduction
Reduction
Induction
Abduction
You should already know all of them, with the possible exception of reduction (which is very atypical, but commonly used in sciences such as chemistry).
Deduction
Etymology: deduction means “bring down”, as in reaching something.
Purpose: deductions are used to infer some desired claim.
Intuitive definition: a deduction claims a desired conclusion (=the claim) by discovering relations with a finite set of trusted premises (=the basis).
While the claim may be quite remarkable, it won’t contain more information than its premises.
Formal definition: Context, Deduce this Claim ↪ Basis, Claim, Proof
where Proof: Basis ⇒ Claim
and Context ⊃ Basis.
Example: Deduce that Socrates is mortal ↪ All men are mortals and Socrates is a man; therefore, Socrates must likewise be mortal.
Keyword: must.
Counterfactual: while deductions (especially in proof calculus) are usually considered bullet-proof, they have some limits when applied naïvely: in the Liar Paradox, for example, it is impossible to deduce any irrefutable conclusion.
Reduction
Etymology: reduction means “bring back”, as in returning something.
Purpose: reductions are used to obtain new results.
Intuitive definition: a reduction evaluates a given assumed premise (=the assumption) by showing relations with some original conclusion (=the consequence).
The consequence may be ingenious, but it will represent just one of many possible outcomes.
Formal definition: Context, Reduce this Assumption ↪ Basis, Assumption→Consequence, Proof
where Proof: Basis + Assumption ⇒ Consequence
and Basis ⇏ Consequence
and Context ⊃ Basis
Example: Reduce what happens if I release Hydrogen in the air ↪ Since our atmosphere contains Oxygen, a chemical reaction will produce water and heat.
Keyword: will.
Counterfactual: as for deductions, a naïve application may lead to paradoxical or unreliable conclusions.
Induction
Etymology: induction means “take in”, as in including something.
Purpose: inductions are used to justify a new idea.
Intuitive definition: an induction posits a desired assertion (=the hypothesis) by showing relations with some known fact (=the observation).
The hypothesis may in principle be unprovable but, until it is falsified, it should predict interesting observations.
Formal definition: Context, Induce this Hypothesis ↪ Basis, Hypothesis→Observation, Proof
where Proof: Basis + Hypothesis ⇒ Observation
and Basis ⇏ Observation
and Context ⊃ Basis + Observation
Example: Induce that the sun will set tomorrow ↪ The sun has set every single day of your life without fault; analogously, it should set tomorrow as well.
Keyword: should
Counterfactual: although it was allegedly believed that all swans were white, that belief was falsified in 1697 by the discovery of black swans.
Abduction
Etymology: abduction means “take away”, as in removing something.
Purpose: abductions are used to interpret a fact, usually by discarding implausible cases.
Intuitive definition: an abduction explains some given enigmatic assertion (=the enigma) by proposing relations with some interpretative assertion (=the explanation).
The explanation may not be certain but, if sound, it might represent the correct solution to the enigma.
Formal definition: Context, Abduce this Enigma ↪ Basis, Explanation→Enigma, Proof
where Proof: Basis + Explanation ⇒ Enigma
and Basis ⇏ Enigma
and Context ⊃ Basis
Example: Abduce the reason why I hear hoofbeats ↪ Hoofbeats are usually caused by horses, consequently some might be close-by.
Keyword: might
Counterfactual: at a circus, the hoofbeats may be caused by zebras, not horses.
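The formal definitions of reduction, induction, and abduction share one shape: Basis + X ⇒ Y, while Basis alone ⇏ Y. Here is a minimal sketch of that shape in Python, under strong assumptions of my own: “⇒” is replaced by naive forward chaining over if-then rules, the basis is split into plain facts and rules, and the names `entails` and `is_conditional_inference` are hypothetical. It says nothing about how believable a discursive proof is.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)


def entails(facts: Iterable[str], rules: Iterable[Rule], target: str) -> bool:
    """Forward chaining: True if target is derivable from facts using rules."""
    known: Set[str] = set(facts)
    rule_list: List[Rule] = list(rules)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rule_list:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return target in known


def is_conditional_inference(basis, rules, added, target) -> bool:
    """Check Basis + added => target, and Basis alone =/=> target."""
    with_added = entails(set(basis) | {added}, rules, target)
    without_added = entails(basis, rules, target)
    return with_added and not without_added


RULES = [
    (frozenset({"hydrogen is released", "the air contains oxygen"}),
     "water and heat are produced"),
    (frozenset({"a horse is nearby"}), "I hear hoofbeats"),
]
BASIS = {"the air contains oxygen"}

# Reduction: Basis + Assumption => Consequence
reduction_ok = is_conditional_inference(
    BASIS, RULES, "hydrogen is released", "water and heat are produced")
# Abduction: Basis + Explanation => Enigma
abduction_ok = is_conditional_inference(
    BASIS, RULES, "a horse is nearby", "I hear hoofbeats")
print(reduction_ok, abduction_ok)
```

The same check covers both examples: what separates a reduction from an abduction is the purpose (which side is given and which side is sought), not the logical shape.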
Recap
Before proceeding to define more exotic inference rules, I want to remark on a few points:
Reductions are typically not considered as inference rules. That is the correct perspective in the context of mathematical logic because, in the end, both reductions and deductions share a similar nature; however, I don’t think that is the correct perspective in the context of discursive argumentation: asking for some proof of a given statement (aka a deduction) is not the same as asking for some ingenious consequence of a given assumption (aka a reduction). Those are very different exercises.
The typical definition of deduction will explicitly mention that the end result must be certain (assuming that all its premises are true); in my definition, that is not important at all! Certainty is a nice bonus, but the purpose and form of the argument is more important. For example: while the Liar Paradox does not provide any certain result, it is still classified as a valid example of deduction according to my definition.
Similarly, the typical definition of induction will explicitly mention unprovability[5] while the typical definition of abduction will explicitly mention plausibility. Such approaches focus on the argumentative methods rather than the purposes! In my opinion, the main difference between induction and abduction is that the first is trying to convince you to accept a given explanation, while the second is generically looking for any possible explanation (similarly to the difference between deduction and reduction).
Depending on the chosen argumentative method, some statements that are considered “abductions” according to the typical definition may instead be considered “inductions” according to my definition. That is painful, but I still believe that my definition provides a powerful standard to classify such rules.
Advanced Inference Rules
I will now present some exotic inference rules, the “next level” in terms of the complexity of their formal definitions:
Reproduction
Introduction
Retroduction
The rules above are not present in the literature, but I believe that you will find their examples quite familiar nonetheless.
Reproduction
Etymology: reproduction means “bring forth again”, as in repeating something.
Purpose: reproductions are used to validate a reduction.
Intuitive definition: a reproduction replicates through a test (=the test) a given assertion (=the result) by setting appropriate controlled variables (=the setup).
The test may fail but, if it succeeds, it reproduces the expected result.
Formal definition: Context, Reproduce this Result ↪ Basis, Setup→Test→Result, Proof
where Proof: Basis + Setup + Test ⇒ Result
and Basis + Setup ⇏ Result
and Basis + Result ⇒ Setup (reduction)
and Basis ⇏ Setup
and Context ⊃ Basis
Example: Reproduce a boiled egg ↪ As the name implies, you need an egg and boiling water. So, do submerge an egg into boiling water for some time.
Keyword: do
Counterfactual: if the only working setup is irreproducible, you won’t be able to generate a test.
Introduction
Etymology: introduction means “lead inside”, as in introducing something.
Purpose: introductions are used to confirm an induction.
Intuitive definition: an introduction examines through a test (=the confirmation) a desired assertion (=the supposition) with the support of some known fact (=the clue).
The confirmation may not succeed but, if it does, it would substantiate the supposition and explain the clues.
Formal definition: Context, Introduce this Supposition ↪ Basis, Clue→Confirmation→Supposition, Proof
where Proof: Basis + Clue + Confirmation ⇒ Supposition
and Basis + Clue ⇏ Supposition
and Basis + Supposition ⇒ Clue (induction)
and Basis ⇏ Clue
and Context ⊃ Basis + Clue
Example: Introduce a way to implicate the main suspect ↪ Confirming the alibi of all the secondary suspects would implicate the main one.
Keyword: would
Counterfactual: if two suspects were identical twins and they decided to switch, it could be practically impossible to devise some investigation to identify the criminal sibling.
Retroduction
Etymology: retroduction means “lead backward”, as in retreating from something.
Purpose: retroductions are used to revisit an abduction.
Intuitive definition: a retroduction refines a given unsatisfactory assertion (=the inadequacy) by devising relations with some testable assertion (=a conditional sentence in the form Inspection→Clarification).
The inspection may be inconclusive but, if it is conclusive, it could address and refine the inadequacy.
Formal definition: Context, Retroduce this Inadequacy ↪ Basis, Inadequacy→Inspection→Clarification, Proof
where Proof: Basis + Inadequacy + Inspection ⇒ Clarification
and Basis + Inadequacy ⇏ Clarification
and Basis + Clarification ⇒ Inadequacy (abduction)
and Basis ⇏ Inadequacy
and Context ⊃ Basis
Example: Retroduce some classification for this unknown specimen, considering that it bears some resemblance to a crab ↪ The presence of some non-crustacean characteristic could confirm it is a false crab.
Keyword: could
Counterfactual: as for introductions, there are hypothetical scenarios where it is impossible to devise any reliable test to refine the current knowledge.
Recap
A few pointers:
Some authors use the term “retroduction” as a synonym of “abduction”. That is not the way I use it in this post: the two rules are distinct, although related.
The advanced rules are a way to “invert” the basic rules: in other terms, they were created by replicating the structure of some basic rule while adding some form of test in the middle.
Reproductions invert some reduction.
Introductions invert some induction.
Retroductions invert some abduction.
It is easy to see that deductions are not “invertible” in that way.
In general, there is no limit to the complexity of an inference rule: it is possible to create as many new rules as desired! The difficult part is finding a meaningful interpretation to describe how they can be applied in practice.
Types of Knowledge
To show why I believe that the classification above is important (even more so when extracting knowledge from some LLM!), I am going to explain the role it plays in the generation of new reliable information.
To do so, let me first categorize all knowledge into four general types: Exact, Experimental, Empirical, and Evidential. This categorization is not new, but I am adding a few twists.
Exact Type
Exact, aka logical. Exact knowledge is related to formal sciences such as:
Mathematics
Theoretical Statistics
Information Theory
Computer Science
This type of knowledge employs deductions (for publication) and reductions (for research) at its core, by virtue of definitions, designations, postulates, and proofs. All the other types of inferences are used as well, especially when dealing with conjectures, when trying to hypothesise axioms, or when interpreting semantics of models.
This approach is not always able to extract knowledge (see this list of paradoxes) - but at least it can provide certainty about its conclusions.
Experimental Type
Experimental, aka scientific. Experimental knowledge is related to natural sciences such as:
Physics
Chemistry
Astronomy
Biology
This type of knowledge employs inductions (for speculation) and reproductions (for validation) at its core, by virtue of physical laws, measurements, plausibility, and quality assurance. However, all the other types of inferences are used as well: for example, deductions are at the core of theoretical physics.
This approach is not always able to prove its knowledge since, in principle, it is impossible to divine the future—but at least its conclusions can be subject to falsification[5].
Empirical Type
Empirical, aka observational. Empirical knowledge is related to social sciences such as:
Sociology
Political Science
Economics
Anthropology
This type of knowledge employs abductions (for interpretation) and reproductions (for confirmation) at its core, by virtue of modelling, sampling, causation, and observation. However, all the other types of inference are used as well: for example, reductions are employed by using computer simulations.
This approach is not always able to verify or falsify its knowledge since it is hard to pinpoint specific cause/effect relationships—but at least its conclusions can be cross-checked by using statistics.
Evidential Type
Evidential, aka factual or documentarian. Evidential knowledge is related to human studies such as:
History
Archeology
Medicine
Forensic Science
This type of knowledge employs introductions (for diagnostics) and retroductions (for clarification) at its core, by virtue of facts, recordings, findings, and examples. However, all the other types of inference are used as well, since such disciplines are closely intertwined with scientific studies.
This approach is not always able to verify or falsify its knowledge since it is prone to fabrications, misinterpretations, or red herrings—but at least its conclusions can be confirmed to be compatible with the current knowledge.
Scientific Research & Literature
To complete this post, I am going to explain the research & publishing cycles, as commonly used nowadays in science. You can see that the inference rules and their classification are going to play a key role here.
Scientific Research
How does science advance over time? Through the continuous cycle of research! Let me quote D. I. Spivak: «In the context of a scientific model, a hypothesis assumed by a person produces a prediction, which motivates the specification of an experiment, which when executed results in an observation, which analysed by a person yields a hypothesis[6]».
The loop shifts around but never ends: that’s the reason why progress can be continuously made, and knowledge constantly discovered.
Each phase of the cycle involves a specific type of reasoning that is strongly represented by a specific type of inference (see below).
In the context of a scientific model:
a hypothesis assumed by a person produces a prediction →
REDUCTION: reduce my hypothesis to produce some compelling prediction
[the prediction] motivates the specification of an experiment →
INTRODUCTION: introduce my prediction to specify some supporting experiment
[the experiment] when executed results in an observation →
REPRODUCTION: reproduce my experiment to return some verified observation
[the observation when] analysed by a person yields a hypothesis →
ABDUCTION: abduce my observation to yield some explaining hypothesis.
Let’s clarify the process by using an example: consider the hypothesis “all the odd integers are prime numbers” (that’s clearly false, but suppose we don’t know that yet).
Reduce that 9 is prime since it is odd.
Introduce a primality test for 9: it must not be divisible by any integer from 2 to 8.
Reproduce the primality test. Notice that it fails since 9 is divisible by 3 (the loop did not close, so the cycle must be attempted again! New insights were generated in the process).
Abduce that all the odd integers are prime, except 9 (this hypothesis is still incorrect, but it gets closer to the truth).
At some point, we will consider the idea that there are infinitely many odd integers that are not prime, thus obtaining a better understanding of the concept of “primality”, and maybe we will even be able to prove that intuition with a deduction. Such a process will be shown in the next section.
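The repeated cycle on the “all odd integers are prime” hypothesis can be sketched as a small Python loop. This is only an illustration under my own assumptions: the function names are hypothetical, each pass of the loop stands for one full turn of the cycle, and the “refined hypothesis” is simply the list of exceptions found so far.

```python
def passes_primality_test(n: int) -> bool:
    """The test from the text: n must not be divisible by any integer from 2 to n - 1."""
    return n > 1 and all(n % d != 0 for d in range(2, n))


def research_cycle(candidates):
    """Run one turn of the cycle per odd candidate and collect the exceptions."""
    exceptions = []  # refinements: "all odd integers are prime, except ..."
    for n in candidates:
        if n % 2 == 0:
            continue  # the hypothesis only talks about odd integers
        # Reduction: the hypothesis predicts that n is prime.
        # Introduction: specify the primality test for n.
        # Reproduction: actually run the test.
        if not passes_primality_test(n):
            # Abduction: the surprising observation refines the hypothesis.
            exceptions.append(n)
    return exceptions


found = research_cycle(range(3, 30))
print(found)  # [9, 15, 21, 25, 27]
```

The growing list of exceptions is what eventually suggests that the original hypothesis should be abandoned rather than patched.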
Scientific Literature
In the context of the publication of scientific content, the cycle above is actually inverted, since its purpose is to build confidence in some result rather than to search for one. Moreover, the loop must start and stop at the very same place.
The process looks like this[7]: «In the context of a scientific study, an enigmatic observation investigated by a person motivates the specification of an experiment, which when executed strengthens or weakens some belief, which when reformulated develops a new law, which when proposed by a person leads to a possible solution of the original enigma».
In the context of a scientific study:
an enigma motivates the specification of an experiment →
RETRODUCTION: retroduce my enigma so as to yield some testable clarification
[the experiment] strengthens or weakens a belief →
REPRODUCTION: reproduce my experiment so as to build up confidence about some intuition
[the belief] is reformulated by a person as a law[8] →
DEDUCTION: deduce some general law inspired by my belief
[the law] proposed by a person solves the original enigma →
INDUCTION: induce my law to justify this interpretation of the original enigma.
Example: assume that a new crab-like specimen has just been found.
Retroduce that we can test for typical crustacean characteristics, such as the presence of claws and antennae.
Reproduce the test and observe that the specimen exhibits every known crustacean characteristic. This fact builds confidence in the idea that it is a true crab.
Deduce that this specimen is either a false crab with every known crustacean characteristic or an actual crustacean, and that the latter is the most plausible alternative.
Induce that the new specimen is actually a crustacean. That naturally explains the crab-like appearance we noticed initially.
A Scent of Logical Calculus
I want to conclude this post by showing an example that, in my opinion, looks very close to being a formal derivation in some modal calculus modelling scientific discoveries. There are many pieces that don’t fit as nicely as I’d like, but I hope to receive support from the community in making the following argument rigorous and consistent.
Context: imagine this is the year 1697 and that Australia is still largely unexplored. Consider the following hypothesis.
Hypothesis: all swans are white.
Use the Hypothesis as your assumption to reduce some valid consequence. The consequence is going to represent your Prediction.
REDUCTION: (Assumption) All swans are white → (Consequence) If you explore a new land and you find a swan, it will be a white swan.
Prediction: if you explore a new land and you find a swan, it will be a white swan.
Use the Prediction as your supposition and introduce a clue and some possible confirmation. The confirmation is going to represent your Experiment.
INTRODUCTION: (Clue) Australia is largely unexplored → (Confirmation) If you were to organise an expedition and find swans, you may confirm they are always white → (Supposition) If you explored a new land and you found a swan, it would be a white swan.
Experiment: fund an expedition to Australia and confirm the colour of any found swan specimen.
Use the Experiment as your expected result. Prepare some appropriate setup and reproduce your test. The test is going to represent your Observation.
REPRODUCTION: (Setup) Travel to Australia with a swan expert → (Test) Swan specimen 1 is white, swan specimen 2 is white, … → (Result) During the expedition to Australia, it is confirmed that all found swan specimens are white.
Observation: swan specimen 10 is actually black!
Use the surprising Observation as your enigma, so as to abduce some new explanation. This may represent a possible new starting Hypothesis.
ABDUCTION: (Explanation) All swans are white except in Australia, where they might be black → (Enigma) Australian swan specimen 10 may be white or black.
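One structural property of the derivation above can be checked mechanically: each step hands its output role to the next step, and the last step hands a Hypothesis back to the first. Here is a minimal sketch in Python, where the `Step` class and the `is_closed_cycle` function are hypothetical names of mine; it only checks that the roles chain together, not that any single inference is sound.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    rule: str      # name of the inference rule
    consumes: str  # role taken as input
    produces: str  # role handed to the next step


SWAN_CYCLE = [
    Step("REDUCTION", "Hypothesis", "Prediction"),
    Step("INTRODUCTION", "Prediction", "Experiment"),
    Step("REPRODUCTION", "Experiment", "Observation"),
    Step("ABDUCTION", "Observation", "Hypothesis"),
]


def is_closed_cycle(steps: Sequence[Step]) -> bool:
    """True if every step's output role is the next step's input role (wrapping around)."""
    return all(
        steps[i].produces == steps[(i + 1) % len(steps)].consumes
        for i in range(len(steps))
    )


print(is_closed_cycle(SWAN_CYCLE))
```

Swapping two steps breaks the chain, which is one small sense in which the order of the four rules matters.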
Conclusion
Let’s get back to the starting problems.
Is there a better way to classify the inference rules?
My answer is: yes, there is a better way to classify inference rules! This way is purpose-based rather than method-based. That is “better” because it standardizes the definitions, it provides conceptual tools to invent new inference rules, it fits nicely into the existing knowledge bases, and it provides additional insights into the four steps of the scientific cycles.
If so, can that improve the reasoning capabilities of large language models?
My answer is: possibly yes, because it makes the chain of thought very structured and quite close to a formal derivation (as shown in the last section). That is very promising, although not rigorous at the moment.
I hope you enjoyed reading this. Let me know if you are interested in joining this discussion and providing your feedback.
Addenda
If this post was of interest to you, I recommend viewing this presentation from Gabriele Carcassi, where he explains how to systematize physics on solid logical grounds, a topic he has been working on for quite some time. You may be surprised to look at some of his examples: they show that many common statements are not scientifically verifiable! E.g. “the mass of the photon is exactly 0 eV” may be a physical truth, but it is impossible to test it since it demands infinite precision.
A paper recently introduced a concept called transduction that is applied to LLMs. It describes the act of guessing the correct output of some given predictable inputs, as opposed to an induction (which the paper describes as the act of guessing the entire distribution of some given predictable inputs).
Further Links
Control Vectors as Dispositional Traits (my first post)
All the Following are Distinct (my previous post)
Who I am
My name is Gianluca Calcagni, born in Italy, with a Master of Science in Mathematics. I am currently (2024) working in IT as a consultant with the role of Salesforce Certified Technical Architect. My opinions do not reflect the opinions of my employer or my customers. Feel free to contact me on Twitter or Linkedin.
Revision History
[2024-09-03] Post published.
[2024-09-05] Changed title from “Inference Rules and AI” to “An Opinionated Look at Inference Rules”.
[2024-09-10] Changed preview picture.
[2024-10-01] Included Addenda about Gabriele Carcassi.
[2024-11-11] Included Addenda about Transduction.
[2024-11-26] Included a few recap screenshots.
[2024-11-28] Included reference to conversational game theory.
Footnotes
[1] In mathematical logic, the question received a very structured answer. However, I am attempting to define “inference rules” in a looser way on purpose.
[2] I am aware that there is no such thing as a “discursive” computational process, but this term captures my intuition about what we need to discuss here. It somehow relates to conversational game theory.
[3] But the pool may not be complete and not even consistent! Perfection is not required, we only need levels of confidence.
[4] Examples of argumentation methods: formal derivations, pattern abstractions, statistical likelihood, documented observations, intuitive analogies, etc.
[5] See falsifiability (as per Popper’s epistemology).
[6] Category Theory for Scientists, by David Isaac Spivak, 2013
[7] The following is not a quote from Spivak!
[8] In physics, an “enigma” can be some puzzling observation, but in mathematics, an “enigma” is just some mathematical problem. In physics, a “law” can be unprovable in principle, but in mathematics, a “law” shall always be rigorously proved as a theorem.
Something quite unexpected happened in the past 19 hours: since I published this post, I received over 12 downvotes! I wasn’t expecting lots of feedback anyway, but this time I was definitely caught by surprise by looking at a negative result.
It’s okay if my point of view doesn’t resonate with the community (being popular is not the reason why I write here), however I am intrigued by this reaction and I’d like to investigate it.
If you happen to read my post and you decide to downvote it, please proceed, but I’d appreciate it if you could explain the reason why. I’m happy to be challenged and I will accept even harsh judgements, if that’s how you feel.
I did not read the whole post, but on a quick skim I see it does not say much about AI, except for a paragraph in the conclusion. Maybe people felt clickbaited. Full disclosure: I neither upvoted nor downvoted this post.