If we expand out an arbitrary program, we get a (usually infinite) expression tree. For instance, we can expand
fact(n) := (n == 0) ? 1 : n*fact(n-1)
into
fact(n) = (n == 0) ? 1 : n*(((n-1) == 0) ? 1 : (n-1) * ( (((n-1)-1) == 0) ? … ))))
Let’s call two programs “expression-equivalent” if they expand into the same expression tree (allowing for renaming of input variables). Two interesting problems:
Write an efficient algorithm to decide expression-equivalence of two given programs.
Write a program M which decides whether a given program is expression-equivalent to M itself.
I’m pretty sure these are both tractable, as well as some relaxations of them (e.g. allowing for more kinds of differences between the expressions).
Does anybody know of an existing name for “expression-equivalence”, an existing algorithm for solving either of the above problems, or any relevant literature?
In lambda calculus, this is called beta-equivalence, and is undecidable. (Renaming of variables is called alpha-equivalence, and is typically assumed implicitly.) If you know that two expressions both have beta normal forms, then by the Church-Rosser theorem, you can decide their equivalence by computing their normal forms and comparing for identity.
In some systems of typed lambda calculus, every expression has a normal form, so equivalence is decidable for the whole language, but the cost of that is that the language will not be able to express all computable functions. In particular, you may have difficulty shoehorning a program that can talk about itself into such a system. Self-reference is a basilisk for systems of logic, ever since Russell knocked down Frege’s system by saying, “what about the set of sets that are not members of themselves?”
There are results along these lines for term rewriting systems and graph rewriting systems also.
Decidability of equivalence is broken somewhere between simply typed lambda calculus and System-F. Without recursive types you are strongly normalizing and thus “trivially” decidable. However, just adding recursive types does not break decidability (e.g. see http://www.di.unito.it/~felice/pdf/ictcs.pdf). Similarly, just adding some higher-order functions or parametric polymorphism does also not necessarily break decidability (e.g. see Hindley-Milner). In my (admittedly limited) experience, when making a type system stronger, it is usually some strange, sometimes subtle interaction of these “type system features” that break decidability.
So the first problem raised in the OP is probably tractable for many, quite expressive type systems, even including recursive types.
Though I fully agree with you that the second problem is usually how undecidability proofs start and I’m more skeptical towards that one.
I’m not sure if this is exactly the same but it reminds me a lot of recursive types and checking if two such recursive types are equal (see https://en.wikipedia.org/wiki/Recursive_data_type#Equirecursive_types). I looked into that a few years ago and it seems to be decidable with a relatively easy algorithm: http://lucacardelli.name/Papers/SRT.pdf (the paper is a bit longer but it also shows algorithms for subtyping)
To map this onto your expression problem maybe one can just take expression symbols as “type terminals” and use the same algorithm.
I think this is related to the word problem for the rewriting system defined by your programming language. When I first read this question I was thinking “Something to do with Church-Rosser?”—but you can follow the links to see for yourself if that literature is what you’re after.
I’d call it an instance of https://en.wikipedia.org/wiki/Equivalence_problem—although unusually, your language class only admits one word per language, and admits infinite words.
I’m not convinced f(n) := f(n) should be considered inequivalent from f(n) := f(n+1) - neither coterminates.
I agree that these look tractable.
Given a program O for the first problem, a sufficient condition for M would be M(x) = O(M, x). This can be implemented as M(x) = O(M’(M’),x), where M’(M″,x) = O(M″(M″),x).