Name of Problem?
post by johnswentworth · 2020-03-09T20:15:11.760Z · score: 9 (2 votes) · LW · GW · 25 comments
This is a question post.
If we expand out an arbitrary program, we get a (usually infinite) expression tree. For instance, we can expand
fact(n) := (n == 0) ? 1 : n*fact(n-1)
into
fact(n) = (n == 0) ? 1 : n*(((n-1) == 0) ? 1 : (n-1) * ((((n-1)-1) == 0) ? ... ))
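To make "expand" concrete, here's a sketch of a one-step unfolder in Python. The tuple encoding (`'var'`, `'num'`, `'call'`, operator nodes) and the helper names are my own invention for illustration, not anything standard:

```python
def subst(expr, env):
    """Replace variable nodes with the expressions bound to them in env."""
    kind = expr[0]
    if kind == 'var':
        return env.get(expr[1], expr)
    if kind == 'num':
        return expr
    if kind == 'call':
        return ('call', expr[1], [subst(a, env) for a in expr[2]])
    # operator node: (op, child, child, ...)
    return (kind,) + tuple(subst(c, env) for c in expr[1:])

def unfold(expr, defs, depth):
    """Expand function calls up to `depth` levels; deeper calls stay folded."""
    kind = expr[0]
    if kind in ('var', 'num'):
        return expr
    if kind == 'call':
        if depth == 0:
            return expr
        params, body = defs[expr[1]]
        return unfold(subst(body, dict(zip(params, expr[2]))), defs, depth - 1)
    return (kind,) + tuple(unfold(c, defs, depth) for c in expr[1:])

# fact(n) := (n == 0) ? 1 : n * fact(n - 1)
defs = {'fact': (['n'],
    ('if', ('==', ('var', 'n'), ('num', 0)),
           ('num', 1),
           ('*', ('var', 'n'),
                 ('call', 'fact', [('-', ('var', 'n'), ('num', 1))]))))}

once = unfold(('call', 'fact', [('var', 'n')]), defs, 1)
```

Each extra level of `depth` pushes the remaining `'call'` node one layer deeper; expression-equivalence asks whether two programs agree on these trees in the limit.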
Let's call two programs "expression-equivalent" if they expand into the same expression tree (allowing for renaming of input variables). Two interesting problems:
1. Write an efficient algorithm to decide expression-equivalence of two given programs.
2. Write a program M which decides whether a given program is expression-equivalent to M itself.
I'm pretty sure these are both tractable, as well as some relaxations of them (e.g. allowing for more kinds of differences between the expressions).
Does anybody know of an existing name for "expression-equivalence", an existing algorithm for solving either of the above problems, or any relevant literature?
Answers
In lambda calculus, this is called beta-equivalence, and is undecidable. (Renaming of variables is called alpha-equivalence, and is typically assumed implicitly.) If you know that two expressions both have beta normal forms, then by the Church-Rosser theorem, you can decide their equivalence by computing their normal forms and comparing for identity.
In some systems of typed lambda calculus, every expression has a normal form, so equivalence is decidable for the whole language, but the cost of that is that the language will not be able to express all computable functions. In particular, you may have difficulty shoehorning a program that can talk about itself into such a system. Self-reference is a basilisk for systems of logic, ever since Russell knocked down Frege's system by saying, "what about the set of sets that are not members of themselves?"
There are results along these lines for term rewriting systems and graph rewriting systems also.
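To illustrate the "compute normal forms and compare for identity" strategy, here is a minimal normal-order normalizer for the pure untyped lambda calculus, using de Bruijn indices so that alpha-equivalence becomes plain equality. It's a sketch: no sharing, no laziness, and a step limit instead of a termination guarantee:

```python
# Terms: ('var', i), ('lam', body), ('app', f, a), with de Bruijn indices.

def shift(t, d, cutoff=0):
    """Shift free indices in t by d."""
    tag = t[0]
    if tag == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if tag == 'lam':
        return ('lam', shift(t[1], d, cutoff + 1))
    return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, j, s):
    """Substitute s for variable j in t."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == j else t
    if tag == 'lam':
        return ('lam', subst(t[1], j + 1, shift(s, 1)))
    return ('app', subst(t[1], j, s), subst(t[2], j, s))

def step(t):
    """One normal-order (leftmost-outermost) beta step, or None if t is normal."""
    tag = t[0]
    if tag == 'app':
        f, a = t[1], t[2]
        if f[0] == 'lam':                      # the redex (\ body) a
            return shift(subst(f[1], 0, shift(a, 1)), -1)
        r = step(f)
        if r is not None:
            return ('app', r, a)
        r = step(a)
        return None if r is None else ('app', f, r)
    if tag == 'lam':
        r = step(t[1])
        return None if r is None else ('lam', r)
    return None

def normalize(t, limit=10000):
    """Reduce to beta normal form; gives up after `limit` steps."""
    for _ in range(limit):
        r = step(t)
        if r is None:
            return t
        t = r
    raise RuntimeError('no normal form found within the step limit')

# Church numerals as a test case.
zero = ('lam', ('lam', ('var', 0)))
succ = ('lam', ('lam', ('lam',
       ('app', ('var', 1),
               ('app', ('app', ('var', 2), ('var', 1)), ('var', 0))))))
two = ('lam', ('lam', ('app', ('var', 1), ('app', ('var', 1), ('var', 0)))))
```

Here `normalize(('app', succ, ('app', succ, zero)))` comes out syntactically identical to `two`, even though the unreduced terms look nothing alike, which is exactly the Church-Rosser-based equality check.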
Decidability of equivalence breaks somewhere between the simply typed lambda calculus and System F. Without recursive types you are strongly normalizing and thus "trivially" decidable. However, just adding recursive types does not break decidability (e.g. see http://www.di.unito.it/~felice/pdf/ictcs.pdf). Similarly, adding some higher-order functions or parametric polymorphism does not necessarily break decidability either (e.g. see Hindley-Milner). In my (admittedly limited) experience, when making a type system stronger, it is usually some strange, sometimes subtle interaction of these "type system features" that breaks decidability.
So the first problem raised in the OP is probably tractable for many quite expressive type systems, even including recursive types.
Though I fully agree with you that the second problem is how undecidability proofs usually start, so I'm more skeptical about that one.
I'm not sure if this is exactly the same but it reminds me a lot of recursive types and checking if two such recursive types are equal (see https://en.wikipedia.org/wiki/Recursive_data_type#Equirecursive_types). I looked into that a few years ago and it seems to be decidable with a relatively easy algorithm: http://lucacardelli.name/Papers/SRT.pdf (the paper is a bit longer but it also shows algorithms for subtyping)
To map this onto your expression problem maybe one can just take expression symbols as "type terminals" and use the same algorithm.
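The algorithm for equirecursive type equality can be sketched in a few lines, in the spirit of Amadio & Cardelli: two names denote the same regular infinite tree iff no finite unfolding distinguishes them, and the coinductive trick is to treat a pair as equal once it is already under examination. The encoding here (a name maps to a head constructor plus child names) is made up for illustration:

```python
def equal(a, b, defs, assumed=frozenset()):
    """Equality of recursive types given as name -> (constructor, child names)."""
    if (a, b) in assumed:
        return True                      # coinductive hypothesis
    head_a, kids_a = defs[a]
    head_b, kids_b = defs[b]
    if head_a != head_b or len(kids_a) != len(kids_b):
        return False
    assumed = assumed | {(a, b)}
    return all(equal(x, y, defs, assumed) for x, y in zip(kids_a, kids_b))

# X = int * X   versus   Y = int * Z, Z = int * Y: the same infinite tree.
defs = {
    'int': ('int', []),
    'X': ('pair', ['int', 'X']),
    'Y': ('pair', ['int', 'Z']),
    'Z': ('pair', ['int', 'Y']),
}
```

Memoizing the pairs already decided (rather than passing a fresh `assumed` down each path) makes this polynomial; the version above favors brevity.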
I think this is related to the word problem for the rewriting system defined by your programming language. When I first read this question I was thinking "Something to do with Church-Rosser?" - but you can follow the links to see for yourself if that literature is what you're after.
I'd call it an instance of https://en.wikipedia.org/wiki/Equivalence_problem - although unusually, your language class only admits one word per language, and admits infinite words.
I'm not convinced f(n) := f(n) should be considered inequivalent to f(n) := f(n+1) - neither one terminates.
I agree that these look tractable.
Given a program O for the first problem, a sufficient condition for M would be M(x) = O(M, x). This can be implemented as M(x) = O(M'(M'),x), where M'(M'',x) = O(M''(M''),x).
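The self-application trick can be sketched with ordinary higher-order functions standing in for program encodings (in the real construction M' would receive and manipulate source code, not a closure). The toy O below is just a countdown, not an equivalence decider - it only demonstrates that M ends up with a working handle on itself:

```python
def make_M(O):
    # M'(M'', x) = O(M''(M''), x); then M = M'(M'), so M(x) = O(M, x).
    def M_prime(M_pp):
        return lambda x: O(M_pp(M_pp), x)
    return M_prime(M_prime)

# Hypothetical stand-in for O: it calls its first argument (which will be
# M itself) recursively. A real O would decide expression-equivalence with M.
O = lambda self_prog, x: True if x == 0 else self_prog(x - 1)

M = make_M(O)
```

Calling `M(5)` recurses through `M` five times via the first argument of `O`, confirming the fixed point behaves as intended.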
25 comments
Comments sorted by top scores.
Isn't separability of arbitrary Turing machines equivalent to the Halting problem and therefore undecidable?
Yes, but that's for a functional notion of equivalence - i.e. it's about whether the two TMs have the same input-output behavior. The notion of equivalence I'm looking at is not just about same input-output, but also structurally-similar computations. Intuitively, I'm asking whether they're computing the same function in the same way.
(In fact, circumventing the undecidability issue is a key part of why I'm formulating the problem like this in the first place. So you're definitely asking the right question here.)
Not sure I understand the question. Consider these two programs:

f(n) := f(n)

f(n) := f(n+1)
Which expression trees do they correspond to? Are these trees equivalent?
The first would generate a stick: ((((((...))))))
The second would generate: (((((...) + 1) + 1) + 1) + 1)
These are not equivalent.
Does that make sense?
I don't understand why the second looks like that, can you explain?
Oh, I made a mistake. I guess they would look like ...((((((...))))))... and ...(((((...) + 1) + 1) + 1) + 1)..., respectively. Thanks for the examples, that's helpful - good examples where the fixed point of expansion is infinite "on the outside" as well as "inside".
Was that the confusion? Another possible point of confusion is why the "+ 1"s are in the expression tree; the answer is that addition is usually an atomic operator of a language. It's not defined in terms of other things; we can't/don't beta-reduce it. If it were defined in terms of other things, I'd expand it, and then the expression tree would look more complicated.
Then isn't it possible to also have infinite expansions "in the middle", not only "inside" and "outside"? Something like this:
f(n) := f(g(n))
g(n) := g(n+1)
Maybe there's even some way to have infinite towers of infinite expansions. I'm having trouble wrapping my head around this.
Yup, that's right.
I tentatively think it's ok to just ignore cases with "outside" infinities. Examples like f(n) = f(n+1) should be easy to detect, and presumably it would never show up in a program which halts. I think programs which halt would only have "inside" infinities (although some non-halting programs would also have inside infinities), and programs with non-inside infinities should be detectable - i.e. recursive definitions of a function shouldn't have the function itself as the outermost operation.
Still not sure - I could easily be missing something crucial - but the whole problem feels circumventable. Intuitively, Turing completeness only requires infinity in one timelike direction; inside infinities should suffice, so syntactic restrictions should be able to eliminate the other infinities.
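The "function itself as the outermost operation" restriction is easy to check syntactically: map each function to the function (if any) at the root of its body, and reject if that graph has a cycle. A sketch, reusing a made-up AST encoding where a body is a tuple whose head is either an operator or `'call'`:

```python
def head_calls(defs):
    """Map each function to the function (if any) at the root of its body."""
    g = {}
    for name, (_params, body) in defs.items():
        if body[0] == 'call':
            g[name] = body[1]
    return g

def has_outside_infinity(defs):
    """True if some chain of outermost calls loops, e.g. f(n) := f(n+1)."""
    g = head_calls(defs)
    for start in g:
        seen, cur = set(), start
        while cur in g:
            if cur in seen:
                return True
            seen.add(cur)
            cur = g[cur]
    return False
```

This also catches mutual cases like f(n) := g(n), g(n) := f(n+1), where no single definition is head-recursive but the chain of outermost calls still loops.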
Ok, if we disallow cycles of outermost function calls, then it seems the trees are indeed infinite only in one direction. Here's a half-baked idea then: 1) interpret every path from node to root as a finite word 2) interpret the tree as a grammar for recognizing these words 3) figure out if equivalence of two such grammars is decidable. For example, if each tree corresponds to a regular grammar, then you're in luck because equivalence of regular grammars is decidable. Does that make sense?
Yeah, that makes sense. And off the top of my head, it seems like they would indeed be regular grammars - each node in the tree would be a state in the finite state machine, and then copies of the tree would produce loops in the state transition graph. Symbols on the edges would be the argument names (or indices) for the inputs to atomic operations. Still a few i's to dot and t's to cross, but I think it works.
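If the trees really are regular, the final step is standard: two finite automata can be compared by breadth-first search over reachable state pairs (Hopcroft and Karp's idea, minus their union-find speedup). A sketch, with states labeled by the atomic operator at each node and transitions labeled by argument position - the encoding is hypothetical:

```python
from collections import deque

def dfa_equivalent(start1, start2, delta1, delta2, out1, out2):
    """Decide equivalence of two complete DFAs over the same alphabet.
    delta: state -> {symbol: state}; out: state -> output label.
    Any reachable pair with mismatched labels is a distinguishing path."""
    queue = deque([(start1, start2)])
    seen = set()
    while queue:
        p, q = queue.popleft()
        if (p, q) in seen:
            continue
        seen.add((p, q))
        if out1[p] != out2[q] or set(delta1[p]) != set(delta2[q]):
            return False
        for sym, p2 in delta1[p].items():
            queue.append((p2, delta2[q][sym]))
    return True
```

Because only pairs of states are explored, this runs in time proportional to the product of the two machines' sizes, regardless of how deep the distinguishing paths are.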
Elegant, too. Nice solution!
I'm actually not sure it's a regular grammar. Consider this program:
f(n) := n+f(n-1)
Which gives the tree
n+(n-1)+((n-1)-1)+...
The path from any 1 to the root contains a bunch of minuses, then at least as many pluses. That's not regular.
So it's probably some other kind of grammar, and I don't know if it has decidable equivalence.
Just to make sure I understand, the first few expansions of the second one are:
 f(n)
 f(n+1)
 f((n+1) + 1)
 f(((n+1) + 1) + 1)
 f((((n+1) + 1) + 1) + 1)
Is that right? If so, wouldn't the infinite expansion look like f((((...) + 1) + 1) + 1) instead of what you wrote?
Yes, that's correct. I'd view "f((((...) + 1) + 1) + 1)" as an equivalent way of writing it as a string (along with the definition of f as f(n) = f(n + 1)). "...(((((...) + 1) + 1) + 1) + 1)..." just emphasizes that the expression tree does not have a root  it goes to infinity in both directions. By contrast, the expression tree for f(n) = f(n) + 1 does have a root; it would expand to (((((...) + 1) + 1) + 1) + 1).
Does that make sense?
Can you elaborate on what you mean by "expand"? Are you thinking of something analogous to betareduction in the lambda calculus?
Yes, exactly. Anywhere the name of a function appears, replace it with the expression defining the function. (Also, I'm ignoring higher-order functions, function pointers, and the like; presumably the problem is undecidable in languages with those kinds of features, since it's basically just beta-equivalence of lambda terms. But we don't need those features to get a Turing-complete language.)
I'm ignoring higher-order functions, function pointers, and the like;
Ok, I'm still confused.
Does
0
count as a expansion of:
f()
where
f() := (0 == 0) ? 0 : 1
?
No. To clarify, we're not reducing any of the atomic operators of the language - e.g. we wouldn't replace (0 == 0) ? 0 : 1 with 0. As written, that's not a beta-reduction. If the ternary operator were defined as a function within the language itself, then we could beta-reduce it, but that wouldn't give us "0" - it would give us some larger expression, containing "0 == 0", "0", and "1".
Actually, thinking about it, here's something which I think is equivalent to what I mean by "expand", within the context of lambda calculus: beta-reduce, but never drop any parens. So e.g. 2 and (2) and ((2)) would not be equivalent. Whenever we beta-reduce, we put parens around any term which gets substituted in.
Intuitively, we're talking about a notion of equivalence between programs which cares about how the computation is performed, not just the outputs.
It seems easier to understand if it is expanded from where it terminates:
//Expanded once
function factorial(n)
{
  if (n == 0)
    {return 1}
  else if (n == 1)
    {return 1}
  else
    {return n*(n-1)*factorial(n-2)}
}
Cleaned up your code formatting for you.
How did you add the code tag to the html?
Type ``` on an empty line, then press enter.
Write an efficient algorithm to decide expression-equivalence of two given programs.
If the expansion is literally infinite, that isn't going to happen. Although I notice that you have written the expansion as a finite string that indicates infinity with "...".
The expansion is infinite, but it's a repeating pattern, so we can use a finite representation (namely, the program itself). We don't have to write the whole thing out in order to compare.
An analogy: we can represent infinite repeating strings by just writing a finite string, and then assuming it repeats. The analogous problem is then: decide whether two such strings represent the same infinite string. For instance, "abab" and "ababab" would represent the same infinite repeating string: "abababababab...".
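For that string analogy there is a classic constant-size test: two finite words repeat to the same infinite string exactly when they commute under concatenation, i.e. when both are powers of a common primitive word (a consequence of the Fine and Wilf / Lyndon-Schutzenberger theorems):

```python
def same_repetition(s, t):
    # s and t repeat to the same infinite string iff s + t == t + s,
    # i.e. iff they are both powers of a common word.
    return s + t == t + s
```

So "abab" and "ababab" compare equal ("abababababab..." both ways), while "ab" and "ba" do not.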