Friday, October 29, 2010

That curious termination argument

Quick, does the SML function norm terminate on all tree-valued inputs?
  datatype tree 
= Nil
| Node of tree * int * tree

fun norm Nil = Nil
| norm (Node (Nil, x, t)) =
Node (Nil, x, norm t)
| norm (Node (Node (t1, x, t2), y, t3)) =
norm (Node (t1, x, (Node (t2, y, t3))))
And if so, why?

This came from a post on the Theoretical Computer Science StackExchange asking about the theorem that, if two trees agree on their in-order traversals, then they are equivalent up to rotations. The "simple and intuitive" result mentioned in that post is a constructive proof which relies on the same termination ordering as the norm function. And it's also not terribly difficult to see why the function should terminate - it's a lexicographic termination argument: either the tree loses a node (the second case) or the tree keeps the same number of nodes and the depth of the left-most leaf decreases (the third case).

However, this sort of problem comes up not-terribly-infrequently in settings (like Agda) where termination must be established by structural induction, and the preceding argument is not structurally inductive on trees. Whenever I encounter this it always drives me crazy, so this is something of a note to myself.


The solution that works great for this sort of function in a dependently typed language is to define a judgment over terms that captures precisely the termination ordering that we will want to consider:
  data TM {A : Set} : Tree A -> Set where
Done : TM Nil
Recurse : {x : A}{t : Tree A}
-> TM t
-> TM (Node Nil x t)
Rotate : {x y : A}{t1 t2 t3 : Tree A}
-> TM (Node t1 x (Node t2 y t3))
-> TM (Node (Node t1 x t2) y t3)

Then (and this is really the tricky part) we have to write the proof that for every tree t there is a derivaiton of TM t. The key, and the reason that I always have to bend my brain whenever I encounter a termination argument like this one, is the append helper function.
  metric : {A : Set} (t : Tree A) -> TM t
metric Nil = Done
metric (Node t1 x t2) = helper t1 (metric t2)
append : {A : Set}{t1 t2 : Tree A} -> TM t1 -> TM t2 -> TM (t1 ++> t2)
append Done t2 = t2
append (Recurse t1) t2 = Recurse (append t1 t2)
append (Rotate t1) t2 = Rotate (append t1 t2)

helper : {A : Set} {x : A}{t2 : Tree A}
(t1 : Tree A)
-> TM t2
-> TM (Node t1 x t2)
helper Nil tt = Recurse tt
helper (Node t1 x t2) tt = Rotate (helper t1 (append (helper t2 Done) tt))

Now it's trivial to write a version of the norm function that Agda will treat as terminating, because I just pass in an extra proof of TM t, and the proof proceeds by trivial structural induction on that proof.
  norm : {A : Set} -> Tree A -> Tree A
norm t = helper t (metric t)
helper : {A : Set} -> (t : Tree A) -> TM t -> Tree A
helper Nil Done = Nil
helper (Node Nil x t) (Recurse tm) =
Node Nil x (helper t tm)
helper (Node (Node t1 x t2) y t3) (Rotate tm) =
helper (Node t1 x (Node t2 y t3)) tm

The Agda code for the above is here. Are there other ways of expressing this termination argument in Agda that make as much or more sense? One approach I fiddled with was presenting a tree indexed by (1) the total number of nodes in it and (2) the depth of the left-most leaf:
  data ITree (A : Set) : NatT → NatT → Set where
Nil : ITree A Z Z
Node : ∀{total1 total2 left1 left2}
-> ITree A left1 total1
-> A
-> ITree A left2 total2
-> ITree A (1 +n left1) (1 +n (total1 +n total2))

However, due to the complexity of the dependent equality reasoning, I couldn't get Agda to believe the intuitive termination argument I presented at the beginning.

Trees have normal forms under rotation

Once the above argument works, it's not difficult to prove the theorem mentioned on TCS StackExchange; here's the Agda proof.

[Update Nov 15, 2010] Over at reddit, Conor McBride says "When someone asks "how do I show this non-structural function terminates?", I always wonder what structurally recursive function I'd write instead." and then proceeds to answer that question by giving the appropriate structurally inductive functions grind and rot. Nice! His proof also introduces me to a new Agda built-in, rewrite, whose existence I was previously unaware of. Oh Agda release announcements, when will I learn to read you?


  1. BTW, what you're describing---defining a predicate that gives an inductive definition of the domain/recursive calls of the function, and separately proving that everything satisfies that predicate---is often called the "Bove-and-Capretta" method

  2. Thanks - is the right citation for that "Modelling general recursion in type theory," MSCS 2005?

  3. Simple General Recursion in Type Theory (Bove, 2000, Nordic Journal of Computing) is the earliest cite I can find