Refinement Types via SMT and Predicate Abstraction

Previously we saw how Refinement Reflection can be used to write Haskell functions that prove theorems about other Haskell functions. Today, we will see how Refinement Reflection works on recursive data types. As an example, we will prove that lists are monoids (under nil and append).

Let's see how to express the monoid laws as liquid types, and then prove the laws by writing plain Haskell functions that are checked by LiquidHaskell.
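Concretely, the laws in question are left identity, right identity, and associativity of append. Here is a minimal sketch that states the laws as executable checks on sample values; in the post itself the laws become refinement types and the proofs are plain Haskell terms, and `appendL` is an illustrative name, not the post's definition:

```haskell
-- A sketch of the list monoid laws as executable checks.
appendL :: [a] -> [a] -> [a]
appendL []     ys = ys
appendL (x:xs) ys = x : appendL xs ys

leftId, rightId :: [Int] -> Bool
leftId  xs = appendL [] xs == xs          -- nil is a left identity
rightId xs = appendL xs [] == xs          -- nil is a right identity

assoc :: [Int] -> [Int] -> [Int] -> Bool
assoc xs ys zs =
  appendL (appendL xs ys) zs == appendL xs (appendL ys zs)

main :: IO ()
main = print (leftId [1,2], rightId [1,2], assoc [1] [2,3] [4])
```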

We’ve taught LiquidHaskell a new trick that we call “Refinement Reflection”, which lets us turn Haskell into a theorem prover capable of proving arbitrary properties of code. The key idea is to reflect the code of a function into its output type, which lets us then reason about the function at the (refinement) type level. Let's see how to use refinement types to express a theorem, for example that fibonacci is a monotonically increasing function, and then write plain Haskell code that reifies a paper-and-pencil-style proof of that theorem, one that can be machine-checked by LiquidHaskell.
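To make the theorem concrete: it says that `n <= m` implies `fib n <= fib m`. Here is the function, with a runtime spot-check standing in for the machine-checked proof term that the post constructs (the names here are illustrative):

```haskell
-- The textbook fibonacci whose monotonicity we want to prove.
fib :: Int -> Int
fib n | n <= 1    = n
      | otherwise = fib (n - 1) + fib (n - 2)

-- spot-check fib i <= fib (i + 1) on a small prefix of the naturals
monotoneUpTo :: Int -> Bool
monotoneUpTo k = and [ fib i <= fib (i + 1) | i <- [0 .. k] ]

main :: IO ()
main = print (fib 10, monotoneUpTo 15)   -- (55,True)
```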

I have been preparing an undergraduate course on Compilers in which we build a compiler that crunches an ML-like language to X86 assembly. One of my favorite steps in the compilation is the conversion to A-Normal Form (ANF) where, informally speaking, each call or primitive operation’s arguments are immediate values, i.e. constants or variable lookups whose values can be loaded with a single machine instruction. For example, the expression

    ((2 + 3) * (12 - 4)) * (7 + 8)

has the A-Normal Form:

    let anf0 = 2 + 3
        anf1 = 12 - 4
        anf2 = anf0 * anf1
        anf3 = 7 + 8
    in
        anf2 * anf3

The usual presentation of ANF conversion is very elegant but relies upon a clever use of continuations. Let's look at another, perhaps simpler, approach, where we can use refinements to light the way.
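As a taste of the continuation-free idea, we can thread a counter for fresh names and return, for each expression, a list of bindings together with an immediate result. All names and types below are illustrative, not the compiler's actual code, and unlike the hand-written ANF above this sketch also binds the final operation; either convention works:

```haskell
-- A continuation-free sketch of ANF conversion.
data Op   = Add | Sub | Mul deriving Show
data Expr = Num Int | Var String | Bin Op Expr Expr deriving Show

type Bind = (String, Expr)             -- binder and its right-hand side

-- Return the bindings generated for an expression together with an
-- *immediate* result (a constant or a variable).
imm :: Int -> Expr -> (Int, [Bind], Expr)
imm k e@(Num _)     = (k, [], e)       -- already immediate
imm k e@(Var _)     = (k, [], e)
imm k (Bin o e1 e2) =
  let (k1, bs1, v1) = imm k  e1
      (k2, bs2, v2) = imm k1 e2
      x             = "anf" ++ show k2 -- fresh name for this operation
  in  (k2 + 1, bs1 ++ bs2 ++ [(x, Bin o v1 v2)], Var x)

anf :: Expr -> ([Bind], Expr)
anf e = let (_, bs, v) = imm 0 e in (bs, v)

-- An evaluator, to check that conversion preserves meaning.
evalOp :: Op -> Int -> Int -> Int
evalOp Add = (+); evalOp Sub = (-); evalOp Mul = (*)

eval :: [(String, Int)] -> Expr -> Int
eval _   (Num n)     = n
eval env (Var x)     = maybe (error "unbound") id (lookup x env)
eval env (Bin o a b) = evalOp o (eval env a) (eval env b)

-- Evaluate via the ANF'd program: bind each name in order, then
-- evaluate the final immediate expression.
runANF :: Expr -> Int
runANF e = let (bs, v) = anf e
               env     = foldl (\g (x, rhs) -> (x, eval g rhs) : g) [] bs
           in  eval env v

main :: IO ()
main = do
  let ex = Bin Mul (Bin Mul (Bin Add (Num 2) (Num 3))
                            (Bin Sub (Num 12) (Num 4)))
                   (Bin Add (Num 7) (Num 8))
  print (eval [] ex, runANF ex)   -- both evaluate to 600
```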

The “Hello World!” example for fancy type systems is probably the sized vector or list append function (“The output has size equal to the sum of the inputs!”). On the one hand, it is perfect: simple enough to explain without pages of code, yet complex enough to show off what's cool about dependency. On the other hand, like the sweater I’m sporting right now, it’s a bit well-worn and, worse, was never wholly convincing (“Why do I care what the size of the output list is anyway?”)

Recently, I came across a nice example that is almost as simple, but is also well motivated: Okasaki’s beautiful Lazy Amortized Queues. This structure leans heavily on an invariant to provide fast insertion and deletion. Let’s see how to enforce that invariant with LiquidHaskell.
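The invariant in question is that the back list never grows longer than the front. Here is a simplified, strict sketch of such a queue (Okasaki's actual structure rotates lazily, and the names are illustrative); LiquidHaskell can refine the constructor so that unbalanced queues are rejected at compile time:

```haskell
-- Invariant: length back <= length front.
data Queue a = Q [a] [a] deriving Show   -- front, reversed back

-- smart constructor: restore the invariant by reversing the back
-- onto the front whenever the back grows too long
makeq :: [a] -> [a] -> Queue a
makeq f b
  | length b <= length f = Q f b
  | otherwise            = Q (f ++ reverse b) []

emptyQ :: Queue a
emptyQ = Q [] []

insert :: a -> Queue a -> Queue a
insert x (Q f b) = makeq f (x : b)

remove :: Queue a -> Maybe (a, Queue a)
remove (Q []      _) = Nothing
remove (Q (x : f) b) = Just (x, makeq f b)

-- pop everything off, to observe FIFO order
drain :: Queue a -> [a]
drain q = case remove q of
            Nothing      -> []
            Just (x, q') -> x : drain q'

main :: IO ()
main = print (drain (foldl (flip insert) emptyQ [1, 2, 3 :: Int]))
```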

A large part of the allure of Haskell is its elegant, high-level ADTs that ensure that programs won’t be plagued by problems like the infamous SSL heartbleed bug.

However, another part of Haskell’s charm is that when you really really need to, you can drop down to low-level pointer twiddling to squeeze the most performance out of your machine. But of course, that opens the door to the #heartbleeds.

Can we have our cake and eat it too?

Can we twiddle pointers and still get the nice safety assurances of high-level types?

Yesterday someone asked on Reddit how one might define GHC’s OrdList in a way that statically enforces its three key invariants. The accepted solution required rewriting OrdList as a GADT indexed by a proof of emptiness (which is essentially created by a run-time check), and used the new Closed Type Families extension in GHC 7.8 to define a type-level join of the Emptiness index.

Today, let’s see a somewhat more direct way of tackling this problem in LiquidHaskell, in which we need not change a single line of code (well.. maybe one), and need not perform any dynamic checks.

We’ve seen how, in the presence of lazy evaluation, refinements require termination. Next, we saw how LiquidHaskell can be used to prove termination.

Today, let's see how termination requires refinements.

That is, a crucial feature of LiquidHaskell’s termination prover is that it is not syntactically driven, i.e., it is not limited to, say, structural recursion. Instead, it uses the wealth of information captured by the refinements at our disposal in order to prove termination.

This turns out to be crucial in practice. As a quick toy example – motivated by a question from Elias – let's see how, unlike purely syntax-directed (structural) approaches, LiquidHaskell proves that recursive functions such as Euclid’s GCD algorithm terminate.
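The function itself is tiny; what matters is that the recursive call's second argument, `a mod b`, is not a structural sub-term of `b`, so a purely syntactic checker gets stuck. LiquidHaskell can instead use the refinement-level fact that `0 <= a mod b < b` when `b > 0` as a decreasing metric (the phrasing of the metric here is a sketch, not the post's exact annotation):

```haskell
-- Euclid's GCD: terminates because the second argument strictly
-- decreases, a fact visible only at the refinement level.
gcdE :: Int -> Int -> Int
gcdE a 0 = a
gcdE a b = gcdE b (a `mod` b)

main :: IO ()
main = print (gcdE 12 8, gcdE 7 3)   -- (4,1)
```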

As explained in the last two posts, we need a termination checker to ensure that LiquidHaskell is not tricked by divergent, lazy computations into telling lies. Happily, it turns out that with very little retrofitting, and a bit of jiu jitsu, we can use refinements themselves to prove termination!

Previously, we caught LiquidHaskell telling a lie. Today, let's try to get to the bottom of this mendacity, in order to understand how we can ensure that it always tells the truth.