LiquidHaskell

Refinement Types via SMT and Predicate Abstraction

We’ve seen all sorts of interesting invariants that can be expressed with refinement predicates. For example, whether a divisor is non-zero, the dimension of lists, ensuring the safety of vector indices and reasoning about the set of values in containers and verifying their uniqueness. In each of these cases, we were working with specific refinement predicates that described whatever property was of interest.

Today, (drumroll please), I want to unveil a brand new feature of LiquidHaskell, which allows us to abstract over specific properties or invariants, which significantly increases the expressiveness of the system, whilst still allowing our friend the SMT solver to carry out verification and inference automatically.

30: 
31: module MickeyMouse where
32: 
33: import Language.Haskell.Liquid.Prelude (isEven)

Pin The Specification On the Function

Lets look at some tiny mickey-mouse examples to see why we may want to abstract over refinements in the first place.

Consider the following monomorphic max function on Int values:

45: maxInt     :: Int -> Int -> Int 
46: forall <p :: (GHC.Types.Int)-> Bool>.
{VV : (GHC.Types.Int)<p> | true}
-> {VV : (GHC.Types.Int)<p> | true}
-> {VV : (GHC.Types.Int)<p> | true}maxInt {VV : (GHC.Types.Int) | ((papp1 p VV))}x {VV : (GHC.Types.Int) | ((papp1 p VV))}y = if {VV : (GHC.Types.Int) | ((papp1 p VV)) && (VV == x)}x x:{VV : (GHC.Types.Int) | ((papp1 p VV))}
-> y:{VV : (GHC.Types.Int) | ((papp1 p VV))}
-> {VV : (GHC.Types.Bool) | (((Prop VV)) <=> (x <= y))}<= {VV : (GHC.Types.Int) | ((papp1 p VV)) && (VV == y)}y then {VV : (GHC.Types.Int) | ((papp1 p VV)) && (VV == y)}y else {VV : (GHC.Types.Int) | ((papp1 p VV)) && (VV == x)}x 

We could give maxInt many, quite different and incomparable refinement types like

50: maxInt :: {v:Int | v >= 0} -> {v:Int | v >= 0} -> {v:Int | v >= 0}

or

54: maxInt :: {v:Int | v < 10} -> {v:Int | v < 10} -> {v:Int | v < 10}

or even

58: maxInt :: {v:Int | (Even v)} -> {v:Int | (Even v)} -> {v:Int | (Even v)}

All of the above are valid.

But which one is the right type?

At this point, you might be exasperated for one of two reasons.

First, the type enthusiasts among you may cry out – “What? Does this funny refinement type system not have principal types?”

No. Or, to be precise, of course not!

Principal typing is a lovely feature that is one of the many reasons why Hindley-Milner is such a delightful sweet spot. Unfortunately, the moment one wants fancier specifications one must tearfully kiss principal typing good bye.

Oh well.

Second, you may very well say, “Yes yes, does it even matter? Just pick one and get on with it already!”

Unfortunately, it matters quite a bit.

Suppose we had a refined type describing valid RGB values:

87: {-@ type RGB = {v: Int | ((0 <= v) && (v < 256)) } @-}

Now, if I wrote a function that selected the larger, that is to say, the more intense, of two RGB values, I would certainly like to check that it produced an RGB value!

95: {-@ intenser   :: RGB -> RGB -> RGB @-}
96: {VV : (GHC.Types.Int) | (VV < 256) && (0 <= VV)}
-> {VV : (GHC.Types.Int) | (VV < 256) && (0 <= VV)}
-> {VV : (GHC.Types.Int) | (VV < 256) && (0 <= VV)}intenser {VV : (GHC.Types.Int) | (VV < 256) && (0 <= VV)}c1 {VV : (GHC.Types.Int) | (VV < 256) && (0 <= VV)}c2 = forall <p :: (GHC.Types.Int)-> Bool>.
{VV : (GHC.Types.Int)<p> | true}
-> {VV : (GHC.Types.Int)<p> | true}
-> {VV : (GHC.Types.Int)<p> | true}maxInt {VV : (GHC.Types.Int) | (VV == c1) && (VV < 256) && (0 <= VV)}c1 {VV : (GHC.Types.Int) | (VV == c2) && (VV < 256) && (0 <= VV)}c2

Well, guess what. The first type (with v >= 0) one would tell us that the output was non-negative, losing the upper bound. The second type (with v < 10) would cause LiquidHaskell to bellyache about maxInt being called with improper arguments – muttering darkly that an RGB value is not necesarily less than 10. As for the third type … well, you get the idea.

So alas, the choice of type does matter.

If we were clairvoyant, we would give maxInt a type like

108: maxInt :: RGB -> RGB -> RGB 

but of course, that has its own issues. (“What? I have to write a separate function for picking the larger of two 4 digit numbers?!”)

Defining Parametric Invariants

Lets take a step back from the types, and turn to a spot of handwaving.

What’s really going on with maxInt?

Well, the function returns one of its two arguments x and y.

This means that if both arguments satisfy some property then the output must satisfy that property, regardless of what that property was!

To teach LiquidHaskell to understand this notion of “regardless of property” we introduce the idea of abstracting over refinements or, if you prefer, parameterizing a type over its refinements.

In particular, we type maxInt as

133: {-@ maxInt :: forall <p :: Int -> Prop>. Int<p> -> Int<p> -> Int<p>@-}

Here, the definition says explicitly: for any property p that is a property of Int, the function takes two inputs each of which satisfy p and returns an output that satisfies p. That is to say, Int<p> is just an abbreviation for {v:Int | (p v)}

Digression: Whither Decidability? At first glance, it may appear that these abstract p have taken us into the realm of higher-order logics, where we must leave decidable checking and our faithful SMT companion at that door, and instead roll up our sleeves for interactive proofs (not that there’s anything wrong with that!) Fortunately, that’s not the case. We simply encode abstract refinements p as uninterpreted function symbols in the refinement logic.

Uninterpreted functions are special symbols p which satisfy only the congruence axiom.

150: forall X, Y. if (X = Y) then  p(X) = p(Y)

Happily, reasoning with such uninterpreted functions is quite decidable (thanks to Ackermann, yes, that Ackermann) and actually rather efficient. Thus, via SMT, LiquidHaskell happily verifies that maxInt indeed behaves as advertised: the input types ensure that both (p x) and (p y) hold and hence that the returned value in either branch of maxInt satisfies the refinement {v:Int | p(v)}, thereby ensuring the output type.

By the same reasoning, we can define the maximumInt operator on lists:

163: {-@ maximumInt :: forall <p :: Int -> Prop>. x:[Int <p>] -> Int <p>@-}
164: forall <p :: (GHC.Types.Int)-> Bool>.
[{VV : (GHC.Types.Int)<p> | true}]
-> {VV : (GHC.Types.Int)<p> | true}maximumInt (x:xs) = ({VV : (GHC.Types.Int) | ((papp1 p VV))}
 -> {VV : (GHC.Types.Int) | ((papp1 p VV))}
 -> {VV : (GHC.Types.Int) | ((papp1 p VV))})
-> {VV : (GHC.Types.Int) | ((papp1 p VV))}
-> [{VV : (GHC.Types.Int) | ((papp1 p VV))}]
-> {VV : (GHC.Types.Int) | ((papp1 p VV))}foldr forall <p :: (GHC.Types.Int)-> Bool>.
{VV : (GHC.Types.Int)<p> | true}
-> {VV : (GHC.Types.Int)<p> | true}
-> {VV : (GHC.Types.Int)<p> | true}maxInt {VV : (GHC.Types.Int) | ((papp1 p VV)) && (VV == x)}x {VV : [{VV : (GHC.Types.Int) | ((papp1 p VV))}] | (VV == xs) && ((len VV) >= 0)}xs

Using Parametric Invariants

Its only useful to parametrize over invariants if there is some easy way to instantiate the parameters.

Concretely, consider the function:

176: {-@ maxEvens1 :: xs:[Int] -> {v:Int | (Even v)} @-}
177: [(GHC.Types.Int)] -> {VV : (GHC.Types.Int) | ((VV mod 2) == 0)}maxEvens1 [(GHC.Types.Int)]xs = forall <p :: (GHC.Types.Int)-> Bool>.
[{VV : (GHC.Types.Int)<p> | true}]
-> {VV : (GHC.Types.Int)<p> | true}maximumInt {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | (VV == xs'') && ((len VV) == (1 + (len xs'))) && ((len VV) >= 0)}xs''
178:   where 
179:     {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | ((len VV) >= 0)}xs'      = [ (GHC.Types.Int)x | x <- {VV : [(GHC.Types.Int)] | (VV == xs) && ((len VV) >= 0)}xs, x:(GHC.Types.Int)
-> {VV : (GHC.Types.Bool) | (((Prop VV)) <=> ((x mod 2) == 0))}isEven (GHC.Types.Int)x]
180:     {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | ((len VV) == (1 + (len xs')))}xs''     = {VV : (GHC.Types.Int) | (VV == (0  :  int))}0 forall <p :: (GHC.Types.Int)-> (GHC.Types.Int)-> Bool>.
y:{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}
-> ys:[{VV : (GHC.Types.Int)<p y> | ((VV mod 2) == 0)}]<p>
-> {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<p> | ((len VV) == (1 + (len ys)))}: {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | (VV == xs') && ((len VV) >= 0)}xs'

where the function isEven is from the Language.Haskell.Liquid.Prelude library:

184: {- isEven :: x:Int -> {v:Bool | (Prop(v) <=> (Even x))} -}
185: isEven   :: Int -> Bool
186: isEven x = x `mod` 2 == 0

where the predicate Even is defined as

192: {-@ predicate Even X = ((X mod 2) = 0) @-}

To verify that maxEvens1 returns an even number, LiquidHaskell

  1. infers that the list (0:xs') has type [{v:Int | (Even v)}], that is, is a list of even numbers.

  2. automatically instantiates the refinement parameter of maximumInt with the concrete refinement {\v -> (Even v)} and so

  3. concludes that the value returned by maxEvens1 is indeed Even.

Parametric Invariants and Type Classes

Ok, lets be honest, the above is clearly quite contrived. After all, wouldn’t you write a polymorphic max function? And having done so, we’d just get all the above goodness from old fashioned parametricity.

That is to say, if we just wrote:

213: max     :: forall a. a -> a -> a 
214: max x y = if x > y then x else y
215: 
216: maximum :: forall a. [a] -> a
217: maximum (x:xs) = foldr max x xs

then we could happily instantiate the a with {v:Int | v > 0} or {v:Int | (Even v)} or whatever was needed at the call-site of max. Sigh. Perhaps we are still pining for Hindley-Milner.

Well, if this was an ML perhaps we could but in Haskell, the types would be

225: (>)     :: (Ord a) => a -> a -> Bool
226: max     :: (Ord a) => a -> a -> a
227: maximum :: (Ord a) => [a] -> a

Our first temptation may be to furtively look over our shoulders, and convinced no one was watching, just pretend that funny (Ord a) business was not there, and quietly just treat maximum as [a] -> a and summon parametricity.

That would be most unwise. We may get away with it with the harmless Ord but what of, say, Num.

Clearly a function

238: numCrunch :: (Num a) => [a] -> a

is not going to necessarily return one of its inputs as an output. Thus, it is laughable to believe that numCrunch would, if given a list of of even (or positive, negative, prime, RGB, …) integers, return a even (or positive, negative, prime, RGB, …) integer, since the function might add or subtract or multiply or do other unspeakable things to the numbers in order to produce the output value.

And yet, typeclasses are everywhere.

How could we possibly verify that

253: {-@ maxEvens2 :: xs:[Int] -> {v:Int | (Even v) } @-}
254: [(GHC.Types.Int)] -> {VV : (GHC.Types.Int) | ((VV mod 2) == 0)}maxEvens2 [(GHC.Types.Int)]xs = [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]
-> {VV : (GHC.Types.Int) | ((VV mod 2) == 0)}maximumPoly {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | (VV == xs'') && ((len VV) == (1 + (len xs'))) && ((len VV) >= 0)}xs''
255:   where 
256:      {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | ((len VV) >= 0)}xs'     = [ (GHC.Types.Int)x | x <- {VV : [(GHC.Types.Int)] | (VV == xs) && ((len VV) >= 0)}xs, x:(GHC.Types.Int)
-> {VV : (GHC.Types.Bool) | (((Prop VV)) <=> ((x mod 2) == 0))}isEven (GHC.Types.Int)x]
257:      {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | ((len VV) == (1 + (len xs')))}xs''    = {VV : (GHC.Types.Int) | (VV == (0  :  int))}0 forall <p :: (GHC.Types.Int)-> (GHC.Types.Int)-> Bool>.
y:{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}
-> ys:[{VV : (GHC.Types.Int)<p y> | ((VV mod 2) == 0)}]<p>
-> {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<p> | ((len VV) == (1 + (len ys)))}: {VV : [{VV : (GHC.Types.Int) | ((VV mod 2) == 0)}]<\_ VV -> ((VV mod 2) == 0)> | (VV == xs') && ((len VV) >= 0)}xs'

where the helpers were in the usual Ord style?

263: maximumPoly :: (Ord a) => [a] -> a
264: forall a <p :: a-> Bool>.
(GHC.Classes.Ord a) =>
[{VV : a<p> | true}] -> {VV : a<p> | true}maximumPoly (x:xs) = ({VV : a | ((papp1 p VV))}
 -> {VV : a | ((papp1 p VV))} -> {VV : a | ((papp1 p VV))})
-> {VV : a | ((papp1 p VV))}
-> [{VV : a | ((papp1 p VV))}]
-> {VV : a | ((papp1 p VV))}foldr {VV : a | ((papp1 p VV))}
-> {VV : a | ((papp1 p VV))} -> {VV : a | ((papp1 p VV))}maxPoly {VV : a | ((papp1 p VV)) && (VV == x)}x {VV : [{VV : a | ((papp1 p VV))}] | (VV == xs) && ((len VV) >= 0)}xs
265: 
266: maxPoly     :: (Ord a) => a -> a -> a 
267: forall a <p :: a-> Bool>.
(GHC.Classes.Ord a) =>
{VV : a<p> | true} -> {VV : a<p> | true} -> {VV : a<p> | true}maxPoly {VV : a | ((papp1 p VV))}x {VV : a | ((papp1 p VV))}y = if {VV : a | ((papp1 p VV)) && (VV == x)}x x:{VV : a | ((papp1 p VV))}
-> y:{VV : a | ((papp1 p VV))}
-> {VV : (GHC.Types.Bool) | (((Prop VV)) <=> (x <= y))}<= {VV : a | ((papp1 p VV)) && (VV == y)}y then {VV : a | ((papp1 p VV)) && (VV == y)}y else {VV : a | ((papp1 p VV)) && (VV == x)}x

The answer: abstract refinements.

First, via the same analysis as the monomorphic Int case, LiquidHaskell establishes that

276: {-@ maxPoly :: forall <p :: a -> Prop>. 
277:                  (Ord a) => x:a<p> -> y:a<p> -> a<p> @-}

and hence, that

283: {-@ maximumPoly :: forall <p :: a -> Prop>. 
284:                      (Ord a) => x:[a<p>] -> a<p>     @-}

Second, at the call-site for maximumPoly in maxEvens2 LiquidHaskell instantiates the type variable a with Int, and the abstract refinement parameter p with {\v -> (Even v)} after which, the verification proceeds as described earlier (for the Int case).

And So

If you’ve courageously slogged through to this point then you’ve learnt that

  1. Sometimes, choosing the right type can be quite difficult!

  2. But fortunately, with abstract refinements we needn’t choose, but can write types that are parameterized over the actual concrete invariants or refinements, which

  3. Can be instantiated at the call-sites i.e. users of the functions.

We started with some really frivolous examples, but buckle your seatbelt and hold on tight, because we’re going to see some rather nifty things that this new technique makes possible, including induction, reasoning about memoizing functions, and ordering and sorting data. Stay tuned.