LiquidHaskell

Refinement Types via SMT and Predicate Abstraction

Hopefully, the previous article gave you a basic idea about what refinement types look like. Several folks had interesting questions, that are worth discussing in a separate post, since they throw a lot of light on the strengths (or weaknesses, depending on your point of view!) of LiquidHaskell.

22: module Refinements101Reax where

How to relate outputs and inputs

Recall the function divide

31: {-@ divide :: Int -> {v: Int | v /= 0 } -> Int @-}
32: divide     :: Int -> Int -> Int
33: (GHC.Types.Int)
-> {VV : (GHC.Types.Int) | (VV != 0)} -> (GHC.Types.Int)divide (GHC.Types.Int)n 0 = [(GHC.Types.Char)] -> {VV : (GHC.Types.Int) | false}error {VV : [(GHC.Types.Char)] | (len([VV]) >= 0)}"divide by zero"
34: divide n d = {VV : (GHC.Types.Int) | (VV = n)}n x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x / y))}`div` {VV : (GHC.Types.Int) | (VV != 0)}d

and abz was the absolute value function

40: abz               :: Int -> Int
41: x:(GHC.Types.Int)
-> {VV : (GHC.Types.Int) | ((VV = 0) <=> (x = 0)),(0 <= VV)}abz (GHC.Types.Int)n | {VV : (GHC.Types.Int) | (VV = (0  :  int))}0 x:(GHC.Types.Int)
-> y:(GHC.Types.Int)
-> {VV : (GHC.Types.Bool) | ((? Prop([VV])) <=> (x < y))}< {VV : (GHC.Types.Int) | (VV = n)}n     = {VV : (GHC.Types.Int) | (VV = n)}n
42:       | otherwise = {VV : (GHC.Types.Int) | (VV = (0  :  int))}0 x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}- {VV : (GHC.Types.Int) | (VV = n)}n

nanothief remarked that LiquidHaskell was unable to verify the safety of the following call to divide (i.e. was unable to show that x was non-zero at the callsite).

50: {-@ f :: Int -> Int @-}
51: (GHC.Types.Int) -> (GHC.Types.Int)f (GHC.Types.Int)x | x:(GHC.Types.Int)
-> {VV : (GHC.Types.Int) | ((VV = 0) <=> (x = 0)),(0 <= VV)}abz {VV : (GHC.Types.Int) | (VV = x)}x x:{VV : (GHC.Types.Int) | (VV >= 0),(0 <= VV)}
-> y:{VV : (GHC.Types.Int) | (VV >= 0),(0 <= VV)}
-> {VV : (GHC.Types.Bool) | ((? Prop([VV])) <=> (x = y))}== {VV : (GHC.Types.Int) | (VV = (0  :  int))}0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}3
52:     | otherwise  = {VV : (GHC.Types.Int) | (VV = (3  :  int))}3 (GHC.Types.Int)
-> {VV : (GHC.Types.Int) | (VV != 0)} -> (GHC.Types.Int)`divide` {VV : (GHC.Types.Int) | (VV = x)}x

Nanothief correctly argues that the code is clearly safe as abz x == 0 being false implies x /= 0. Indeed, the code is safe, however, the reason that LiquidHaskell rejected it has nothing to do with its inability to “track the constraints of values based on tests using new values derived from that value” as Nanothief surmised, but instead, because LiquidHaskell supports modular verification where the only thing known about abz at a use site is whatever is specified in its type.

Concretely speaking, the type

64: abz :: Int -> {v: Int | 0 <= v }

is too anemic to verify f above, as it tells us nothing about the relationship between the output and input – looking at it, we have now way of telling that when the output (of abz) is non-zero, the input must also have been non-zero.

Instead, we can write a stronger type which does capture this information, for example

73: abz :: x:Int -> {v:Int | v = (if (x > 0) then x else (0 - x))}

where

77: v = (if p then e1 else e2)

is an abbreviation for the formula

81: (p => v == e1) && ((not p) => v = e2)

With this specification for abz, LiquidHaskell is able to reason that when abz x is non-zero, x is also non-zero. Of course, abz is trivial enough that we can very precisely capture its exact semantics in the refinement type, but thats is rarely the case.

Nevertheless, even here, you could write a somewhat weaker specification, that still had enough juice to allow the verification of the divide call in f. In particular, we might write

94: {-@ abz :: x:Int -> {v:Int | ((0 <= v) && ((v = 0) <=> (x = 0))) } @-}

which states the output is 0 if and only if the input is 0. LiquidHaskell will happily verify that abz implements this specification, and will use it to verify the safety of f above.

(BTW, follow the link above to demo this code yourself.)

How to tell a Fib

Chris Done asked why LiquidHaskell refused to verify the following definition of fib.

110: {-@ fib :: n:Int -> { b:Int | (n >= 0 && b >= n) } @-}
111: fib :: Int -> Int
112: n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib 0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}0
113: fib 1 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
114: fib n = n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (1  :  int))}1) x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x + y))}+ n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (2  :  int))}2)

Indeed, the both the specification and the implementation look pretty legit, so what gives? It turns out that there are two different reasons why.

Reason 1: Assumptions vs. Guarantees

What we really want to say over here is that the input n is non-negative. However, putting the refinement n >= 0 in the output constraint means that it becomes something that LiquidHaskell checks that the function fib guarantees (or ensures). That is, the type states that we can pass fib any value n (including negative values) and yet, fib must return values b such that b >= n and n >= 0.

The latter requirement is a rather tall order when an arbitrary n is passed in as input. fib can make no such guarantees since it was given the value n as a parameter. The only way n could be non-negative was that if the caller had sent in a non-negative value. Thus, we want to put the burden of proof on the right entity here, namely the caller.

To assign the burden of proof appropriately, we place the non-negative refinement on the input type

142: {-@ fib' :: n:{v:Int | v >= 0} -> {b:Int | (n >= 0 && b >= n) } @-}
143: fib' :: Int -> Int
144: n:{VV : (GHC.Types.Int) | (VV >= 0)}
-> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib' 0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}0
145: fib' 1 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
146: fib' n = n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ({VV : (GHC.Types.Int) | (VV >= 0)}nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (1  :  int))}1) x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x + y))}+ n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ({VV : (GHC.Types.Int) | (VV >= 0)}nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (2  :  int))}2)

where now at calls to fib' LiquidHaskell will check that the argument is non-negative, and furthermore, when checking fib' LiquidHaskell will assume that the parameter n is indeed non-negative. So now the constraint n >= 0 on the output is somewhat redundant, and the non-negative n guarantee holds trivially.

Reason 2: The Specification is a Fib

If you run the above in the demo, you will see that LiquidHaskell still doth protest loudly, and frankly, one might start getting a little frustrated at the stubbornness and petulance of the checker.

However, if you stare at the implementation, you will see that it in fact, does not meet the specification, as

162: fib' 2  == fib' 1 + fib' 0
163:         == 0 + 1
164:         == 1

LiquidHaskell is reluctant to prove things that are false. Rather than anthropomorphize frivolously, lets see why it is unhappy.

First, recall the somewhat simplified specification

171: fib' :: n:Int -> { b:Int | (b >= n) } 

As we saw in the discussion about abz, at each recursive callsite the only information LiquidHaskell uses about the returned value, is that described in the output type for that function call.

Thus, LiquidHaskell reasons that the expression:

179: fib' (n-1) + fib' (n-2)

has the type

183: {b: Int | exists b1, b2. b  == b1 + b2 
184:                       && b1 >= n-1 
185:                       && b2 >= n-2     }

where the b1 and b2 denote the values returned by the recursive calls — we get the above by plugging the parameters n-1 and n-2 in for the parameter n in output type for fib'.

The SMT solver simplifies the above to

193: {b: Int | b >= 2n - 3}

Finally, to check the output guarantee is met, LiquidHaskell asks the SMT solver to prove that

197: (b >= 2n - 2)  =>  (b >= n)

The SMT solver will refuse, of course, since the above implication is not valid (e.g. when n is 2) Thus, via SMT, LiquidHaskell proclaims that the function fib' does not implement the advertised type and hence marks it unsafe.

Fixing The Code

How then, do we get Chris’ specification to work out? It seems like it should hold (except for that pesky case where n=2. Indeed, let’s rig the code, so that all the base cases return 1.

213: {-@ fibOK :: n:Int -> {b:Int | ((b >= n) && (b >= 1))} @-}
214: fibOK :: Int -> Int
215: n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= 1),(VV >= n)}fibOK 0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
216: fibOK 1 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
217: fibOK n = n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= 1),(VV >= n)}fibOK ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (1  :  int))}1) x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x + y))}+ n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= 1),(VV >= n)}fibOK ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (2  :  int))}2)

Here’ we specify that not only is the output greater than the input, it is also greater than 1.

Now in the recursive case, LiquidHaskell reasons that the value being output is

224: {b: Int | exists b1, b2. b  == b1 + b2 
225:                       && b1 >= n-1 && b1 >= 1 
226:                       && b2 >= n-2 && b2 >= 1 }

which reduces to

230: {b: Int | b = 2n - 3 && n >= 2 }

which, the SMT solver is happy to verify, is indeed a subtype

234: of (i.e. implies the refinement of) the specified output
235: {b: Int | b >= n && b >= 1 } 

Conclusion

There are several things to take away.

  1. We need to distinguish between assumptions and guarantees when writing specifications for functions.

  2. For modularity, LiquidHaskell, like every type system, uses only the (refinement) type of each function at each use site, and not the function’s body.

  3. Some seemingly intuitive specifications often aren’t; in future work it would be useful to actually generate tests as counterexamples that illustrate when a specification fails.