Refinement Types via SMT and Predicate Abstraction

Hopefully, the previous article gave you a basic idea about what refinement types look like. Several folks had interesting questions, that are worth discussing in a separate post, since they throw a lot of light on the strengths (or weaknesses, depending on your point of view!) of LiquidHaskell.

```22: module Refinements101Reax where
```

## How to relate outputs and inputs

Recall the function `divide`

```31: {-@ divide :: Int -> {v: Int | v /= 0 } -> Int @-}
32: divide     :: Int -> Int -> Int
33: (GHC.Types.Int)
-> {VV : (GHC.Types.Int) | (VV != 0)} -> (GHC.Types.Int)divide (GHC.Types.Int)n 0 = [(GHC.Types.Char)] -> {VV : (GHC.Types.Int) | false}error {VV : [(GHC.Types.Char)] | (len([VV]) >= 0)}"divide by zero"
34: divide n d = {VV : (GHC.Types.Int) | (VV = n)}n x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x / y))}`div` {VV : (GHC.Types.Int) | (VV != 0)}d
```

and `abz` was the absolute value function

```40: abz               :: Int -> Int
41: x:(GHC.Types.Int)
-> {VV : (GHC.Types.Int) | ((VV = 0) <=> (x = 0)),(0 <= VV)}abz (GHC.Types.Int)n | {VV : (GHC.Types.Int) | (VV = (0  :  int))}0 x:(GHC.Types.Int)
-> y:(GHC.Types.Int)
-> {VV : (GHC.Types.Bool) | ((? Prop([VV])) <=> (x < y))}< {VV : (GHC.Types.Int) | (VV = n)}n     = {VV : (GHC.Types.Int) | (VV = n)}n
42:       | otherwise = {VV : (GHC.Types.Int) | (VV = (0  :  int))}0 x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}- {VV : (GHC.Types.Int) | (VV = n)}n
```

nanothief remarked that LiquidHaskell was unable to verify the safety of the following call to `divide` (i.e. was unable to show that `x` was non-zero at the callsite).

```50: {-@ f :: Int -> Int @-}
51: (GHC.Types.Int) -> (GHC.Types.Int)f (GHC.Types.Int)x | x:(GHC.Types.Int)
-> {VV : (GHC.Types.Int) | ((VV = 0) <=> (x = 0)),(0 <= VV)}abz {VV : (GHC.Types.Int) | (VV = x)}x x:{VV : (GHC.Types.Int) | (VV >= 0),(0 <= VV)}
-> y:{VV : (GHC.Types.Int) | (VV >= 0),(0 <= VV)}
-> {VV : (GHC.Types.Bool) | ((? Prop([VV])) <=> (x = y))}== {VV : (GHC.Types.Int) | (VV = (0  :  int))}0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}3
52:     | otherwise  = {VV : (GHC.Types.Int) | (VV = (3  :  int))}3 (GHC.Types.Int)
-> {VV : (GHC.Types.Int) | (VV != 0)} -> (GHC.Types.Int)`divide` {VV : (GHC.Types.Int) | (VV = x)}x
```

Nanothief correctly argues that the code is clearly safe as `abz x == 0` being false implies `x /= 0`. Indeed, the code is safe, however, the reason that LiquidHaskell rejected it has nothing to do with its inability to “track the constraints of values based on tests using new values derived from that value” as Nanothief surmised, but instead, because LiquidHaskell supports modular verification where the only thing known about `abz` at a use site is whatever is specified in its type.

Concretely speaking, the type

```64: abz :: Int -> {v: Int | 0 <= v }
```

is too anemic to verify `f` above, as it tells us nothing about the relationship between the output and input – looking at it, we have now way of telling that when the output (of `abz`) is non-zero, the input must also have been non-zero.

Instead, we can write a stronger type which does capture this information, for example

```73: abz :: x:Int -> {v:Int | v = (if (x > 0) then x else (0 - x))}
```

where

```77: v = (if p then e1 else e2)
```

is an abbreviation for the formula

```81: (p => v == e1) && ((not p) => v = e2)
```

With this specification for `abz`, LiquidHaskell is able to reason that when `abz x` is non-zero, `x` is also non-zero. Of course, `abz` is trivial enough that we can very precisely capture its exact semantics in the refinement type, but thats is rarely the case.

Nevertheless, even here, you could write a somewhat weaker specification, that still had enough juice to allow the verification of the `divide` call in `f`. In particular, we might write

```94: {-@ abz :: x:Int -> {v:Int | ((0 <= v) && ((v = 0) <=> (x = 0))) } @-}
```

which states the output is `0` if and only if the input is `0`. LiquidHaskell will happily verify that `abz` implements this specification, and will use it to verify the safety of `f` above.

(BTW, follow the link above to demo this code yourself.)

## How to tell a Fib

Chris Done asked why LiquidHaskell refused to verify the following definition of `fib`.

```110: {-@ fib :: n:Int -> { b:Int | (n >= 0 && b >= n) } @-}
111: fib :: Int -> Int
112: n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib 0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}0
113: fib 1 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
114: fib n = n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (1  :  int))}1) x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x + y))}+ n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (2  :  int))}2)
```

Indeed, the both the specification and the implementation look pretty legit, so what gives? It turns out that there are two different reasons why.

Reason 1: Assumptions vs. Guarantees

What we really want to say over here is that the input `n` is non-negative. However, putting the refinement `n >= 0` in the output constraint means that it becomes something that LiquidHaskell checks that the function `fib` guarantees (or ensures). That is, the type states that we can pass `fib` any value `n` (including negative values) and yet, `fib` must return values `b` such that `b >= n` and `n >= 0`.

The latter requirement is a rather tall order when an arbitrary `n` is passed in as input. `fib` can make no such guarantees since it was given the value `n` as a parameter. The only way `n` could be non-negative was that if the caller had sent in a non-negative value. Thus, we want to put the burden of proof on the right entity here, namely the caller.

To assign the burden of proof appropriately, we place the non-negative refinement on the input type

```142: {-@ fib' :: n:{v:Int | v >= 0} -> {b:Int | (n >= 0 && b >= n) } @-}
143: fib' :: Int -> Int
144: n:{VV : (GHC.Types.Int) | (VV >= 0)}
-> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib' 0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}0
145: fib' 1 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
146: fib' n = n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ({VV : (GHC.Types.Int) | (VV >= 0)}nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (1  :  int))}1) x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x + y))}+ n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= n),(n >= 0)}fib ({VV : (GHC.Types.Int) | (VV >= 0)}nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (2  :  int))}2)
```

where now at calls to `fib'` LiquidHaskell will check that the argument is non-negative, and furthermore, when checking `fib'` LiquidHaskell will assume that the parameter `n` is indeed non-negative. So now the constraint `n >= 0` on the output is somewhat redundant, and the non-negative `n` guarantee holds trivially.

Reason 2: The Specification is a Fib

If you run the above in the demo, you will see that LiquidHaskell still doth protest loudly, and frankly, one might start getting a little frustrated at the stubbornness and petulance of the checker.

However, if you stare at the implementation, you will see that it in fact, does not meet the specification, as

```162: fib' 2  == fib' 1 + fib' 0
163:         == 0 + 1
164:         == 1
```

LiquidHaskell is reluctant to prove things that are false. Rather than anthropomorphize frivolously, lets see why it is unhappy.

First, recall the somewhat simplified specification

```171: fib' :: n:Int -> { b:Int | (b >= n) }
```

As we saw in the discussion about `abz`, at each recursive callsite the only information LiquidHaskell uses about the returned value, is that described in the output type for that function call.

Thus, LiquidHaskell reasons that the expression:

```179: fib' (n-1) + fib' (n-2)
```

has the type

```183: {b: Int | exists b1, b2. b  == b1 + b2
184:                       && b1 >= n-1
185:                       && b2 >= n-2     }
```

where the `b1` and `b2` denote the values returned by the recursive calls — we get the above by plugging the parameters `n-1` and `n-2` in for the parameter `n` in output type for `fib'`.

The SMT solver simplifies the above to

```193: {b: Int | b >= 2n - 3}
```

Finally, to check the output guarantee is met, LiquidHaskell asks the SMT solver to prove that

```197: (b >= 2n - 2)  =>  (b >= n)
```

The SMT solver will refuse, of course, since the above implication is not valid (e.g. when `n` is `2`) Thus, via SMT, LiquidHaskell proclaims that the function `fib'` does not implement the advertised type and hence marks it unsafe.

## Fixing The Code

How then, do we get Chris’ specification to work out? It seems like it should hold (except for that pesky case where `n=2`. Indeed, let’s rig the code, so that all the base cases return `1`.

```213: {-@ fibOK :: n:Int -> {b:Int | ((b >= n) && (b >= 1))} @-}
214: fibOK :: Int -> Int
215: n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= 1),(VV >= n)}fibOK 0 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
216: fibOK 1 = x:(GHC.Prim.Int#) -> {VV : (GHC.Types.Int) | (VV = (x  :  int))}1
217: fibOK n = n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= 1),(VV >= n)}fibOK ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (1  :  int))}1) x:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x + y))}+ n:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV >= 1),(VV >= n)}fibOK ((GHC.Types.Int)nx:(GHC.Types.Int)
-> y:(GHC.Types.Int) -> {VV : (GHC.Types.Int) | (VV = (x - y))}-{VV : (GHC.Types.Int) | (VV = (2  :  int))}2)
```

Here’ we specify that not only is the output greater than the input, it is also greater than `1`.

Now in the recursive case, LiquidHaskell reasons that the value being output is

```224: {b: Int | exists b1, b2. b  == b1 + b2
225:                       && b1 >= n-1 && b1 >= 1
226:                       && b2 >= n-2 && b2 >= 1 }
```

which reduces to

```230: {b: Int | b = 2n - 3 && n >= 2 }
```

which, the SMT solver is happy to verify, is indeed a subtype

```234: of (i.e. implies the refinement of) the specified output
235: {b: Int | b >= n && b >= 1 }
```

## Conclusion

There are several things to take away.

1. We need to distinguish between assumptions and guarantees when writing specifications for functions.

2. For modularity, LiquidHaskell, like every type system, uses only the (refinement) type of each function at each use site, and not the function’s body.

3. Some seemingly intuitive specifications often aren’t; in future work it would be useful to actually generate tests as counterexamples that illustrate when a specification fails.