LiquidHaskell

Refinement Types via SMT and Predicate Abstraction

Last time, we showed how to reason about Unicode and a variable-width encoding of Chars when consuming a Text value, today we’ll look at the same issue from the perspective of building a Text.

We mentioned previously that text uses stream fusion to optimize multiple loops over a Text into a single loop; as a result many of the top-level API functions are simple wrappers around equivalent functions over Streams. The creation of Text values is then largely handled by a single function, unstream, which converts a Stream into a Text.

217: unstream :: Stream Char -> Text
218: (TextWrite.Stream GHC.Types.Char) -> TextWrite.Textunstream (Stream next0 s0 len) = (forall s. (GHC.ST.ST s TextWrite.Text)) -> TextWrite.TextrunST ((forall s. (GHC.ST.ST s TextWrite.Text)) -> TextWrite.Text)
-> (forall s. (GHC.ST.ST s TextWrite.Text)) -> TextWrite.Text$ do
219:   let {x2 : GHC.Types.Int | (x2 >= 0)}mlen = x1:{x6 : GHC.Types.Int | (x6 >= 0)}
-> x2:TextWrite.Size
-> {x3 : GHC.Types.Int | (x3 == (if ((isUnknown x2)) then x1 else (getSize x2))) && (x3 >= 0)}upperBound {x2 : GHC.Types.Int | (x2 == (4  :  int))}4 {x3 : TextWrite.Size | (x3 == len) && ((getSize x3) >= 0)}len
220:   {VV : (TextWrite.MArray a) | ((malen VV) == mlen)}arr0 <- x1:{x5 : GHC.Types.Int | (x5 >= 0)}
-> (GHC.ST.ST a {x3 : (TextWrite.MArray a) | ((malen x3) == x1)})new {x3 : GHC.Types.Int | (x3 == mlen) && (x3 >= 0)}mlen
221:   let x1:(TextWrite.MArray a)
-> x2:{VV : GHC.Types.Int | (VV == (malen x1)) && (VV >= 0) && (VV >= mlen)}
-> b
-> {VV : GHC.Types.Int | (VV >= 0) && (VV <= x2) && (VV <= (malen x1)) && ((mlen + VV) <= (malen x1))}
-> (GHC.ST.ST a TextWrite.Text)outer (TextWrite.MArray a)arr {VV : GHC.Types.Int | (VV == (malen arr)) && (VV >= 0) && (VV >= mlen)}top = a
-> {x6 : GHC.Types.Int | (x6 >= 0) && (x6 <= top) && (x6 <= (malen arr))}
-> (GHC.ST.ST b TextWrite.Text)loop
222:        where
223:         a
-> {VV : GHC.Types.Int | (VV >= 0) && (VV <= top) && (VV <= (malen arr))}
-> (GHC.ST.ST b TextWrite.Text)loop !as !{VV : GHC.Types.Int | (VV >= 0) && (VV <= top) && (VV <= (malen arr))}i =
224:             case a -> (TextWrite.Step a GHC.Types.Char)next0 {VV : a | (VV == s)}s of
225:               Done          -> do
226:                 {VV : TextWrite.Array | ((alen VV) == (malen arr))}arr' <- x1:(TextWrite.MArray a)
-> (GHC.ST.ST a {x3 : TextWrite.Array | ((alen x3) == (malen x1))})unsafeFreeze {x2 : (TextWrite.MArray a) | (x2 == arr)}arr
227:                 TextWrite.Text -> (GHC.ST.ST a TextWrite.Text)return (TextWrite.Text -> (GHC.ST.ST a TextWrite.Text))
-> TextWrite.Text -> (GHC.ST.ST a TextWrite.Text)$! x1:TextWrite.Array
-> x2:{x10 : GHC.Types.Int | (x10 >= 0) && (x10 <= (alen x1))}
-> x3:{x7 : GHC.Types.Int | (x7 >= 0) && ((x7 + x2) <= (alen x1))}
-> {x4 : TextWrite.Text | ((tarr x4) == x1) && ((tlen x4) == x3) && ((toff x4) == x2)}Text {x3 : TextWrite.Array | (x3 == arr') && ((alen x3) == (malen arr))}arr' {x2 : GHC.Types.Int | (x2 == (0  :  int))}0 {x5 : GHC.Types.Int | (x5 == i) && (x5 >= 0) && (x5 <= top) && (x5 <= (malen arr))}i
228:               Skip s'       -> a
-> {VV : GHC.Types.Int | (VV >= 0) && (VV <= top) && (VV <= (malen arr))}
-> (GHC.ST.ST b TextWrite.Text)loop {VV : a | (VV == s')}s' {x5 : GHC.Types.Int | (x5 == i) && (x5 >= 0) && (x5 <= top) && (x5 <= (malen arr))}i
229:               Yield x s'
230:                 | {x6 : GHC.Types.Int | (((ord x) >= 65536) => (x6 == (i + 1))) && (((ord x) < 65536) => (x6 == i)) && (x6 == j) && (x6 >= 0) && (x6 >= i)}j x1:GHC.Types.Int
-> x2:GHC.Types.Int
-> {x2 : GHC.Types.Bool | (((Prop x2)) <=> (x1 >= x2))}>= {x5 : GHC.Types.Int | (x5 == top) && (x5 == (malen arr)) && (x5 >= 0) && (x5 >= mlen)}top  -> do
231:                   let {x2 : GHC.Types.Int | (x2 >= 0)}top' = ({x5 : GHC.Types.Int | (x5 == top) && (x5 == (malen arr)) && (x5 >= 0) && (x5 >= mlen)}top x1:GHC.Types.Int
-> x2:GHC.Types.Int -> {x4 : GHC.Types.Int | (x4 == (x1 + x2))}+ {x2 : GHC.Types.Int | (x2 == (1  :  int))}1) x1:{x7 : GHC.Types.Int | (x7 >= 0)}
-> x2:{x5 : GHC.Types.Int | (x5 >= 0)}
-> {x3 : GHC.Types.Int | ((x2 == 1) => (x3 == (x1 * 2))) && (x3 >= 0)}`shiftL` {x2 : GHC.Types.Int | (x2 == (1  :  int))}1
232:                   {VV : (TextWrite.MArray a) | ((malen VV) == top') && (VV /= arr)}arr' <- x1:{x5 : GHC.Types.Int | (x5 >= 0)}
-> (GHC.ST.ST a {x3 : (TextWrite.MArray a) | ((malen x3) == x1)})new {x3 : GHC.Types.Int | (x3 == top') && (x3 >= 0)}top'
233:                   x1:(TextWrite.MArray a)
-> x2:{x13 : GHC.Types.Int | (x13 >= 0) && (x13 <= (malen x1))}
-> x3:(TextWrite.MArray a)
-> x4:{x9 : GHC.Types.Int | (x9 >= 0) && (x9 <= (malen x3))}
-> {x6 : GHC.Types.Int | (x6 >= 0) && ((x2 + x6) <= (malen x1)) && ((x4 + x6) <= (malen x3))}
-> (GHC.ST.ST a ())copyM {x4 : (TextWrite.MArray a) | (x4 == arr') && ((malen x4) == top') && (x4 /= arr)}arr' {x2 : GHC.Types.Int | (x2 == (0  :  int))}0 {x2 : (TextWrite.MArray a) | (x2 == arr)}arr {x2 : GHC.Types.Int | (x2 == (0  :  int))}0 {x5 : GHC.Types.Int | (x5 == top) && (x5 == (malen arr)) && (x5 >= 0) && (x5 >= mlen)}top
234:                   x1:(TextWrite.MArray a)
-> x2:{VV : GHC.Types.Int | (VV == (malen x1)) && (VV >= 0) && (VV >= mlen)}
-> b
-> {VV : GHC.Types.Int | (VV >= 0) && (VV <= x2) && (VV <= (malen x1)) && ((mlen + VV) <= (malen x1))}
-> (GHC.ST.ST a TextWrite.Text)outer {x4 : (TextWrite.MArray a) | (x4 == arr') && ((malen x4) == top') && (x4 /= arr)}arr' {x3 : GHC.Types.Int | (x3 == top') && (x3 >= 0)}top' {VV : a | (VV == s)}s {x5 : GHC.Types.Int | (x5 == i) && (x5 >= 0) && (x5 <= top) && (x5 <= (malen arr))}i
235:                 | otherwise -> do
236:                   {VV : GHC.Types.Int | (VV >= 0) && (VV <= top) && (VV <= (malen arr)) && ((i + VV) <= (malen arr))}d <- x1:(TextWrite.MArray a)
-> x2:{x9 : GHC.Types.Int | (x9 >= 0)}
-> {x7 : GHC.Types.Char | (((ord x7) >= 65536) => ((x2 + 2) <= (malen x1))) && (((ord x7) < 65536) => ((x2 + 1) <= (malen x1)))}
-> (GHC.ST.ST a {x4 : GHC.Types.Int | (x4 >= 0) && ((x2 + x4) <= (malen x1))})writeChar {x2 : (TextWrite.MArray a) | (x2 == arr)}arr {x5 : GHC.Types.Int | (x5 == i) && (x5 >= 0) && (x5 <= top) && (x5 <= (malen arr))}i {x2 : GHC.Types.Char | (x2 == x)}x
237:                   a
-> {VV : GHC.Types.Int | (VV >= 0) && (VV <= top) && (VV <= (malen arr))}
-> (GHC.ST.ST b TextWrite.Text)loop {VV : a | (VV == s')}s' ({x5 : GHC.Types.Int | (x5 == i) && (x5 >= 0) && (x5 <= top) && (x5 <= (malen arr))}ix1:GHC.Types.Int
-> x2:GHC.Types.Int -> {x4 : GHC.Types.Int | (x4 == (x1 + x2))}+{x6 : GHC.Types.Int | (x6 == d) && (x6 >= 0) && (x6 <= top) && (x6 <= (malen arr)) && ((i + x6) <= (malen arr))}d)
238:                 where {VV : GHC.Types.Int | (((ord x) >= 65536) => (VV == (i + 1))) && (((ord x) < 65536) => (VV == i)) && (VV >= 0) && (VV >= i)}j | x1:GHC.Types.Char -> {x2 : GHC.Types.Int | (x2 == (ord x1))}ord {x2 : GHC.Types.Char | (x2 == x)}x x1:GHC.Types.Int
-> x2:GHC.Types.Int
-> {x2 : GHC.Types.Bool | (((Prop x2)) <=> (x1 < x2))}< {x2 : GHC.Types.Int | (x2 == (65536  :  int))}0x10000 = {x5 : GHC.Types.Int | (x5 == i) && (x5 >= 0) && (x5 <= top) && (x5 <= (malen arr))}i
239:                         | otherwise       = {x5 : GHC.Types.Int | (x5 == i) && (x5 >= 0) && (x5 <= top) && (x5 <= (malen arr))}i x1:GHC.Types.Int
-> x2:GHC.Types.Int -> {x4 : GHC.Types.Int | (x4 == (x1 + x2))}+ {x2 : GHC.Types.Int | (x2 == (1  :  int))}1
240:   x1:{x22 : (TextWrite.MArray a) | ((malen x22) == mlen)}
-> x2:{x20 : GHC.Types.Int | (x20 == mlen) && (x20 == (malen arr0)) && (x20 == (malen x1)) && (x20 == (if ((isUnknown TextWrite.Unknown)) then mlen else (getSize TextWrite.Unknown))) && (x20 == (if ((isUnknown len)) then mlen else (getSize len))) && (x20 >= 0)}
-> {VV : b | (VV == s0)}
-> {x13 : GHC.Types.Int | (x13 == 0) && (x13 >= 0) && (x13 <= x2) && (x13 <= mlen) && (x13 <= (malen arr0)) && (x13 <= (malen x1)) && ((x2 + x13) <= (malen arr0)) && ((x2 + x13) <= (malen x1)) && ((mlen + x13) <= (malen arr0)) && ((mlen + x13) <= (malen x1))}
-> (GHC.ST.ST a TextWrite.Text)outer {x3 : (TextWrite.MArray a) | (x3 == arr0) && ((malen x3) == mlen)}arr0 {x3 : GHC.Types.Int | (x3 == mlen) && (x3 >= 0)}mlen {VV : a | (VV == s0)}s0 {x2 : GHC.Types.Int | (x2 == (0  :  int))}0

Since we’re focusing on memory safety here we won’t go into detail about how Streams work. Let’s instead jump right into the inner loop and look at the Yield case. Here we need to write a char x into arr, so we compute the maximal index j to which we will write – i.e. if x >= U+10000 then j = i + 1 – and determine whether we can safely write at j. If the write is unsafe we have to allocate a larger array before continuing, otherwise we write x and increment i by xs width.

Since writeChar has to handle writing any Unicode value, we need to assure it that there will always be room to write x into arr, regardless of xs width. Indeed, this is expressed in the type we give to writeChar.

258: {-@ writeChar :: ma:MArray s -> i:Nat -> {v:Char | (Room v ma i)}
259:               -> ST s {v:Nat | (RoomN v ma i)}
260:   @-}

The predicate aliases Room and RoomN express that a character can fit in the array at index i and that there are at least n slots available starting at i respectively.

268: {-@ predicate Room C MA I = (((One C) => (RoomN 1 MA I))
269:                           && ((Two C) => (RoomN 2 MA I)))
270:   @-}
271: {-@ predicate RoomN N MA I = (I+N <= (malen MA)) @-}

The One and Two predicates express that a character will be encoded in one or two 16-bit words, by reasoning about its ordinal value.

279: {-@ predicate One C = ((ord C) <  65536) @-}
280: {-@ predicate Two C = ((ord C) >= 65536) @-}

As with numchars, we leave ord abstract, but inform LiquidHaskell that the ord function does in fact return the ordinal value of the character.

288: {-@ measure ord :: Char -> Int @-}
289: {-@ ord :: c:Char -> {v:Int | v = (ord c)} @-}

Since writeChar assumes that it will never be called unless there is room to write c, it is safe to just split c into 16-bit words and write them into the array.

297: writeChar :: MArray s -> Int -> Char -> ST s Int
298: forall a.
x1:(TextWrite.MArray a)
-> x2:{v : GHC.Types.Int | (v >= 0)}
-> {v : GHC.Types.Char | (((ord v) >= 65536) => ((x2 + 2) <= (malen x1))) && (((ord v) < 65536) => ((x2 + 1) <= (malen x1)))}
-> (GHC.ST.ST a {v : GHC.Types.Int | (v >= 0) && ((x2 + v) <= (malen x1))})writeChar (TextWrite.MArray a)marr {v : GHC.Types.Int | (v >= 0)}i {v : GHC.Types.Char | (((ord v) >= 65536) => ((i + 2) <= (malen marr))) && (((ord v) < 65536) => ((i + 1) <= (malen marr)))}c
299:     | {x3 : GHC.Types.Int | (x3 == n) && (x3 == (ord c))}n x1:GHC.Types.Int
-> x2:GHC.Types.Int
-> {x2 : GHC.Types.Bool | (((Prop x2)) <=> (x1 < x2))}< {x2 : GHC.Types.Int | (x2 == (65536  :  int))}0x10000 = do
300:         x1:(TextWrite.MArray a)
-> {x6 : GHC.Types.Int | (x6 >= 0) && (x6 < (malen x1))}
-> GHC.Word.Word16
-> (GHC.ST.ST a ())unsafeWrite {x2 : (TextWrite.MArray a) | (x2 == marr)}marr {x3 : GHC.Types.Int | (x3 == i) && (x3 >= 0)}i (x1:GHC.Types.Int -> {x2 : GHC.Word.Word16 | (x2 == x1)}fromIntegral {x3 : GHC.Types.Int | (x3 == n) && (x3 == (ord c))}n)
301:         {x26 : GHC.Types.Int | ((m == 1) => (x26 == (hi * 2))) && ((m == 1) => (x26 == (i * 2))) && ((m == 1) => (x26 == (lo * 2))) && ((m == 1) => (x26 == (m * 2))) && ((m == 1) => (x26 == (n * 2))) && (x26 == 1) && (x26 > 0) && (x26 > m) && (x26 <= (malen marr)) && ((i + x26) <= (malen marr)) && ((m + x26) <= (malen marr))}
-> (GHC.ST.ST a {x26 : GHC.Types.Int | ((m == 1) => (x26 == (hi * 2))) && ((m == 1) => (x26 == (i * 2))) && ((m == 1) => (x26 == (lo * 2))) && ((m == 1) => (x26 == (m * 2))) && ((m == 1) => (x26 == (n * 2))) && (x26 == 1) && (x26 > 0) && (x26 > m) && (x26 <= (malen marr)) && ((i + x26) <= (malen marr)) && ((m + x26) <= (malen marr))})return {x2 : GHC.Types.Int | (x2 == (1  :  int))}1
302:     | otherwise = do
303:         x1:(TextWrite.MArray a)
-> {x6 : GHC.Types.Int | (x6 >= 0) && (x6 < (malen x1))}
-> GHC.Word.Word16
-> (GHC.ST.ST a ())unsafeWrite {x2 : (TextWrite.MArray a) | (x2 == marr)}marr {x3 : GHC.Types.Int | (x3 == i) && (x3 >= 0)}i {x2 : GHC.Word.Word16 | (x2 == lo)}lo
304:         x1:(TextWrite.MArray a)
-> {x6 : GHC.Types.Int | (x6 >= 0) && (x6 < (malen x1))}
-> GHC.Word.Word16
-> (GHC.ST.ST a ())unsafeWrite {x2 : (TextWrite.MArray a) | (x2 == marr)}marr ({x3 : GHC.Types.Int | (x3 == i) && (x3 >= 0)}ix1:GHC.Types.Int
-> x2:GHC.Types.Int -> {x4 : GHC.Types.Int | (x4 == (x1 + x2))}+{x2 : GHC.Types.Int | (x2 == (1  :  int))}1) {x2 : GHC.Word.Word16 | (x2 == hi)}hi
305:         {x32 : GHC.Types.Int | ((hi == 1) => (x32 == (hi * 2))) && ((i == 1) => (x32 == (i * 2))) && ((lo == 1) => (x32 == (lo * 2))) && ((m == 1) => (x32 == (m * 2))) && ((n == 1) => (x32 == (hi * 2))) && ((n == 1) => (x32 == (i * 2))) && ((n == 1) => (x32 == (lo * 2))) && ((n == 1) => (x32 == (m * 2))) && ((n == 1) => (x32 == (n * 2))) && (x32 > 0) && (x32 >= 0) && (x32 < n) && (x32 <= (malen marr)) && ((i + x32) <= (malen marr))}
-> (GHC.ST.ST a {x32 : GHC.Types.Int | ((hi == 1) => (x32 == (hi * 2))) && ((i == 1) => (x32 == (i * 2))) && ((lo == 1) => (x32 == (lo * 2))) && ((m == 1) => (x32 == (m * 2))) && ((n == 1) => (x32 == (hi * 2))) && ((n == 1) => (x32 == (i * 2))) && ((n == 1) => (x32 == (lo * 2))) && ((n == 1) => (x32 == (m * 2))) && ((n == 1) => (x32 == (n * 2))) && (x32 > 0) && (x32 >= 0) && (x32 < n) && (x32 <= (malen marr)) && ((i + x32) <= (malen marr))})return {x2 : GHC.Types.Int | (x2 == (2  :  int))}2
306:     where {x2 : GHC.Types.Int | (x2 == (ord c))}n = x1:GHC.Types.Char -> {x2 : GHC.Types.Int | (x2 == (ord x1))}ord {x4 : GHC.Types.Char | (((ord x4) >= 65536) => ((i + 2) <= (malen marr))) && (((ord x4) < 65536) => ((i + 1) <= (malen marr))) && (x4 == c)}c
307:           GHC.Types.Intm = {x3 : GHC.Types.Int | (x3 == n) && (x3 == (ord c))}n x1:GHC.Types.Int
-> x2:GHC.Types.Int -> {x4 : GHC.Types.Int | (x4 == (x1 - x2))}- {x2 : GHC.Types.Int | (x2 == (65536  :  int))}0x10000
308:           GHC.Word.Word16lo = x1:GHC.Types.Int -> {x2 : GHC.Word.Word16 | (x2 == x1)}fromIntegral (GHC.Types.Int -> GHC.Word.Word16)
-> GHC.Types.Int -> GHC.Word.Word16$ GHC.Types.Int({x2 : GHC.Types.Int | (x2 == m)}m GHC.Types.Int -> GHC.Types.Int -> GHC.Types.Int`shiftR` {x2 : GHC.Types.Int | (x2 == (10  :  int))}10) x1:GHC.Types.Int
-> x2:GHC.Types.Int -> {x4 : GHC.Types.Int | (x4 == (x1 + x2))}+ {x2 : GHC.Types.Int | (x2 == (55296  :  int))}0xD800
309:           GHC.Word.Word16hi = x1:GHC.Types.Int -> {x2 : GHC.Word.Word16 | (x2 == x1)}fromIntegral (GHC.Types.Int -> GHC.Word.Word16)
-> GHC.Types.Int -> GHC.Word.Word16$ GHC.Types.Int({x2 : GHC.Types.Int | (x2 == m)}m GHC.Types.Int -> GHC.Types.Int -> GHC.Types.Int.&. {x2 : GHC.Types.Int | (x2 == (1023  :  int))}0x3FF) x1:GHC.Types.Int
-> x2:GHC.Types.Int -> {x4 : GHC.Types.Int | (x4 == (x1 + x2))}+ {x2 : GHC.Types.Int | (x2 == (56320  :  int))}0xDC00

In typical design-by-contract style, we’re putting the burden of proof to establish safety on writeChars caller. Now, scroll back up to unstream and mouse over j to see its inferred type. You should see something like

316: {v:Int | ((ord x >= 65536) => (v == i+1))
317:       && ((ord x <  65536) => (v == i))}

which, combined with the case-split on j >= top, provides the proof that writing x will be safe!

Stay tuned, next time we’ll look at another example of building a Text where LiquidHaskell fails to infer this crucial refinement…