So far we have mostly discussed LiquidHaskell in the context of
recursive data structures like lists, but there comes a time in
many programs when you have to put down the list and pick up an
array for the sake of performance.
In this series we’re going to examine the text
library, which
does exactly this in addition to having extensive Unicode support.
text
is a popular library for efficient text processing.
It provides the high-level API Haskellers have come to expect while
using stream fusion and byte arrays under the hood to guarantee high
performance.
Suppose we wanted to get the ith Char
of a Text
,
we could write a function1
42: charAt (Text a o l) i = word2char $ unsafeIndex a i
43: where word2char = chr . fromIntegral
which extracts the underlying array, indexes into it, and casts the Word16
to a Char
, using functions exported by text
.
Let’s try this out in GHCi.
51: ghci> let t = T.pack "dog"
52: ghci> charAt t 0
53: 'd'
54: ghci> charAt t 2
55: 'g'
Looks good so far, what happens if we keep going?
59: ghci> charAt t 3
60: '\NUL'
61: ghci> charAt t 100
62: '\8745'
Oh dear, not only did we not get any sort of exception from Haskell, we weren’t
even stopped by the OS with a segfault. This is quite dangerous since we have no
idea what sort of data (private keys?) we just read! To be fair to Bryan
O’Sullivan, we did use a function that was clearly marked unsafe, but I’d
much rather have the compiler throw an error on these last two calls.
In this post we’ll see exactly how prevent invalid memory accesses like this
with LiquidHaskell.