Case Study: MapReduce


Chunk input, map operation (in parallel), and reduce the results.

MapReduce "Library"


Haskell definition and reflection

{-@ reflect mapReduce @-} mapReduce :: Int -> (List a -> b) -> (b -> b -> b) -> List a -> b mapReduce n f op is = reduce op (f N) (map f (chunk n is)) {-@ reflect reduce @-} reduce :: (a -> a -> a) -> a -> List a -> a reduce op b N = b reduce op b (C x xs) = op x (reduce op b xs) {-@ reflect map @-} {-@ map :: (a -> b) -> xs:List a -> {v:List b | llen v == llen xs } @-} map _ N = N map f (C x xs) = f x `C` map f xs {-@ reflect chunk @-} chunk :: Int -> List a -> List (List a)

MapReduce "Client": Summing List

  • Standard List Summing
    {-@ reflect sum @-} sum :: List Int -> Int sum N = 0 sum (C x xs) = x `plus` sum xs
  • Reduction Operation
    {-@ reflect plus @-} plus :: Int -> Int -> Int plus x y = x + y
  • MapReduce List Summing
    {-@ reflect psum @-} psum :: Int -> List Int -> Int psum n is = mapReduce n sum plus is


Question: Is psum equivalent to sum?

Proving Code Equivalence

  • By application of Higher Order Theorem
{-@ automatic-instances sumEq @-} {-@ sumEq :: n:Int -> is:List Int -> { sum is == psum n is } @-} sumEq n is = mRTheorem n -- chunk size sum -- function to map-reduce plus -- reduction operator plusRightId -- plus has "right-identity" sumDistr -- sum is "distributive" is -- input list

Right Identity of plus


  plusRightId :: xs:List Int ->
                   {plus (sum xs) (sum N) == sum xs}

Distributivity of sum


  sumDistr :: xs:List Int -> ys:List Int ->
                {sum (xs ++ ys) == plus (sum xs) (sum ys)}

Higher Order Map Reduce Theorem


If f is right-id and op distributive
Then map-reduce is equivalent to sequential

{-@ mRTheorem :: n:Int -> f:(List a -> b) -> op:(b -> b -> b) -> rightId:(xs:List a -> {op (f xs) (f N) == f xs } ) -> distrib:(xs:List a -> ys:List a -> {f (xs ++ ys) == op (f xs) (f ys)} ) -> is:List a -> { mapReduce n f op is == f is } / [llen is] @-}


Manual Proof (see Appendix)

Right Identity of plus


Exercise: Can you prove plus has right identity?

{-@ plusRightId :: xs:List Int -> {plus (sum xs) (sum N) == sum xs} @-} plusRightId xs = undefined

Warmup: Associativity of Append


Exercise: Can you prove plus has right identity?

{-@ appendAssoc :: xs:List a -> ys:List a -> zs:List a -> { xs ++ (ys ++ zs) == (xs ++ ys) ++ zs } @-} appendAssoc xs ys zs = undefined

Proof Automation: Associativity of Append


Proof Automation Flag

{-@ LIQUID "--automatic-instances=liquidinstanceslocal" @-}
{-@ automatic-instances appendAssocAuto @-} {-@ appendAssocAuto :: xs:List a -> ys:List a -> zs:List a -> { xs ++ (ys ++ zs) == (xs ++ ys) ++ zs } @-} appendAssocAuto N _ _ = trivial appendAssocAuto (C _ xs) ys zs = appendAssocAuto xs ys zs

Distributivity of sum


Exercise: Can you prove distribution of sum?

  • Distribution of sum
    {-@ automatic-instances sumDistr @-} {-@ sumDistr :: xs:List Int -> ys:List Int -> {sum (xs ++ ys) == plus (sum xs) (sum ys)} @-} sumDistr xs ys = undefined

Recap



  • Refinement Reflection: Allow Haskell functions in Logic
  • Case Study: Prove Program Equivalence



Prove crucial properties for Haskell in Haskell!


where Haskell = a general purpose programming language.

Recap


Refinements: Types + Predicates
Specification: Refined Input/Output Types
Verification: SMT-based Predicate Subtyping
Measures: Specify Properties of Data
Termination: Well-founded Metrics
Reflection: Haskell functions in Logic


Evaluation & Conclusion

Appendix: Proof of mRTheorem

mRTheorem n f op rightId _ N = mapReduce n f op N ==. reduce op (f N) (map f (chunk n N)) ==. reduce op (f N) (map f (C N N)) ==. reduce op (f N) (f N `C` map f N ) ==. reduce op (f N) (f N `C` N) ==. op (f N) (reduce op (f N) N) ==. op (f N) (f N) ? rightId N ==. f N *** QED mRTheorem n f op rightId _ is@(C _ _) | n <= 1 || llen is <= n = mapReduce n f op is ==. reduce op (f N) (map f (chunk n is)) ==. reduce op (f N) (map f (C is N)) ==. reduce op (f N) (f is `C` map f N) ==. reduce op (f N) (f is `C` N) ==. op (f is) (reduce op (f N) N) ==. op (f is) (f N) ==. f is ? rightId is *** QED mRTheorem n f op rightId distrib is = mapReduce n f op is ==. reduce op (f N) (map f (chunk n is)) ==. reduce op (f N) (map f (C (take n is) (chunk n (drop n is)))) ==. reduce op (f N) (C (f (take n is)) (map f (chunk n (drop n is)))) ==. op (f (take n is)) (reduce op (f N) (map f (chunk n (drop n is)))) ==. op (f (take n is)) (mapReduce n f op (drop n is)) ==. op (f (take n is)) (f (drop n is)) ? mRTheorem n f op rightId distrib (drop n is) ==. f ((take n is) ++ (drop n is)) ? distrib (take n is) (drop n is) ==. f is ? appendTakeDrop n is *** QED

Append of Take and Drop

{-@ automatic-instances appendTakeDrop @-} {-@ appendTakeDrop :: i:Nat -> xs:{List a | i <= llen xs} -> {xs == (take i xs) ++ (drop i xs) } @-} appendTakeDrop i N = trivial appendTakeDrop i (C x xs) | i == 0 = trivial appendTakeDrop i (C x xs) = appendTakeDrop (i-1) xs

List Definition

Built-in Lists are not supported for now.

(So does imports...)

{-@ data List [llen] a = N | C {lhead :: a, ltail :: List a} @-} data List a = N | C a (List a) {-@ measure llen @-} {-@ llen :: List a -> Nat @-} llen N = 0 llen (C _ xs) = 1 + llen xs

List Manipulation

{-@ chunk :: i:Int -> xs:List a -> {v:List (List a) | if (i <= 1 || llen xs <= i) then (llen v == 1) else (llen v < llen xs) } / [llen xs] @-} chunk i xs | i <= 1 = C xs N | llen xs <= i = C xs N | otherwise = C (take i xs) (chunk i (drop i xs)) {-@ reflect drop @-} {-@ drop :: i:Nat -> xs:{List a | i <= llen xs } -> {v:List a | llen v == llen xs - i } @-} drop i N = N drop i (C x xs) | i == 0 = C x xs | otherwise = drop (i-1) xs {-@ reflect take @-} {-@ take :: i:Nat -> xs:{List a | i <= llen xs } -> {v:List a | llen v == i} @-} take i N = N take i (C x xs) | i == 0 = N | otherwise = C x (take (i-1) xs) {-@ reflect ++ @-} N ++ ys = ys (C x xs) ++ ys = x `C` (xs ++ ys)