Haskell @ Club De Science - BST



Binary Search Trees

A binary search tree (BST) is a binary tree where each node has a Comparable key (and an associated value) and satisfies the restriction that the key in any node is larger than the keys in all nodes in that node’s left subtree and smaller than the keys in all nodes in that node’s right subtree.

> module BST (
> 	Map,
> 
> 	-- Construction 
> 	empty, singleton,
> 
> 	-- Insertion
> 	insert, insertWith, 
> 
> 	-- Deletion 
> 	delete,
> 
> 	-- Searching
> 	find, findWithDefault
> 
> 	) where

The data type

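A Binary Search Tree is a Map from keys of type k to values of type v. The tree is either empty (Tip) or a node (Bin) that stores its size, a key, a value, and the left and right subtrees.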

> data Map k v = Tip 
>              | Bin Size k v (Map k v) (Map k v)
>              deriving (Show)
> 
> type Size = Int

Construction

How do we construct a BST? One way would be to use the data constructors, e.g.,

> tree2 = Bin 1 2 "two"  Tip Tip
> tree4 = Bin 2 4 "four" tree2 Tip
> tree5 = Bin 3 5 "five" tree4 Tip     -- case left left

Annoying, right? We need to create functions that manipulate these trees!

Write a function that returns the empty tree:

> empty :: Map k a
> empty = Tip

Write a function that, given a key and a value, returns a Map that contains them:

singleton 1 "one" = Bin 1 1 "one" Tip Tip
singleton 2 "two" = Bin 1 2 "two" Tip Tip
> singleton :: k -> a -> Map k a
> singleton k x = Bin 1 k x Tip Tip

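Write a function that returns the size of the tree. In this module the "size" of a tree is its height, i.e. the number of nodes on the longest path from the root down to a Tip; this is the quantity the balancing code below relies on.

size empty              == 0
size (singleton 1 'a')  == 1
size tree4              == 2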
> size :: Map k a -> Int
> size Tip             = 0 
> size (Bin s k v l r) = 1 + (max (size l) (size r))

Insertion

Write a function that inserts the key k and the value v into the tree t

insert 4 "four" (singleton 3 "three") = Bin 2 3 "three" Tip (Bin 1 4 "four" Tip Tip)
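> insert :: Ord k => k -> v -> Map k v -> Map k v
> insert k v Tip = singleton k v
> insert k v (Bin s k1 v1 l r) = case compare k k1 of
>     LT -> balance $ node (insert k v l) r   -- smaller key: go left
>     GT -> balance $ node l (insert k v r)   -- larger key: go right
>     EQ -> Bin s k v l r                     -- existing key: replace the value
>   where
>     -- rebuild the node, recomputing its size field from the new subtrees
>     node l' r' = Bin (1 + max (size l') (size r')) k1 v1 l' r'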

What is the complexity of insert? Can we do better? Yes, if the trees are balanced!
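
To make the question concrete, here is a minimal sketch of what happens without any rebalancing. The names insertPlain, height, node and chain are hypothetical and exist only for this illustration, and the Int field is assumed to cache the height of each node. Inserting the keys 1..n in increasing order yields a chain of height n, so a single insertion can take O(n) steps; in a balanced tree the height, and therefore the cost of an insertion, stays O(log n).

```haskell
-- Same shape as the Map type of the text; here the Int field caches the height.
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving Show

-- Height of a tree, read from the cached field (0 for the empty tree).
height :: Map k v -> Int
height Tip             = 0
height (Bin h _ _ _ _) = h

-- Hypothetical smart constructor that fills in the cached height.
node :: k -> v -> Map k v -> Map k v -> Map k v
node k v l r = Bin (1 + max (height l) (height r)) k v l r

-- Hypothetical helper: BST insertion with no rebalancing at all.
insertPlain :: Ord k => k -> v -> Map k v -> Map k v
insertPlain k v Tip = node k v Tip Tip
insertPlain k v (Bin _ k1 v1 l r) =
  case compare k k1 of
    LT -> node k1 v1 (insertPlain k v l) r
    GT -> node k1 v1 l (insertPlain k v r)
    EQ -> node k v l r          -- existing key: replace the value

-- Inserting keys in increasing order always goes right: a chain of height n.
chain :: Int -> Map Int ()
chain n = foldl (\t k -> insertPlain k () t) Tip [1 .. n]
```

For example, height (chain 1000) evaluates to 1000, whereas a perfectly balanced tree with 1000 keys has height 10.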

A tree is balanced if both the left and the right subtrees have about the same size! Formally, we define the balanceFactor as

balanceFactor = size(left subtree) - size(right subtree)

A tree is balanced if the balanceFactor is between -1 and 1.

Insertion of a node in a balanced tree can change the balance factor by 1, leading to balance factors between -2 and 2. Then one of four rotations may be required, depending on which side has become too tall: left-left, left-right, right-right, or right-left.

Implement the above balancing algorithm:

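> balance :: Map k v -> Map k v
> balance Tip = Tip
> balance t | isLeftRight t  = leftright t
>           | isLeftLeft t   = leftleft t
>           | isRightRight t = rightright t
>           | isRightLeft t  = rightleft t
>           | otherwise      = t   -- already balanced: nothing to do
>
> {- balanceFactor t is size(left subtree) - size(right subtree) -}
> balanceFactor :: Map k v -> Int
> balanceFactor Tip
>   = 0
> balanceFactor (Bin _ _ _ l r)
>   = size l - size r
>
> {- isLeftRight t checks if t is in the leftRight case -}
> isLeftRight :: Map k v -> Bool
> isLeftRight Tip
>   = False
> isLeftRight t@(Bin _ _ _ l _)
>   = balanceFactor l == -1 && balanceFactor t == 2
>
> {- isLeftLeft t checks if t is in the leftLeft case (exercise) -}
> isLeftLeft :: Map k v -> Bool
> isLeftLeft t@(Bin s k v l r)
>   = undefined
>
> {- isRightRight t checks if t is in the rightRight case -}
> isRightRight :: Map k v -> Bool
> isRightRight t@(Bin s k v l r)
>   = (balanceFactor r == -1 || balanceFactor r == 0) && balanceFactor t == (-2)
>
> {- isRightLeft t checks if t is in the rightLeft case (exercise) -}
> isRightLeft :: Map k v -> Bool
> isRightLeft t@(Bin s k v l r)
>   = undefined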
> 
> leftleft (Bin s1 k1 v1 (Bin s2 k2 v2 (Bin s3 k3 v3 l3 r3) r2) r1)
>   = Bin s2' k2 v2 (Bin s3' k3 v3 l3 r3)
>                   (Bin s1' k1 v1 r2 r1)
>   where
>     s1' = 1 + max (size r2) (size r1)
>     s2' = 1 + max s3' s1'
>     s3' = 1 + max (size l3) (size r3)
>
>
> leftright (Bin s1 k1 v1 (Bin s2 k2 v2 l2 (Bin s3 k3 v3 l3 r3)) r1)
>   = Bin s3' k3 v3 (Bin s2' k2 v2 l2 l3)
>                   (Bin s1' k1 v1 r3 r1)
>   where
>     s1' = 1 + max (size r3) (size r1)
>     s2' = 1 + max (size l2) (size l3)
>     s3' = 1 + max s2' s1'
> 
> rightright (Bin s1 k1 v1 l1 (Bin s2 k2 v2 l2 (Bin s3 k3 v3 l3 r3))) 
>   = undefined
> 
> rightleft  = undefined 
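
If you get stuck, here is one possible way to fill in the four undefined pieces above. It is only a sketch, not the official solution: the conditions simply mirror isLeftRight and isRightRight as given in the text, node is a hypothetical helper that recomputes the size field, and the rotations match two levels of the tree and return any other tree unchanged (an assumption made so that they are total).

```haskell
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving (Show, Eq)

-- "Size" as defined in the text: the height of the tree, recomputed from scratch.
size :: Map k v -> Int
size Tip             = 0
size (Bin _ _ _ l r) = 1 + max (size l) (size r)

balanceFactor :: Map k v -> Int
balanceFactor Tip             = 0
balanceFactor (Bin _ _ _ l r) = size l - size r

-- Hypothetical smart constructor that fills in the size field.
node :: k -> v -> Map k v -> Map k v -> Map k v
node k v l r = Bin (1 + max (size l) (size r)) k v l r

-- Left-left: the left subtree is too tall and leans left (or is even).
isLeftLeft :: Map k v -> Bool
isLeftLeft Tip               = False
isLeftLeft t@(Bin _ _ _ l _) =
  balanceFactor t == 2 && (balanceFactor l == 1 || balanceFactor l == 0)

-- Right-left: the right subtree is too tall and leans left.
isRightLeft :: Map k v -> Bool
isRightLeft Tip               = False
isRightLeft t@(Bin _ _ _ _ r) =
  balanceFactor t == (-2) && balanceFactor r == 1

-- Right-right: a single left rotation; the right child becomes the root.
rightright :: Map k v -> Map k v
rightright (Bin _ k1 v1 l1 (Bin _ k2 v2 l2 r2)) =
  node k2 v2 (node k1 v1 l1 l2) r2
rightright t = t

-- Right-left: a double rotation; the right child's left child becomes the root.
rightleft :: Map k v -> Map k v
rightleft (Bin _ k1 v1 l1 (Bin _ k2 v2 (Bin _ k3 v3 l3 r3) r2)) =
  node k3 v3 (node k1 v1 l1 l3) (node k2 v2 r3 r2)
rightleft t = t
```

Both rotations keep the in-order sequence of keys unchanged, which is why the result is still a binary search tree.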

Use balance to improve the complexity of the insert function.

Testing Trees

We need to test if our insert function works properly. To do so, first define a function fromList that turns a list into a Map

> fromList :: Ord k => [(k, v)] -> Map k v
> fromList xs = undefined

and vice versa

> toList :: Map k v -> [(k, v)]
> toList t = undefined
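
As a hint, the two conversions could look like the sketch below. To keep the block self-contained it comes with its own simplified insert (an assumption: no rebalancing, existing keys are overwritten); in the module you would use the real insert instead. fromList folds insert over the list, and toList is an in-order traversal, so the resulting pairs come out sorted by key.

```haskell
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving Show

-- Stand-in for the insert of the text (assumption: no rebalancing; the size
-- field is set to 1 + the larger of the cached subtree sizes).
insert :: Ord k => k -> v -> Map k v -> Map k v
insert k v Tip = Bin 1 k v Tip Tip
insert k v (Bin _ k1 v1 l r) =
  case compare k k1 of
    LT -> node k1 v1 (insert k v l) r
    GT -> node k1 v1 l (insert k v r)
    EQ -> node k v l r
  where
    cached Tip             = 0
    cached (Bin s _ _ _ _) = s
    node k' v' l' r' = Bin (1 + max (cached l') (cached r')) k' v' l' r'

-- Insert the pairs one by one, from left to right, starting from the empty tree.
fromList :: Ord k => [(k, v)] -> Map k v
fromList = foldl (\t (k, v) -> insert k v t) Tip

-- In-order traversal: left subtree, then the node itself, then the right subtree.
toList :: Map k v -> [(k, v)]
toList Tip             = []
toList (Bin _ k v l r) = toList l ++ [(k, v)] ++ toList r
```

For example, toList (fromList [(1, "one"), (5, "five"), (6, "six"), (3, "three")]) gives [(1, "one"), (3, "three"), (5, "five"), (6, "six")].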

Then write the properties you want to check: ordering and balancing

> isOrdered :: Ord k => Map k v -> Bool
> isOrdered Tip             
>   = True 
> isOrdered (Bin _ k v l r) 
>   = isOrdered l && isOrdered r && all (k>) (keys l) && all (k<) (keys r)
> 
> keys :: Map k v -> [k]
> keys = map fst . toList  
> 
> 
> isBalanced :: Ord k => Map k v -> Bool
> isBalanced = undefined
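
One way to write the balancing property, as a sketch: a tree is balanced when the balance factor of every node, not only of the root, lies between -1 and 1. The block repeats size and balanceFactor from the text so that it compiles on its own; note that this version does not need the Ord k constraint.

```haskell
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving Show

-- "Size" as defined in the text: the height of the tree.
size :: Map k v -> Int
size Tip             = 0
size (Bin _ _ _ l r) = 1 + max (size l) (size r)

balanceFactor :: Map k v -> Int
balanceFactor Tip             = 0
balanceFactor (Bin _ _ _ l r) = size l - size r

-- Every node must have a balance factor of -1, 0 or 1.
isBalanced :: Map k v -> Bool
isBalanced Tip               = True
isBalanced t@(Bin _ _ _ l r) =
  abs (balanceFactor t) <= 1 && isBalanced l && isBalanced r
```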

Then come up with lists of pairs, turn them into maps, and test the properties on them

> bst1 = fromList $ [(1, "one"), (5, "five"), (6, "six"), (3, "three")]
> 
> check1 = isOrdered bst1 && isBalanced bst1 

What if the key you want to insert already exists in the tree? Write a function insertWith that combines the old and the new values in the tree:

insertWith (++) 1 "ten & " (fromList [(1, "one")]) = fromList [(1, "ten & one")]
> insertWith :: Ord k => (a -> a -> a) -> k -> a -> Map k a -> Map k a
> insertWith f k v t = undefined 
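
A possible sketch of insertWith, assuming (as in the example above) that the combining function receives the new value first and the old value second. The helpers height and node are hypothetical, and rebalancing is left out to keep the sketch short; in the module you would wrap the rebuilt nodes in balance, as insert does.

```haskell
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving (Show, Eq)

-- Cached size field of a tree (0 for the empty tree).
height :: Map k v -> Int
height Tip             = 0
height (Bin h _ _ _ _) = h

-- Hypothetical smart constructor that recomputes the size field.
node :: k -> v -> Map k v -> Map k v -> Map k v
node k v l r = Bin (1 + max (height l) (height r)) k v l r

-- Insert k with value v; if k is already present, store (f v old) instead.
insertWith :: Ord k => (a -> a -> a) -> k -> a -> Map k a -> Map k a
insertWith _ k v Tip = node k v Tip Tip
insertWith f k v (Bin s k1 v1 l r) =
  case compare k k1 of
    LT -> node k1 v1 (insertWith f k v l) r
    GT -> node k1 v1 l (insertWith f k v r)
    EQ -> Bin s k1 (f v v1) l r   -- same shape, combined value
```

With this definition, insertWith (++) 1 "ten & " (Bin 1 1 "one" Tip Tip) gives Bin 1 1 "ten & one" Tip Tip.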

Deletion

Write a function that deletes the key k from a tree

> delete :: Ord k => k -> Map k v -> Map k v
> delete k t = undefined

Did you rebalance the tree?
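
Deletion is the trickiest operation, so here is a self-contained sketch of one common approach; it is not necessarily the solution this exercise has in mind. It assumes the Int field caches the height, and all helper names (height, node, bf, rotateLeft, rotateRight, glue, deleteMin) are hypothetical. A node with two children is replaced by the smallest binding of its right subtree, and every node on the way back up is rebalanced.

```haskell
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving Show

-- Cached height of a tree (the Int field of Bin; 0 for Tip).
height :: Map k v -> Int
height Tip             = 0
height (Bin h _ _ _ _) = h

-- Hypothetical smart constructor that recomputes the cached height.
node :: k -> v -> Map k v -> Map k v -> Map k v
node k v l r = Bin (1 + max (height l) (height r)) k v l r

-- Balance factor: height of the left subtree minus height of the right one.
bf :: Map k v -> Int
bf Tip             = 0
bf (Bin _ _ _ l r) = height l - height r

rotateRight, rotateLeft :: Map k v -> Map k v
rotateRight (Bin _ k v (Bin _ kl vl ll lr) r) = node kl vl ll (node k v lr r)
rotateRight t                                 = t
rotateLeft (Bin _ k v l (Bin _ kr vr rl rr))  = node kr vr (node k v l rl) rr
rotateLeft t                                  = t

-- Restore the balance of a node whose balance factor is between -2 and 2
-- and whose subtrees are already balanced.
balance :: Map k v -> Map k v
balance t@(Bin _ k v l r)
  | bf t == 2    = rotateRight (if bf l < 0 then node k v (rotateLeft l) r else t)
  | bf t == (-2) = rotateLeft  (if bf r > 0 then node k v l (rotateRight r) else t)
balance t = t

delete :: Ord k => k -> Map k v -> Map k v
delete _ Tip = Tip
delete k (Bin _ k1 v1 l r) =
  case compare k k1 of
    LT -> balance (node k1 v1 (delete k l) r)
    GT -> balance (node k1 v1 l (delete k r))
    EQ -> glue l r

-- Join the two subtrees of a deleted node.
glue :: Map k v -> Map k v -> Map k v
glue Tip r = r
glue l Tip = l
glue l r   = let ((k, v), r') = deleteMin r in balance (node k v l r')

-- Remove and return the smallest binding of a non-empty tree.
deleteMin :: Map k v -> ((k, v), Map k v)
deleteMin (Bin _ k v Tip r) = ((k, v), r)
deleteMin (Bin _ k v l r)   = let (m, l') = deleteMin l in (m, balance (node k v l' r))
deleteMin Tip               = error "deleteMin: empty tree"

-- In-order traversal, handy for checking the result.
toList :: Map k v -> [(k, v)]
toList Tip             = []
toList (Bin _ k v l r) = toList l ++ [(k, v)] ++ toList r
```

Deleting a key that is not in the tree returns the tree with the same bindings, since the search simply ends at a Tip.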

Searching

find k t returns Just v if the binding (k, v) exists in t. Otherwise, it returns Nothing

> find :: Ord k => k -> Map k a -> Maybe a
> find k t = undefined

What is the complexity of find?
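
As a hint, a search could be sketched as follows: compare the key with the key at the current node and continue in the left or right subtree accordingly. Each step descends one level, so the number of comparisons is bounded by the height of the tree: O(log n) when the tree is balanced, and O(n) in the worst case when it is not.

```haskell
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving Show

-- Follow the ordering invariant down the tree until the key or a Tip is found.
find :: Ord k => k -> Map k a -> Maybe a
find _ Tip = Nothing
find k (Bin _ k1 v l r) =
  case compare k k1 of
    LT -> find k l
    GT -> find k r
    EQ -> Just v
```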

Instead of returning a Maybe value, the next function, findWithDefault, takes one more “default” argument; if the key is not found in the tree, it returns the default value!

> findWithDefault :: Ord k => v -> k -> Map k v -> v
> findWithDefault = undefined
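
A short sketch of findWithDefault, built on top of a find like the one described above (repeated here so the block compiles on its own): run the search and fall back to the default when it returns Nothing.

```haskell
data Map k v = Tip | Bin Int k v (Map k v) (Map k v)
  deriving Show

find :: Ord k => k -> Map k v -> Maybe v
find _ Tip = Nothing
find k (Bin _ k1 v l r) =
  case compare k k1 of
    LT -> find k l
    GT -> find k r
    EQ -> Just v

-- Return the value bound to k, or def when k is not in the tree.
findWithDefault :: Ord k => v -> k -> Map k v -> v
findWithDefault def k t =
  case find k t of
    Just v  -> v
    Nothing -> def
```

The same thing can be written more compactly with fromMaybe from Data.Maybe.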