This article follows how I developed an implementation of the Graham scan algorithm in Haskell, missteps and all. I think it’s valuable to see the process that others use, especially in a “weird” language like Haskell. At times, it can seem like most of what you’ve learned about developing software doesn’t apply to Haskell, but I think that’s just because the language allows so much of the scaffolding to be cut away once the software is finished. Reading Haskell code written by the masters can feel like looking at the Sistine chapel and wondering where they got such long paint brushes. This will be a step-by-step implementation as I developed it. I’m still learning too, so don’t take this as an example of excellent Haskell style.

The Graham scan algorithm finds the convex hull of a set of planar points in three steps. First, we find the lower-left point in the set (call it P). Second, we sort the set based on the angle each point makes with P and the x-axis. Third, we go through the sorted list and keep each point which makes a “left turn.”

Data Types

Even before we start, there are some obvious types we need. We’ll need a type to store points. This could just be a tuple, but I think the code will be easier to understand if the point type has accessors. I’ll define Point as


data Point a = Point {px :: a, py :: a}
               deriving Show
 

I parameterized the number type, so our algorithm can be as general as possible. I think we’ll want a type to represent the result of the turn calculation too, so define a turn data type:


data Turn = Left | Right | Colinear
            deriving (Show, Eq)
 

Since I’m redefining Left and Right here, I have to hide Either with import Prelude hiding (Either(..)).

Stubs

I want to start capturing what I know about the Graham scan function in code. First, it will take a set of points and return the points of the convex hull, so


grahamScan :: [Point a] -> [Point a]
 

Next, the algorithm doesn’t work right if there are fewer than three points, so it will generate an error in that case. There are many ways to handle errors in Haskell. Some common methods include putting the function in the Error monad, returning an Either or Maybe, or returning an empty list. I’ll explore other methods in future articles, but since this function is for experimentation I’m going to just raise an error. I’m not even going to bother with a detailed error report, so the function will start like this.


grahamScan points
    | length points < 3 = error "Degenerate"
    | otherwise         = …
 

Now that I know we have valid input, I can get to the meat of the algorithm. First we find the lower left point and sort the point list with it. I’m not sure how to find the lower left point, so I’ll just defer that to a function called findFirstPoint. I’ll use the functions compare and on with a new function angle to create a function that will compare points by their angle. This new comparator is handed to sortBy to sort the list. For now, I set findFirstPoint and angle to undefined so the code will compile. Compile early and often; it helps to keep mistakes from compounding and mutating.


let (firstPoint:rest) = findFirstPoint points
    sortedRest = sortBy (compare `on` (angle firstPoint)) rest
 

The third step in the algorithm is a bit tricky. We need to look at the next three points in the set and decide whether or not to keep the second point, and the last two points have to be tested along with the first point. Since our input list is not circular, there has to be special handling for the end of the list.


loop (a:b:[]) = case turn a b firstPoint of
                  Left -> b : []
                  _    -> []
loop (a:b:c:ps) = case turn a b c of
                    Left -> b : loop (b:c:ps)
                    _    -> loop (a:c:ps)
 

Isn’t that lovely? Haskell is so expressive, the code basically reads like the English description: “Take the next three points in the list, and if they make a left turn keep the middle point. If there are only two points left, tack on the first point.” All that remains is to start the list with the first point and append the rest of the hull. Here is the complete body of the function.


grahamScan points
    | length points < 3 = error "Degenerate"
    | otherwise         =
        let (firstPoint:rest) = findFirstPoint points
            sortedRest = sortBy (compare `on` (angle firstPoint)) rest

            loop (a:b:[]) = case turn a b firstPoint of
                              Left -> b : []
                              _    -> []
            loop (a:b:c:ps) = case turn a b c of
                                Left -> b : loop (b:c:ps)
                                _    -> loop (a:c:ps)

        in firstPoint : loop (firstPoint:sortedRest)
 

Support

Now to define the helper functions I stubbed out before. The angle function is supposed to calculate the angle between ba and the x-axis, but it’s really calculating the cosine of that angle. Don’t tell anybody.


angle a b = dx / len where
    dx = px a – px b
    dy = py a – py b
    len = sqrt (dx * dx + dy * dy)
 

Now the compiler says that the type parameter in our Point has to be in the Floating class for sqrt. Add that to the type declaration of grahamScan.

The function turn is also simple. If the cross product of the two line segments is greater than zero then it is a left turn, while less than zero is a right turn. Equal to zero is colinear, which can go either way depending on what you want from your convex hull. I report ‘em all and let the caller sort it out.


turn a b c = case compare cross 0 of
               GT -> Left
               EQ -> Colinear
               LT -> Right
    where
      cross = x1 * y2 – x2 * y1
      x1 = px b – px a
      y1 = py b – py a
      x2 = px c – px b
      y2 = py c – py b
 

I used compare here so I could use case instead of nested if statements, or a function with guards. I don’t know what the performance differences are, I just think the code is cleaner like this.

Since this function compares the values of the point coordinates, we’ll need to add the Ord class to the Point parameter. That goes on grahamScan, which now reads


grahamScan :: (Floating a, Ord a) => [Point a] -> [Point a]
 

The last function, firstPoint, is a little tricky. You could define it using two passes, like this:


findFirstPoint points =
  let firstPoint = minimumBy lowerLeftCmp points
      rest = filter (/= firstPoint) points
  in firstPoint:rest
 

However, I chose to do it in one pass by moving points from the input list to an output list. The lower-left point at each step is held back and prepended to the output at the end.


findFirstPoint points
    | null points = error "Null points"
    | otherwise   = loop points [] where
    loop (a:[]) ps = a:ps
    loop (a:b:rest) ps =
        if (py a < py b) || (py a == py b && px a < px b)
        then loop (a:rest) (b:ps)
        else loop (b:rest) (a:ps)
 

And that’s it for a working implementation. I also added a convenience function for converting tuples to Points: point = uncurry Point.

Refinement

Now we can test this out in ghci:


*GrahamScan> grahamScan . map point $ [(2, 3), (1, 1), (-1, 0), (1, -3), (0, 0)]
[Point {px = 1.0, py = -3.0},Point {px = 2.0, py = 3.0},Point {px = -1.0, py = 0.0}]
*GrahamScan> grahamScan . map point $ [(0, 0), (0.5, 0.5), (1, 1), (0, 2), (-1, 1)]
[Point {px = 0.0, py = 0.0},Point {px = 0.5, py = 0.5},Point {px = 0.0, py = 2.0},Point {px = -1.0, py = 1.0}]
 

Uh oh, there’s a bug. The convex hull in the last test should have included (1, 1) but not (0.5, 0.5). The problem is that they are colinear with the lowest point (0, 0), so they have the same angle. They will be sorted in an arbitrary order, but whichever one ends up first will be rejected. To fix this, we’ll need to add the distance to the lower-left point in the angle comparison. If two points have the same angle, the point that is closer should come first in the list. That way, it will be rejected.

To do this, I’ll change angle to return a tuple with the angle and the length. When comparing tuples, the elements n are only compared if the elements n-1 are equal, which is exactly what we want. So, with the subdefinitions the same, angle becomes:


angle a b = (dx / len, len)
 

When I did this, I realized that I could use a tuple in the findFirstPoint comparison too. The expression:


(py a < py b) || (py a == py b && px a < px b)
 

is equivalent to


(py a, px a) < (py b, px b)
 

I think the second is easier to read, and due to laziness the x-coordinates aren’t even calculated unless the y-coordinates are equal.

Final Thoughts

So that’s it really. You can get the final source here. I would be interested to see someone tune the performance on this. It scales about linearly,1 which is expected since the algorithm is O(n log n), but I think that there’s plenty of room for improvement. If you have some suggestions, fork the repository at Github and show me what you can do.

Haskell’s expressive power made the implementation of this algorithm fairly easy. There are several different ways to do it, and I’ll explore some of the options in future articles.


  1. These are timings for the calculation of the convex hull of sets of random vertices. The timings were performed on a 2.1 GHz Intel Core 2 Duo.

    Input Vertices Time (s)
    100 0.013
    200 0.021
    500 0.037
    10000 0.70
    20000 1.4
    50000 3.5
    100000 7.0
    200000 14.3
    500000 37