Graham Scan in Haskell
filed in Software Development on Sep.06, 2011
This article follows how I developed an implementation of the Graham scan algorithm in Haskell, missteps and all. I think it’s valuable to see the process that others use, especially in a “weird” language like Haskell. At times, it can seem like most of what you’ve learned about developing software doesn’t apply to Haskell, but I think that’s just because the language allows so much of the scaffolding to be cut away once the software is finished. Reading Haskell code written by the masters can feel like looking at the Sistine chapel and wondering where they got such long paint brushes. This will be a stepbystep implementation as I developed it. I’m still learning too, so don’t take this as an example of excellent Haskell style.
The Graham scan algorithm finds the convex hull of a set of planar points in three steps. First, we find the lowerleft point in the set (call it P). Second, we sort the set based on the angle each point makes with P and the xaxis. Third, we go through the sorted list and keep each point which makes a “left turn.”
Data Types
Even before we start, there are some obvious types we need. We’ll
need a type to store points. This could just be a tuple, but I think
the code will be easier to understand if the point type has
accessors. I’ll define Point
as
I parameterized the number type, so our algorithm can be as general as possible. I think we’ll want a type to represent the result of the turn calculation too, so define a turn data type:
Since I’m redefining Left
and Right
here, I have to hide Either
with import Prelude hiding (Either(..))
.
Stubs
I want to start capturing what I know about the Graham scan function in code. First, it will take a set of points and return the points of the convex hull, so
grahamScan :: [Point a] > [Point a]
Next, the algorithm doesn’t work right if there are fewer than three points, so it will generate an error in that case. There are many ways to handle errors in Haskell. Some common methods include putting the function in the Error monad, returning an Either or Maybe, or returning an empty list. I’ll explore other methods in future articles, but since this function is for experimentation I’m going to just raise an error. I’m not even going to bother with a detailed error report, so the function will start like this.
Now that I know we have valid input, I can get to the meat of the
algorithm. First we find the lower left point and sort the point list
with it. I’m not sure how to find the lower left point, so I’ll just defer that to a function called findFirstPoint
. I’ll use the functions compare
and on
with a new function angle
to create a function that will compare points by their angle. This new comparator is handed to sortBy
to sort the list. For now, I set findFirstPoint
and angle
to undefined
so
the code will compile. Compile early and often; it helps to keep
mistakes from compounding and mutating.
let (firstPoint:rest) = findFirstPoint points
sortedRest = sortBy (compare `on` (angle firstPoint)) rest
The third step in the algorithm is a bit tricky. We need to look at the next three points in the set and decide whether or not to keep the second point, and the last two points have to be tested along with the first point. Since our input list is not circular, there has to be special handling for the end of the list.
loop (a:b:[]) = case turn a b firstPoint of
Left > b : []
_ > []
loop (a:b:c:ps) = case turn a b c of
Left > b : loop (b:c:ps)
_ > loop (a:c:ps)
Isn’t that lovely? Haskell is so expressive, the code basically reads like the English description: “Take the next three points in the list, and if they make a left turn keep the middle point. If there are only two points left, tack on the first point.” All that remains is to start the list with the first point and append the rest of the hull. Here is the complete body of the function.
grahamScan points
 length points < 3 = error "Degenerate"
 otherwise =
let (firstPoint:rest) = findFirstPoint points
sortedRest = sortBy (compare `on` (angle firstPoint)) rest
loop (a:b:[]) = case turn a b firstPoint of
Left > b : []
_ > []
loop (a:b:c:ps) = case turn a b c of
Left > b : loop (b:c:ps)
_ > loop (a:c:ps)
in firstPoint : loop (firstPoint:sortedRest)
Support
Now to define the helper functions I stubbed out before. The angle
function is supposed to calculate the angle between ba and the xaxis, but it’s really calculating the cosine of that angle. Don’t tell anybody.
Now the compiler says that the type parameter in our Point
has to be
in the Floating
class for sqrt
. Add that to the type declaration
of grahamScan
.
The function turn
is also simple. If the cross product of the two
line segments is greater than zero then it is a left turn, while less than zero
is a right turn. Equal to zero is colinear, which can go either way
depending on what you want from your convex hull. I report ‘em all and
let the caller sort it out.
turn a b c = case compare cross 0 of
GT > Left
EQ > Colinear
LT > Right
where
cross = x1 * y2 – x2 * y1
x1 = px b – px a
y1 = py b – py a
x2 = px c – px b
y2 = py c – py b
I used compare
here so I could use case
instead of nested if
statements, or a function with guards. I don’t know what the
performance differences are, I just think the code is cleaner like
this.
Since this function compares the values of the point coordinates,
we’ll need to add the Ord
class to the Point
parameter. That goes
on grahamScan
, which now reads
The last function, firstPoint
, is a little tricky. You could define
it using two passes, like this:
findFirstPoint points =
let firstPoint = minimumBy lowerLeftCmp points
rest = filter (/= firstPoint) points
in firstPoint:rest
However, I chose to do it in one pass by moving points from the input list to an output list. The lowerleft point at each step is held back and prepended to the output at the end.
findFirstPoint points
 null points = error "Null points"
 otherwise = loop points [] where
loop (a:[]) ps = a:ps
loop (a:b:rest) ps =
if (py a < py b)  (py a == py b && px a < px b)
then loop (a:rest) (b:ps)
else loop (b:rest) (a:ps)
And that’s it for a working implementation. I also added a
convenience function for converting tuples to Points:
point = uncurry Point
.
Refinement
Now we can test this out in ghci
:
*GrahamScan> grahamScan . map point $ [(2, 3), (1, 1), (1, 0), (1, 3), (0, 0)]
[Point {px = 1.0, py = 3.0},Point {px = 2.0, py = 3.0},Point {px = 1.0, py = 0.0}]
*GrahamScan> grahamScan . map point $ [(0, 0), (0.5, 0.5), (1, 1), (0, 2), (1, 1)]
[Point {px = 0.0, py = 0.0},Point {px = 0.5, py = 0.5},Point {px = 0.0, py = 2.0},Point {px = 1.0, py = 1.0}]
Uh oh, there’s a bug. The convex hull in the last test should have
included (1, 1)
but not (0.5, 0.5)
. The problem is that they are
colinear with the lowest point (0, 0)
, so they have the same angle.
They will be sorted in an arbitrary order, but whichever one ends up
first will be rejected. To fix this, we’ll need to add the distance
to the lowerleft point in the angle comparison. If two points have
the same angle, the point that is closer should come first in the
list. That way, it will be rejected.
To do this, I’ll change angle
to return a tuple with the angle and
the length. When comparing tuples, the elements n
are only compared
if the elements n1
are equal, which is exactly what we want. So,
with the subdefinitions the same, angle becomes:
angle a b = (dx / len, len)
When I did this, I realized that I could use a tuple in the
findFirstPoint
comparison too. The expression:
(py a < py b)  (py a == py b && px a < px b)
is equivalent to
(py a, px a) < (py b, px b)
I think the second is easier to read, and due to laziness the xcoordinates aren’t even calculated unless the ycoordinates are equal.
Final Thoughts
So that’s it really. You can get the final source here. I would be interested to see someone tune the performance on this. It scales about linearly,^{1} which is expected since the algorithm is O(n log n), but I think that there’s plenty of room for improvement. If you have some suggestions, fork the repository at Github and show me what you can do.
Haskell’s expressive power made the implementation of this algorithm fairly easy. There are several different ways to do it, and I’ll explore some of the options in future articles.

These are timings for the calculation of the convex hull of sets of random vertices. The timings were performed on a 2.1 GHz Intel Core 2 Duo.
Input Vertices Time (s) 100 0.013 200 0.021 500 0.037 10000 0.70 20000 1.4 50000 3.5 100000 7.0 200000 14.3 500000 37
Leave a Reply