324 lines
11 KiB
Plaintext
324 lines
11 KiB
Plaintext
Title: Experimental, optimized route dispatch code
|
|
|
|
Let's start with our module declaration and imports.
|
|
|
|
> module Yesod.Routes.Dispatch
|
|
> ( Piece (..)
|
|
> , Route (..)
|
|
> , Dispatch
|
|
> , toDispatch
|
|
> ) where
|
|
>
|
|
> import Data.Text (Text)
|
|
> import qualified Data.Vector as V
|
|
> import Data.Maybe (fromMaybe, mapMaybe)
|
|
> import qualified Data.Map as Map
|
|
> import Data.List (sortBy)
|
|
> import Data.Ord (comparing)
|
|
> import Control.Arrow (second)
|
|
> import Control.Exception (assert)
|
|
|
|
This module provides an efficient routing system. The code is pure, requires no
|
|
fancy extensions, has no Template Haskell involved and is not Yesod specific.
|
|
It does, however, assume a routing system similar to that of Yesod.
|
|
|
|
Routing works based on splitting up a path into its components. This is handled
|
|
very well by both the web-routes and http-types packages, and this module does
|
|
not duplicate that functionality. Instead, it assumes that the requested path
|
|
will be provided as a list of 'Text's.
|
|
|
|
A route will be specified by a list of pieces (using the 'Piece' datatype).
|
|
|
|
> data Piece = Static Text | Dynamic
|
|
|
|
Each piece is either a static piece- which is required to match a component of
|
|
the path precisely- or a dynamic piece, which will match any component.
|
|
Additionally, a route can optionally match all remaining components in the
|
|
path, or fail if extra components exist.
|
|
|
|
Usually, the behavior of dynamic is not what you really want. Often times, you
|
|
will want to match integers, or slugs, or some other limited format. This
|
|
brings us nicely to the dispatch function. Each route provides a function of
|
|
type:
|
|
|
|
> type Dispatch res = [Text] -> Maybe res
|
|
|
|
The res argument is application-specific. For example, in a simple
|
|
WAI application, it could be the Application datatype. The important
|
|
thing to point out about Dispatch is that is takes a list of 'Text's and
|
|
returns its response in a Maybe. This gives you a chance to have
|
|
finer-grained control over how individual components are parsed. If you don't
|
|
want to deal with it, you return 'Nothing' and routing continues.
|
|
|
|
Note: You do *not* need to perform any checking on your static pieces, this
|
|
module handles that for you automatically.
|
|
|
|
So each route is specified by:
|
|
|
|
> data Route res = Route
|
|
> { rhPieces :: [Piece]
|
|
> , rhHasMulti :: Bool
|
|
> , rhDispatch :: Dispatch res
|
|
> }
|
|
|
|
Your application needs to provide this module with a list of routes, and then
|
|
this module will give you back a new dispatch function. In other words:
|
|
|
|
> toDispatch :: [Route res] -> Dispatch res
|
|
> toDispatch rhs =
|
|
> bcToDispatch bc
|
|
> where
|
|
> bc = toBC rhs
|
|
|
|
In addition to the requirements listed above for routing, we add one extra
|
|
rule: your specified list of routes is treated as ordered, with the earlier
|
|
ones matching first. If you have an overlap between two routes, the first one
|
|
will be dispatched.
|
|
|
|
The simplest approach would be to loop through all of your routes and compare
|
|
against the path components. But this has linear complexity. Many existing
|
|
frameworks (Rails and Django at least) have such algorithms, usually based on
|
|
regular expressions. But we can provide two optimizations:
|
|
|
|
* Break up routes based on how many components they can match. We can then
|
|
select which group of routes to continue testing. This lookup runs in
|
|
constant time.
|
|
|
|
* Use a Map to reduce string comparisons for each route to logarithmic
|
|
complexity.
|
|
|
|
Let's start with the first one. Each route has a fixed number of pieces. Let's
|
|
call this *n*. If that route can also match trailing components (rhHasMulti
|
|
above), then it will match *n* and up. Otherwise, it will match specifically on
|
|
*n*.
|
|
|
|
If *max(n)* is the maximum value of *n* for all routes, what we need is
|
|
(*max(n)* + 2) groups: a zero group (matching a request for the root of the
|
|
application), 1 - *max(n)* groups, and a final extra group containing all
|
|
routes that can match more than *max(n)* components. This group will consist of
|
|
all the routes with rhHasMulti, and only those routes.
|
|
|
|
> data ByCount res = ByCount
|
|
> { bcVector :: !(V.Vector (PieceMap res))
|
|
> , bcRest :: !(PieceMap res)
|
|
> }
|
|
|
|
We haven't covered PieceMap yet; it is used for the second optimization. We'll
|
|
discuss it below.
|
|
|
|
The following function breaks up a list of routes into groups. Again, please
|
|
ignore the PieceMap references for the moment.
|
|
|
|
> toBC :: [Route res] -> ByCount res
|
|
> toBC rhs =
|
|
> ByCount
|
|
> { bcVector = groups
|
|
> , bcRest = allMultis
|
|
> }
|
|
> where
|
|
|
|
Determine the value of *max(n)*.
|
|
|
|
> maxLen
|
|
> | null rhs = 0
|
|
> | otherwise = maximum $ map (length . rhPieces) rhs
|
|
|
|
Get the list of all routes which can have multis. This will make up the *rest*
|
|
group.
|
|
|
|
> allMultis = toPieceMap maxLen $ filter rhHasMulti rhs
|
|
|
|
And now get all the numbered groups. For each group, we need to get all routes
|
|
with *n* components, __and__ all routes with less than *n* components and that
|
|
have rhHasMulti set to True.
|
|
|
|
> groups = V.map group $ V.enumFromN 0 (maxLen + 1)
|
|
> group i = toPieceMap i $ filter (canHaveLength i) rhs
|
|
>
|
|
> canHaveLength :: Int -> Route res -> Bool
|
|
> canHaveLength i rh =
|
|
> len == i || (len < i && rhHasMulti rh)
|
|
> where
|
|
> len = length $ rhPieces rh
|
|
|
|
Next we'll set up our routing by maps. What we need is a bunch of nested Maps.
|
|
For example, if we have the following routings:
|
|
|
|
/foo/bar/1
|
|
/foo/baz/2
|
|
|
|
We would want something that looks vaguely like:
|
|
|
|
/foo
|
|
/bar
|
|
/1
|
|
/baz
|
|
/2
|
|
|
|
But there's an added complication: we need to deal with dynamic compnents and HasMulti as well. So what we'd really have is routes looking like:
|
|
|
|
/foo/bar/1
|
|
/foo/baz/2
|
|
/*dynamic*/bin/3
|
|
/multi/*bunch of multis*
|
|
|
|
We can actually simplify away the multi business. Remember that for each group,
|
|
we will have a fixed number of components to match. In the list above, it's
|
|
three. Even though the last route only has one component, we can actually just
|
|
fill up the missing components with *dynamic*, which will give the same result
|
|
for routing. In other words, we'll treat it as:
|
|
|
|
/foo
|
|
/bar
|
|
/1
|
|
/baz
|
|
/2
|
|
/*dynamic*
|
|
/bin
|
|
/3
|
|
/multi
|
|
/*dynamic*
|
|
/*dynamic*
|
|
|
|
What we need is then two extra features on our datatype:
|
|
|
|
* Support both a 'Map Text PieceMap' for static pieces, and a general
|
|
'PieceMap' for all dynamic pieces.
|
|
|
|
* An extra constructive after we've gone three levels deep, to provide all
|
|
matching routes.
|
|
|
|
What we end up with is:
|
|
|
|
> data PieceMap res = PieceMap
|
|
> { pmDynamic :: PieceMap res
|
|
> , pmStatic :: Map.Map Text (PieceMap res)
|
|
> } | PieceMapEnd [(Int, Dispatch res)]
|
|
|
|
Note that the PieceMapEnd is a list of pairs, including an Int. Since the map
|
|
process will confuse the original order of our routes, we need some way to get
|
|
that back to make sure overlapping is handled correctly.
|
|
|
|
We'll need two pieces of information to make a PieceMap: the depth to drill
|
|
down to, and the routes in the current group. We'll immediately zip up those
|
|
routes with an Int to indicate route priority.
|
|
|
|
> toPieceMap :: Int -> [Route res] -> PieceMap res
|
|
> toPieceMap depth = toPieceMap' depth . zip [1..]
|
|
>
|
|
> toPieceMap' :: Int
|
|
> -> [(Int, Route res)]
|
|
> -> PieceMap res
|
|
|
|
The stopping case: we've exhausted the full depth, so let's put together a
|
|
PieceMapEnd. Technically speaking, the sort here is unnecessary, since we'll
|
|
sort again later. However, that second sorting occurs during each dispatch
|
|
occurrence, whereas this sorting only occurs once, in the initial construction
|
|
of the PieceMap. Therefore, we presort here.
|
|
|
|
> toPieceMap' 0 rhs =
|
|
> PieceMapEnd $ map (second rhDispatch)
|
|
> $ sortBy (comparing fst) rhs
|
|
|
|
Note also that we apply rhDispatch to the route. We are no longer interested in
|
|
the rest of the route information, so it can be discarded.
|
|
|
|
Now the heart of this algorithm: we construct the pmDynamic and pmStatic
|
|
records. For both, we recursively call toPieceMap' again, with the depth
|
|
knocked down by 1.
|
|
|
|
> toPieceMap' depth rhs = PieceMap
|
|
> { pmDynamic = toPieceMap' depth' dynamics
|
|
> , pmStatic = Map.map (toPieceMap' depth') statics
|
|
> }
|
|
> where
|
|
> depth' = depth - 1
|
|
|
|
We turn our list of routes into a list of pairs. The first item in the pair
|
|
gives the next piece, and the second gives the route again, minus that piece.
|
|
|
|
> pairs = map toPair rhs
|
|
> toPair (i, Route (p:ps) b c) = (p, (i, Route ps b c))
|
|
|
|
And as we mentioned above, for multi pieces we fill in the remaining pieces
|
|
with Dynamic.
|
|
|
|
> toPair (i, Route [] b c) = assert b (Dynamic, (i, Route [] b c))
|
|
|
|
Next, we break up our list of dynamics.
|
|
|
|
> getDynamic (Dynamic, rh) = Just rh
|
|
> getDynamic _ = Nothing
|
|
> dynamics = mapMaybe getDynamic pairs
|
|
|
|
And now we make a Map for statics. Note that Map.fromList would not be
|
|
appropriate here, since it would only keep one route per Text.
|
|
|
|
> getStatic (Static t, rh) = Just $ Map.singleton t [rh]
|
|
> getStatic _ = Nothing
|
|
> statics = Map.unionsWith (++) $ mapMaybe getStatic pairs
|
|
|
|
The time has come to actually dispatch.
|
|
|
|
> bcToDispatch :: ByCount res -> Dispatch res
|
|
> bcToDispatch (ByCount vec rest) ts0 =
|
|
> bcToDispatch' ts0 pm0
|
|
> where
|
|
|
|
Get the PieceMap for the appropriate group. If the length of the requested path
|
|
is greater than *max(n)*, then use the "rest" group.
|
|
|
|
> pm0 = fromMaybe rest $ vec V.!? length ts0
|
|
|
|
Stopping case: we've found our list of routes. Sort them, then starting
|
|
applying their dispatch functions. If the first one returns Nothing, go to the
|
|
next, and so on.
|
|
|
|
> bcToDispatch' _ (PieceMapEnd r) = firstJust (\f -> f ts0) $ map snd r
|
|
|
|
For each component, get the static PieceMap and the dynamic one, combine them
|
|
together, and then continue dispatching.
|
|
|
|
> bcToDispatch' (t:ts) (PieceMap dyn sta) = bcToDispatch' ts $
|
|
> case Map.lookup t sta of
|
|
> Nothing -> dyn
|
|
> Just pm -> append dyn pm
|
|
|
|
Handle an impossible case that should never happen.
|
|
|
|
> bcToDispatch' [] _ = assert False Nothing
|
|
|
|
Helper function: get the first Just response.
|
|
|
|
> firstJust :: (a -> Maybe b) -> [a] -> Maybe b
|
|
> firstJust _ [] = Nothing
|
|
> firstJust f (a:as) = maybe (firstJust f as) Just $ f a
|
|
|
|
Combine two PieceMaps together.
|
|
|
|
> append :: PieceMap res -> PieceMap res -> PieceMap res
|
|
|
|
At the end, just combine the list of routes. But we combine them in such a way
|
|
so as to preserve their order. Since a and b come presorted (as mentioned
|
|
above), we can just merge the two lists together in linear time.
|
|
|
|
> append (PieceMapEnd a) (PieceMapEnd b) = PieceMapEnd $ merge a b
|
|
|
|
Combine the dynamic and static portions of the maps.
|
|
|
|
> append (PieceMap a x) (PieceMap b y) =
|
|
> PieceMap (append a b) (Map.unionWith append x y)
|
|
|
|
An impossible case.
|
|
|
|
> append _ _ = assert False $ PieceMapEnd []
|
|
|
|
Our O(n) merge.
|
|
|
|
> merge :: Ord a => [(a, b)] -> [(a, b)] -> [(a, b)]
|
|
> merge x [] = x
|
|
> merge [] y = y
|
|
> merge x@(a@(ai, _):xs) y@(b@(bi, _):ys)
|
|
> | ai < bi = a : merge xs y
|
|
> | otherwise = b : merge x ys
|