These files demonstrate an idea about extending foldr to allow the user to specify how the recursion is encoded.
The goal is to make foldl' (and related functions like foldM) fuse well with other list functions. This would make fold/build fusion more powerful by eliminating many more intemediate lists, because it's quite common that the final consumer of a list is an application of foldl'.
Currently foldl' is not a good consumer. To turn it a good consumer, we'd need to define it in terms of foldr. The problem is that such an encoding doesn't result in efficient code, at least under the current GHC optimizer.
The standard way to define foldl' in terms of foldr is:
foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f v xs = foldr g id xs v
where
g x next acc = next $! f x acc
This definition works, but it doesn't result in efficient code. GHC generates something essentially like:
foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f v xs = let
go [] = id
go (x:xs) = (go xs $!) . f x
in go xs acc
The problem here is that go
is defined as a recursive function that
creates a chain of closures, which then are used only once. Note
that this allocates memory, whereas the standard definition of
foldl' doesn't need any allocation (apart from what f
allocates).
The idea is that instead of defining list operations in terms of foldr and build, use a slight generalization of them, namely foldrW and buildW. The definition of foldrW looks like:
-- | A mapping between @a@ and @b@.
data Wrap a b = Wrap (a -> b) (b -> a)
foldrW
:: (forall e. Wrap (f e) (e -> b -> b))
-> (a -> b -> b) -> b -> [a] -> b
foldrW (Wrap wrap unwrap) f z0 list0 = wrap go list0 z0
where
go = unwrap $ \list z' -> case list of
[] -> z'
x:xs -> f x $ wrap go xs z'
foldrW
differs from foldr
in that it takes one extra argument.
That argument defines a mapping between a function type e -> b -> b
and a user-specified type f e
. foldrW
recursively defines a
value of type f [a]
, effectively letting the user choose the representation
of the loop.
Note that this is an example of worker-wrapper transformation.
The corresponding build function is defined thus:
newtype Simple b e = Simple { runSimple :: e -> b -> b }
buildW
:: (forall f r.
(forall e. Wrap (f e) (e -> r -> r))
-> (a -> r -> r)
-> r)
-> [a]
buildW f = f (Wrap runSimple Simple) (:) []
Using foldrW
, foldl'
can be defined as:
newtype Simple b e = Simple { runSimple :: e -> b -> b }
foldl' f initial xs = foldrW (Wrap wrap unwrap) g id xs initial
where
wrap (Simple s) e k a = k $ s e a
unwrap u = Simple $ \e -> u e id
g x next acc = next $! f acc x
The effect of the worker-wrapper split here is to transform a continuation-passing recursion into a direct-style loop.
I believe foldrW
is expressive enough that it can be used to
efficiently define other types
of list consumers, e.g. mapM_
for the IO monad. I haven't implemented
it yet, though.
The idea of having a versatile fold primitive is generally useful,
not just in conjuction with list fusion. foldrW
(or its variant
based on foldMap
) would be useful in context of other container types,
e.g. maps, vectors and byte strings.
It is known that the above encoding of foldl'
in terms of foldr
can
be made allocation-free by eta-expanding the local function go
. So
one could argue that, if GHC can automatically eta-expand it, unmodified
fold/build fusion will be able to fuse foldl'
nicely, so there is no
need for foldrW/buildW.
My answer to this question is that eta-expansion alone is not enough for two reasons:
-
If the initial list producer is a function that traverses a tree, the resulting fused loop will be in continuation passing style, which means a new closure is allocated for each non-leaf node of the input tree. More specifically, consider the following function that flattens a tree into a list:
data Tree = Tip {-# UNPACK #-} !Int | Bin Tree Tree toList :: Tree -> [Int] toList tree = build (toListFB tree) {-# INLINE toList #-} toListFB :: Tree -> (Int -> r -> r) -> r -> r toListFB root cons nil = go root nil where go (Tip x) rest = cons x rest go (Bin x y) rest = go x (go y rest)
Let's say we want to eliminate the intermediate list in the expression
(sum (toList t))
. Currently sum is not a good consumer, but if it were, after fusion we'd get something like:sumList :: Tree -> Int sumList root = go0 root id 0 go0 :: Tree -> (Int -> Int) -> Int -> Int go0 (Tip x) k = (k $!) . (x+) go0 (Bin x y) k = go0 x (go0 y k)
Now, merely eta-expanding go0 is not enough to get efficient code, because the function will still build a partial application every time it sees a Bin constructor. For this recursion to work in an allocation-free way, it must be rather like:
go1 :: Tree -> Int -> Int go1 (Tip x) n = x + n go1 (Bin x y) n = go1 y (go1 x n)
And this is what we get if we define foldl' and toList in terms of
foldrW
andbuildW
. -
GHC would eta-expand a function only when it knows for sure that eta-expansion will not duplicate work. For
foldl'
this is not a problem, but there are other list consumers for which this analysis is not locally possible. For example, considermconcat
, specialized for bytestring builder:mconcat :: [Builder] -> Builder mconcat . map fromInt :: [Int] -> Builder
where fromInt is a functin that serializes a value of Int, and Builder is a newtype of some function type. Here, it's often possible to get a big performance boost by allowing
fromInt
to be executed every time the resulting the builder is called (as a function), rather than just once. The user may or may not want this. In cases like this, it would be difficult for ghc to generate code the user expects, because (1) it's not possible to tell the resulting function is run only once by analyzing the code locally, and (2) the user may be willing to actually repeat the call tofromInt
.foldrW
andbuildW
would give more control to the user in such cases.
I believe basically all of the current good consumer or producer will continue
to fuse well under the new foldrW
/buildW
framework, because foldr
can
be defined in terms of foldrW
(by passing a trivial wrapper) and build
can
be defined in terms of buildW
(by ignoring the given wrapper). However I'll
need to actually replace the whole set of list functions using foldrW/buildW
to see if this is true.
As a preliminary evidence, here is a criterion benchmark that compares
foldrW/buildW
against fold/build
in a number of example scenarios:
http://htmlpreview.github.io/?https://github.com/takano-akio/ww-fusion/blob/master/fusion.html
If you have a list produer defined using build
, it needs to be rewritten
using buildW
in order to take advantage of the proposed fusion scheme.
Here I describe how to do such a rewrite.
A good produer in the fold/build framework shold look like this:
{-# LANGUAGE ScopedTypeVariables #-}
producer :: ... -> [ELEM]
producer ... = build (producerFB ...)
producerFB :: forall r. ... -> (ELEM -> r -> r) -> r -> r
producerFB ... = go ...
where
go :: ... -> r
go ... = ... -- usually recursive
I use a version of enumFromTo
(specialized for Ints) as an example.
eft :: Int -> Int -> [Int]
eft from to = build (eftFB from to)
eftFB :: forall r. Int -> Int -> (Int -> r -> r) -> r -> r
eftFB from to cons nil = go from
where
go :: Int -> r
go !i = if i <= to
then cons i (go (i + 1))
else nil
First, the type of go
needs to be changed to
go :: A -> r -> r
for some type A, where the second argument (of type r
) represents the rest of
the list. This change involves the following two steps:
- explicitly pass around a 'rest' parameter, if it's not being done already.
- combine all the other arguments into one, using a tuple or () if needed.
In the eft
example it looks like this:
eftFB :: forall r. Int -> Int -> (Int -> r -> r) -> r -> r
eftFB from to cons nil = go from nil
where
go :: Int -> r -> r
go !i rest = if i <= to
then cons i (go (i + 1) rest)
else rest
Now it's straightforward to rewrite the producer function using buildW
:
- Replace
build
withbuildW
- The
FB
function takes an extra argument,(Wrap wrap unwrap)
. - Change the type of
go
from (A -> r -> r) to (f A). - Replace each ocurrence of
go
withwrap go
. - Replace the definition of
go
, (go ... = ...) with (go = wrap $ \... -> ...).
The eft
example now looks like this:
eft :: Int -> Int -> [Int]
eft from to = buildW (eftFB from to)
eftFB
:: forall r f
. Int -> Int -> Wrap f r -> (Int -> r -> r) -> r -> r
eftFB from to (Wrap wrap unwrap) cons nil = wrap go from nil
where
go :: f Int
go = unwrap $ \ !i rest -> if i <= to
then cons i (wrap go (i + 1) rest)
else rest
There seem to be many ways to extend fold/build fusion to get a similar
expressivity to foldrW/buildW
. One of them is particularly attractive because
it can be described with just those building blocks that many Haskell
programmers are familiar with.
It uses the following pair of functions in place of foldr and build.
mapA :: (ArrowApply a) => a b () -> [b] -> a () ()
buildA :: (forall a. (ArrowApply a) => a b () -> a () ()) -> [b]
However it seems like the GHC optimizer and/or the Arrow
class need to be
tweaked in order for this framework to work efficiently.