List-like types supporting O(1) append and snoc operations.
Installation
dlist is a Haskell package available from Hackage.
It can be installed with cabal or stack.
See the change log for the changes in each version.
Usage
Here is an example of “flattening” a Tree into a list of the elements in its
Leaf constructors:
importqualifiedData.DListasDListdataTreea=Leafa | Branch (Treea) (Treea)
flattenSlow::Treea-> [a]
flattenSlow = go
where
go (Leaf x) = [x]
go (Branch left right) = go left ++ go right
flattenFast::Treea-> [a]
flattenFast =DList.toList . go
where
go (Leaf x) =DList.singleton x
go (Branch left right) = go left `DList.append` go right
flattenSlow is likely to be slower than flattenFast:
flattenSlow uses ++ to concatenate lists, each of which is recursively
constructed from the left and rightTree values in the Branch
constructor.
flattenFast does not use ++ but constructs a composition of functions,
each of which is a “cons” introduced by DList.singleton ((x :)). The
function DList.toList applies the composed function to [], constructing
a list in the end.
To see the difference between flattenSlow and flattenFast, consider some
rough evaluations of the functions applied to a Tree:
flattenSlow (Branch (Branch (Leaf'a') (Leaf'b')) (Leaf'c'))
= go (Branch (Branch (Leaf'a') (Leaf'b')) (Leaf'c'))
= go (Branch (Leaf'a') (Leaf'b')) ++ go (Leaf'c')
= (go (Leaf'a') ++ go (Leaf'b')) ++"c"= ("a"++"b") ++"c"= ('a':[]++"b") ++"c"= ('a':"b") ++"c"='a':"b"++"c"='a':'b':[]++"c"='a':'b':"c"
The left-nested ++ in flattenSlow results in intermediate list constructions
that are immediately discarded in the evaluation of the outermost ++. On the
other hand, the evaluation of flattenFast involves no intermediate list
construction but rather function applications and newtype constructor wrapping
and unwrapping. This is where the efficiency comes from.
Warning! Note that there is truth in the above, but there is also a lot of
hand-waving and intrinsic complexity. For example, there may be GHC rewrite
rules that apply to ++, which will change the actual evaluation. And, of
course, strictness, laziness, and sharing all play a significant role. Also, not
every function in the dlist package is the most efficient for every situation.
Moral of the story: If you are using dlist to speed up your code, check
to be sure that it actually does. Benchmark!
Design Notes
These are some notes on design and development choices made for the dlist
package.
Avoid ++
The original intent of Hughes' representation of lists as first-class functions
was to provide an abstraction such that the list append operation found in
functional programming languages (and now called ++ in Haskell) would not
appear in left-nested positions to avoid duplicated structure as lists are
constructed. The lesson learned by many people using list over the years is that
the append operation can appear, sometimes surprisingly, in places they don't
expect it.
One of our goals is for the dlist package to avoid surprising its users with
unexpected insertions of ++. Towards this end, there should be a minimal set
of functions in dlist in which ++ can be directly or indirectly found. The
list of known uses of ++ includes:
If any future requested functions involve ++ (e.g. via fromList), the burden
of inclusion is higher than it would be otherwise.
Abstraction
The DList representation and its supporting functions (e.g. append, snoc,
etc.) rely on an invariant to preserve its safe use. That is, without this
invariant, a user may encounter unexpected outcomes.
(We use safety in the sense that the semantics are well-defined and expected,
not in the sense of side of referential transparency. The invariant does not
directly lead to side effects in the dlist package, but a program that uses an
unsafely generated DList may do something surprising.)
The invariant is that, for any xs :: DList a:
fromList (toList xs) = xs
To see how this invariant can be broken, consider this example:
It would be rather unhelpful and surprising to find (xs `snoc` 1) turned out
to be the empty list.
To preserve the invariant on DList, we provide it as an abstract type in the
Data.DList module. The constructor, UnsafeDList, and record label,
unsafeApplyDList, are not exported because these can be used, as shown above,
to break the invariant.
All of that said, there have been numerous requests to export the DList
constructor. We are not convinced that it is necessary, but we are convinced
that users should decide for themselves.
To use the constructor and record label of DList, you import them as follows:
If you are using Safe Haskell, you may need to add this at the top of your
module:
{-# LANGUAGE Trustworthy #-}
Just be aware that the burden of proof for safety is on you.
References
These are various references where you can learn more about difference lists.
Research
A novel representation of lists and its application to the function
“reverse.” John Hughes. Information Processing Letters. Volume 22, Issue 3.
1986-03. Pages 141-144. PDF
This is the original published source for a representation of lists as
first-class functions.
请发表评论