In the few last months I've been busy coming up with techniques mastering the
gdiff
Haskell library, even for use cases it was not originally designed for.
This is mostly a brain dump of some approaches I have learnt while tinkering and trying to solve certain problems.
1) Polymorphic
diff
: Eelco Lempsink writes in the conclusion of his
2010 thesis «Furthermore, we can not encode polymorphic datatypes such as lists, but need to write specific encodings for each type we want to include». In the
library documentation he says «It might require another master thesis project to solve this». However in the last days I developed a trick, which does allow me to describe polymorphic data types, such as
[a]
or
Maybe a
. I just needed to add these lines:
in the family GADT:
ListNil :: List Fam as → Fam a as → Fam [a] Nil
ListCons :: List Fam as → Fam a as → Fam [a] (a `Cons` [a] `Cons` Nil)
then define a
Type
instance:
instance Type Fam a ⇒ Type Fam [a] where
constructors = head [Concr (ListNil cc) | Concr cc Concr cc ← constructors] : [head [Concr (ListCons cc) | Concr cc ← constructors]]
What is this doing? It picks two random constructors from the underlying data type and wraps them with
ListNil
and
ListCons
. We usually ignore what is wrapped, with one exception: in
decEq
one needs to ensure that the constructors' arguments are compared too, otherwise one cannot finish the type equality proof.
2) Non-moving pairs: In the usual textual
diff
algorithm the lines may move around a bit to make room for insertions, etc. This is normally the case for
gdiff
too. I have seen
gdiff
reusing values (with
Cpy
) when comparing
(True, False)
with
(False, True)
. But sometimes this is not desired at all. I figured out that "inlining" the leaves into the pair's description (i.e. instead of
(a `Cons` b `Cons` Nil)
writing
appendList
on the field decompositions of
a
and
b
does the trick.
3) Location types: I found a way to equip
Abstr
constructor descriptors with type class dictionaries. It is similar to the lifting approach shown under 1), but much more involved.
The idea is to wrap a location-aware descriptor with a descriptor that is location-blind, i.e. that hides the location parameter:
Loc' :: KnownNat n ⇒ Fam (Loc n Bool) …
Then we need the wrapper:
Hidden' :: KnownNat n ⇒ Fam (Loc n Bool) ts → Fam (Hidden Loc) ts
Giving the
Type
instance for the latter is tricky, as
Abstr
alone has no provision for dodging the class dictionary into
Hidden'
, so I had to write a
locAware
smart constructor to syphon it through the
Hidden'
wrapper.
This also stressed the compiler to its limits in the advanced
PolyKind
-ed approach I needed, so I filed a
GHC bug #9725. There is also some code showing how the
Hidden Loc
is unwrapped and the dictionary-passing functions is installed into the
Abstr
.
4) Monadic actions coming out of
patch
: Here another wrapping approach is needed, but this time we need a sum type for
diff
and
patch
so that we can enter with a pure value at the
Left
and obtain a
Right
action back. This is the most involved approach. I can probably blog another time about it.