Haskell - Struggling with Applicative parsing
So I'm having a go at writing a complex parser, using Applicative (the parser in question doesn't implement Monad at all).
For trivial parsers, this is quite easy. For non-trivial ones... not so much. The Applicative interface seems to violently force you to write in a point-free style, which I find extremely difficult to deal with.
Consider, for example:
call = do
  n  <- name
  char '('
  trim
  as <- sepBy argument (char ',' >> trim)
  char ')'
  trim
  char '='
  trim
  r  <- result
  return $ Call {name = n, args = as, result = r}
Now let's try to write that using Applicative:
call =
  (\n _ _ as _ _ _ _ r -> Call {name = n, args = as, result = r}) <$>
  name <*>
  char '(' <*>
  trim <*>
  sepBy argument (const (const ()) <$> char ',' <*> trim) <*>
  char ')' <*>
  trim <*>
  char '=' <*>
  trim <*>
  result
Applicative has forced me to put the variable bindings far away from the parser they actually belong to. (E.g., try to confirm that as is bound to sepBy argument ...; it's not at all easy to verify that you haven't got the wrong count of _ patterns!)
Another unintuitive thing is that <*> applies a function to a value, whereas *> and <* are pure sequencing. It took me ages to wrap my mind around that. Different method names would have made this far, far clearer. (But Monad seems to have grabbed >> and <<, sadly.) It also seems these can be stacked, yielding things like
exit = "exit:" *> trim *> name <* char '(' <* trim <* char ')' <* trim
It's rather non-obvious that you can even do this. And, to me, the code isn't terribly readable. More importantly, I still haven't figured out how to collect multiple values while dropping multiple other values.
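For reference, the standard types do make the distinction explicit, even if the operator symbols don't:

(<$>) :: Functor f     => (a -> b) -> f a -> f b    -- apply a pure function to a result
(<*>) :: Applicative f => f (a -> b) -> f a -> f b  -- apply a "wrapped" function to a result
(*>)  :: Applicative f => f a -> f b -> f b         -- sequence two actions, keep the right result
(<*)  :: Applicative f => f a -> f b -> f a         -- sequence two actions, keep the left result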
All in all, I find myself wishing I could use do-notation! I don't need to change effects based on prior results; I don't need the full power of Monad. I just want the more readable notation. (I keep wondering whether it would be feasible to implement this; can you tell syntactically when a particular do-block can be mechanically transformed into Applicative?)
Does anybody know of a way around these problems? Most particularly, how can I move the variable bindings closer to the parsers they bind to?
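(As an aside on that last parenthetical: the transformation is mechanical enough that GHC later added it as the ApplicativeDo extension. A minimal sketch, assuming the same name, trim, argument, result and sepBy parsers as above:)

{-# LANGUAGE ApplicativeDo #-}

-- With ApplicativeDo, a do-block whose bound results are only used in the
-- final return/pure desugars to <$>/<*>, so an Applicative-only parser works.
call = do
  n  <- name
  char '('
  trim
  as <- sepBy argument (char ',' *> trim)
  char ')'
  trim
  char '='
  trim
  r  <- result
  pure Call {name = n, args = as, result = r}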
I don't have a solution to your problem, but perhaps an intuition that might help you build Applicative parsers more easily. When it comes to Applicative, there are two kinds of "sequencing" that need to be taken into consideration:
- The sequencing of the parsing operations: this determines the order in which you write the parsers.
- The sequencing of the underlying values: this one is more flexible, since you can combine the values in any order you like (see the sketch just below).
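A tiny sketch of the difference, using two hypothetical parsers p1 and p2: the parsers always run left to right, but the function that combines their results may use them in any order:

-- p1 runs before p2, yet its result ends up second in the pair
swapped = (\x y -> (y, x)) <$> p1 <*> p2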
When the two sequencings match each other well, the result is a nice and compact representation of the parser in Applicative notation. For example:
data Infix = Infix Double Operator Double

infixExpr = Infix <$> number <*> operator <*> number
The problem is when the two sequencings don't match exactly, and you have to massage the underlying values to make things work (you can't change the ordering of the parsers):
number = f <$> sign <*> decimal <*> exponent
  where
    f sign decimal exponent = sign * decimal * 10 ^^ exponent
Here, in order to compute the number we have to perform a nontrivial combination of the parsed values, which is done by the local function f.
Another typical situation is when we need to discard some value:
exponent = oneOf "eE" *> integer
Here, *> discards the value on the left and keeps the value on the right. The <* operator is the opposite: it discards the right and keeps the left. When you have a chain of such operations, you have to decode them using left-associativity:
p1 *> p2 <* p3 *> p4 <* p5 ≡ (((p1 *> p2) <* p3) *> p4) <* p5
This example is artificially contrived: you don't actually want to write such chains. It's better to break the expression into meaningful pieces (and preferably give them meaningful names). One common pattern you will see is:
-- discard the results of everything except p3
p1 *> p2 *> p3 <* p4 <* p5
There is a small caveat, though: if you want to apply something else to p3, or if p3 itself consists of multiple parts, you have to use parentheses:
-- applying a pure function
f <$> (p1 *> p2 *> p3 <* p4 <* p5)  ≡  p1 *> p2 *> (f <$> p3) <* p4 <* p5

-- p3 consists of multiple parts
p1 *> p2 *> (p3' <*> p3'') <* p4 <* p5
Again, in these situations it's better to break the expression into meaningful fragments and give them names.
Applicative notation, in a sense, forces you to divide a parser into logical chunks, which makes it easier to read, as opposed to monadic notation, where everything could conceivably end up in one monolithic block.
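As a concrete illustration of that last point, here is a sketch of how the call parser from the question could be broken into named pieces (assuming the same name, trim, argument, result and sepBy as above), so that each binding sits right next to the parser that produces it:

call = mkCall <$> name <*> argList <* equalsSign <*> result
  where
    mkCall n as r = Call {name = n, args = as, result = r}
    argList       = char '(' *> trim *> sepBy argument (char ',' *> trim)
                      <* char ')' <* trim
    equalsSign    = char '=' <* trim

Whether that reads better is a matter of taste, but now n, as and r line up directly with name, argList and result, and the discarded punctuation is hidden inside the named fragments.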