Open-source software name: andrewcooke/ParserCombinator.jl
Open-source software URL: https://github.com/andrewcooke/ParserCombinator.jl
Open-source programming language: Julia 100.0%
Open-source software description:

ParserCombinator

A parser combinator library for Julia, similar to those in other languages, like Haskell's Parsec or Python's pyparsing. It can parse any iterable type (not just strings) (except for regexp matchers, of course).

ParserCombinator's main advantage is its flexible design, which separates the matchers from the evaluation strategy. This makes it easy to "plug in" memoization, or debug traces, or to restrict backtracking in a similar way to Parsec, all while using the same grammar.

It also contains pre-built parsers for Graph Modelling Language and DOT.

Example

using ParserCombinator
# the AST nodes we will construct, with evaluation via calc()
abstract type Node end
Base.:(==)(n1::Node, n2::Node) = n1.val == n2.val
calc(n::Float64) = n
struct Inv<:Node val end
calc(i::Inv) = 1.0/calc(i.val)
struct Prd<:Node val end
calc(p::Prd) = Base.prod(map(calc, p.val))
struct Neg<:Node val end
calc(n::Neg) = -calc(n.val)
struct Sum<:Node val end
calc(s::Sum) = Base.sum(map(calc, s.val))
# the grammar (the combinators!)
sum = Delayed()
val = E"(" + sum + E")" | PFloat64()
neg = Delayed() # allow multiple (or no) negations (eg ---3)
neg.matcher = val | (E"-" + neg > Neg)
mul = E"*" + neg
div = E"/" + neg > Inv
prd = neg + (mul | div)[0:end] |> Prd
add = E"+" + prd
sub = E"-" + prd > Neg
sum.matcher = prd + (add | sub)[0:end] |> Sum
all = sum + Eos()
# and test
# this prints 2.5
calc(parse_one("1+2*3/4", all)[1])
# this prints [Sum([Prd([1.0]),Prd([2.0])])]
parse_one("1+2", all)
Some explanation of the above:
And it supports packrat parsing too (more exactly, it can memoize results to avoid repeating matches).

Still, for large parsing tasks (e.g. parsing source code for a compiler) it would probably be better to use a wrapper around an external parser generator, like ANTLR.

Note: There's an issue with the Compat library which means the code above (the assignment to

Install

julia> using Pkg
julia> Pkg.add("ParserCombinator")

Manual

Evaluation

Once you have a grammar (see below) you can evaluate it against some input in various ways:

These are all implemented by providing different

Basic Matchers

In what follows, remember that the power of parser combinators comes from how you combine these. They can all be nested, refer to each other, and so on.

Equality

julia> parse_one("abc", Equal("ab"))
1-element Array{Any,1}:
"ab"
julia> parse_one("abc", Equal("abx"))
ERROR: ParserCombinator.ParserException("cannot parse")

This is so common that there's a corresponding string literal (it's "e" for Equal(), the corresponding matcher).

julia> parse_one("abc", e"ab")
1-element Array{Any,1}:
"ab"

Sequences

Matchers return lists of values. Multiple matchers can return lists of lists, or the results can be "flattened" a level (usually more useful):

julia> parse_one("abc", Series(Equal("a"), Equal("b")))
2-element Array{Any,1}:
"a"
"b"
julia> parse_one("abc", Series(Equal("a"), Equal("b"); flatten=false))
2-element Array{Any,1}:
Any["a"]
Any["b"]
julia> parse_one("abc", Seq(Equal("a"), Equal("b")))
2-element Array{Any,1}:
"a"
"b"
julia> parse_one("abc", And(Equal("a"), Equal("b")))
2-element Array{Any,1}:
Any["a"]
Any["b"]
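The two styles can be nested to keep related values grouped. The following is a minimal sketch that uses only the matchers shown so far; the input string and the variable names are illustrative and not taken from the library's documentation.

```julia
using ParserCombinator

# Seq() flattens the results of its children into one list, while And() keeps
# each child's result as its own sub-list. Nesting a Seq() inside an And()
# therefore groups "a" and "b" together and keeps "c" apart.
pair    = Seq(Equal("a"), Equal("b"))
grouped = And(pair, Equal("c"))

result = parse_one("abc", grouped)
println(result)
```

Naming the inner matcher first also avoids relying on operator precedence once the sugared + and & forms are used.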
julia> parse_one("abc", e"a" + e"b")
2-element Array{Any,1}:
"a"
"b"
julia> parse_one("abc", e"a" & e"b")
2-element Array{Any,1}:
Any["a"]
Any["b"]

Here Seq() and + give the flattened result of Series(), while And() and & give the unflattened result of Series(...; flatten=false).

Warning - The sugared syntax has to follow standard operator precedence, where & binds more tightly than +. This means that

matcher1 + matcher2 & matcher3

is almost always an error because it means:

matcher1 + (matcher2 & matcher3)

while what was intended was:

(matcher1 + matcher2) & matcher3

Empty Values

Often, you want to match something but then discard it. An empty (or discarded) value is an empty list. This may help explain why I said flattening lists was useful above.

julia> parse_one("abc", And(Drop(Equal("a")), Equal("b")))
2-element Array{Any,1}:
Any[]
Any["b"]
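To see why flattening matters once values are dropped, the sketch below runs the same three matchers through And() and Seq(). The "width=42" input and the variable names are assumptions made for illustration, and passing three matchers to And() and Seq() assumes they accept any number of arguments.

```julia
using ParserCombinator

# Illustrative input: a key, a separator that should be discarded, and a value.
key   = Equal("width")
sep   = Drop(Equal("="))
value = Equal("42")

# Without flattening, the dropped separator still occupies a slot as an empty list.
nested = parse_one("width=42", And(key, sep, value))

# With flattening, the empty list contributes nothing to the result.
flat = parse_one("width=42", Seq(key, sep, value))

println(nested)
println(flat)
```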
julia> parse_one("abc", Seq(Drop(Equal("a")), Equal("b")))
1-element Array{Any,1}:
"b"
julia> parse_one("abc", ~e"a" + e"b")
1-element Array{Any,1}:
"b"
julia> parse_one("abc", E"a" + e"b")
1-element Array{Any,1}:
"b"

Note the ~ and the capital E in the last two examples, respectively.

Alternates

julia> parse_one("abc", Alt(e"x", e"a"))
1-element Array{Any,1}:
"a"
julia> parse_one("abc", e"x" | e"a")
1-element Array{Any,1}:
"a"

Warning - The sugared syntax has to follow standard operator precedence, where | and + have the same precedence and group from left to right. This means that

matcher1 | matcher2 + matcher3

is almost always an error because it means:

(matcher1 | matcher2) + matcher3

while what was intended was:

matcher1 | (matcher2 + matcher3)

Regular Expressions

julia> parse_one("abc", Pattern(r".b."))
1-element Array{Any,1}:
"abc"
julia> parse_one("abc", p".b.")
1-element Array{Any,1}:
"abc"
julia> parse_one("abc", P"." + p"b.")
1-element Array{Any,1}:
"bc"

As with equality, a capital prefix to the string literal ("p" for "pattern", by the way) implies that the value is dropped.

Note that regular expressions do not backtrack. A typical, greedy regular expression will match as much of the input as possible, every time that it is used. Backtracking only exists within the library matchers (which can duplicate regular expression functionality, when needed).

Repetition

julia> parse_one("abc", Repeat(p"."))
3-element Array{Any,1}:
"a"
"b"
"c"
julia> parse_one("abc", Repeat(p".", 2))
2-element Array{Any,1}:
"a"
"b"
julia> collect(parse_all("abc", Repeat(p".", 2, 3)))
2-element Array{Any,1}:
Any["a","b","c"]
Any["a","b"]
julia> parse_one("abc", Repeat(p".", 2; flatten=false))
2-element Array{Any,1}:
Any["a"]
Any["b"]
julia> collect(parse_all("abc", Repeat(p".", 0, 3)))
4-element Array{Any,1}:
Any["a","b","c"]
Any["a","b"]
Any["a"]
Any[]
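The earlier note that regular expressions do not backtrack, while library matchers do, can be seen by writing "any characters, then c" in two ways. This is a sketch under the behaviour described above; the grammar names are illustrative.

```julia
using ParserCombinator

# A single greedy regex takes all of "abc" in one step and cannot give any of
# it back, so the following e"c" has nothing left to match.
regex_only = p".*" + e"c"

# Repeating a one-character pattern with a library matcher lets the parser
# back off one character at a time until e"c" can match.
with_repeat = p"."[0:end] + e"c"

regex_failed = try
    parse_one("abc", regex_only)
    false
catch err
    # Only a parse failure counts here; anything else is re-raised.
    err isa ParserCombinator.ParserException || rethrow()
    true
end

result = parse_one("abc", with_repeat)
println(regex_failed, " ", result)
```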
julia> collect(parse_all("abc", Repeat(p".", 0, 3; greedy=false)))
4-element Array{Any,1}:
Any[]
Any["a"]
Any["a","b"]
Any["a","b","c"]

You can also use

The sugared version looks like this:

julia> parse_one("abc", p"."[1:2])
2-element Array{Any,1}:
"a"
"b"
julia> parse_one("abc", p"."[1:2,:?])
1-element Array{Any,1}:
"a"
julia> parse_one("abc", p"."[1:2,:&])
2-element Array{Any,1}:
Any["a"]
Any["b"]
julia> parse_one("abc", p"."[1:2,:&,:?])
1-element Array{Any,1}:
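Any["a"]

Where the :? symbol corresponds to greedy=false and :& to flatten=false (compare the results above with the full Repeat() versions).

There are also some well-known special cases:

julia> collect(parse_all("abc", Plus(p".")))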
3-element Array{Any,1}:
Any["a","b","c"]
Any["a","b"]
Any["a"]
julia> collect(parse_all("abc", Star(p".")))
4-element Array{Any,1}:
Any["a","b","c"]
Any["a","b"]
Any["a"]
Any[]

Full Match

To ensure that all the input is matched, add Eos() to the end of the grammar:

julia> parse_one("abc", Equal("abc") + Eos())
1-element Array{Any,1}:
"abc"
julia> parse_one("abc", Equal("ab") + Eos())
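ERROR: ParserCombinator.ParserException("cannot parse")

Transforms

Use App() or > to pass the matched values to a function (or datatype constructor) as individual arguments:

julia> parse_one("abc", App(Star(p"."), tuple))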
1-element Array{Any,1}:
("a","b","c")
julia> parse_one("abc", Star(p".") > string)
1-element Array{Any,1}:
"abc"

The action of Appl() and |> is similar, but the matched values are passed as a single argument (a list):

julia> struct Node children end
julia> parse_one("abc", Appl(Star(p"."), Node))
1-element Array{Any,1}:
Node(Any["a","b","c"])
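Because > passes each matched value as a separate argument, it also works with a small anonymous function. The sketch below assumes an illustrative input of two integers separated by a comma; the grammar name is hypothetical.

```julia
using ParserCombinator

# PInt() yields an integer, and E"," matches the comma but drops it, so the
# function after > receives exactly two arguments. The anonymous function is
# parenthesised to make the grouping explicit.
add_pair = PInt() + E"," + PInt() > ((a, b) -> a + b)

result = parse_one("3,4", add_pair)
println(result)
```

As with > string above, the function's return value becomes the single result of the match.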
julia> parse_one("abc", Star(p".") |> x -> map(uppercase, x))
3-element Array{Any,1}:
"A"
"B"
"C"

Lookahead And Negation

Sometimes you can't write a clean grammar that just consumes data: you need to check ahead to avoid something. Or you need to check ahead to make sure something works a certain way.

julia> parse_one("12c", Lookahead(p"\d") + PInt() + Dot())
2-element Array{Any,1}:
12
'c'
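A lookahead can also act as a guard in front of a broader matcher. The following sketch assumes an illustrative rule, "capture a lowercase word only if it starts with a", and relies on the behaviour shown above, where Lookahead() checks the input without consuming it or adding a value.

```julia
using ParserCombinator

# The lookahead tests for a leading "a"; the pattern then captures the whole
# word, including that first letter.
word_from_a = Lookahead(e"a") + p"[a-z]+"

accepted = parse_one("abc", word_from_a)

# A word starting with another letter should fail the guard, which surfaces
# as a ParserException from parse_one.
rejected = try
    parse_one("xbc", word_from_a)
    false
catch err
    err isa ParserCombinator.ParserException || rethrow()
    true
end

println(accepted, " ", rejected)
```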
julia> parse_one("12c", Not(Lookahead(p"[a-z]")) + PInt() + Dot())
2-element Array{Any,1}:
12
'c'

More generally,

Other

Backtracking

By default, matchers will backtrack as necessary.

In some (unusual) cases, it is useful to disable backtracking. For example, see PCRE's "possessive" matching. This can be done here on a case-by-case basis by adding ! to the matcher name. For example,

collect(parse_all("123abc", Seq!(p"\d"[0:end], p"[a-z]"[