在线时间:8:00-16:00
迪恩网络APP
随时随地掌握行业动态
扫描二维码
关注迪恩网络微信公众号
开源软件名称:BioJulia/Automa.jl开源软件地址:https://github.com/BioJulia/Automa.jl开源编程语言:Julia 100.0%开源软件介绍:Automa.jlA Julia package for text validation, parsing, and tokenizing based on state machine compiler. Automa.jl compiles regular expressions into Julia code, which is then compiled into low-level machine code by the Julia compiler. Automa.jl is designed to generate very efficient code to scan large text data, which is often much faster than handcrafted code. Automa.jl can insert arbitrary Julia code that will be executed in state transitions. This makes it possible, for example, to extract substrings that match a part of a regular expression. This is a number literal tokenizer using Automa.jl (numbers.jl): # A tokenizer of octal, decimal, hexadecimal and floating point numbers
# =====================================================================
import Automa
import Automa.RegExp: @re_str
const re = Automa.RegExp
# Describe patterns in regular expression.
oct = re"0o[0-7]+"
dec = re"[-+]?[0-9]+"
hex = re"0x[0-9A-Fa-f]+"
prefloat = re"[-+]?([0-9]+\.[0-9]*|[0-9]*\.[0-9]+)"
float = prefloat | re.cat(prefloat | re"[-+]?[0-9]+", re"[eE][-+]?[0-9]+")
number = oct | dec | hex | float
numbers = re.cat(re.opt(number), re.rep(re" +" * number), re" *")
# Register action names to regular expressions.
number.actions[:enter] = [:mark]
oct.actions[:exit] = [:oct]
dec.actions[:exit] = [:dec]
hex.actions[:exit] = [:hex]
float.actions[:exit] = [:float]
# Compile a finite-state machine.
machine = Automa.compile(numbers)
# This generates a SVG file to visualize the state machine.
# write("numbers.dot", Automa.machine2dot(machine))
# run(`dot -Tpng -o numbers.png numbers.dot`)
# Bind an action code for each action name.
actions = Dict(
:mark => :(mark = p),
:oct => :(emit(:oct)),
:dec => :(emit(:dec)),
:hex => :(emit(:hex)),
:float => :(emit(:float)),
)
# Generate a tokenizing function from the machine.
context = Automa.CodeGenContext()
@eval function tokenize(data::String)
tokens = Tuple{Symbol,String}[]
mark = 0
$(Automa.generate_init_code(context, machine))
p_end = p_eof = lastindex(data)
emit(kind) = push!(tokens, (kind, data[mark:p-1]))
$(Automa.generate_exec_code(context, machine, actions))
return tokens, cs == 0 ? :ok : cs < 0 ? :error : :incomplete
end
tokens, status = tokenize("1 0x0123BEEF 0o754 3.14 -1e4 +6.022045e23") This emits tokens and the final status:
The compiled deterministic finite automaton (DFA) looks like this: For more details, see fasta.jl and read the docs page. |
2023-10-27
2022-08-15
2022-08-17
2022-09-23
2022-08-13
请发表评论