Since this code is written with Apache Beam, the symbols you are asking about are ordinary Python operators that Beam overloads for building pipelines.
| is the pipe operator. It applies the transform on its right to whatever is on its left, which can be the pipeline itself or the output of an earlier step. In your example, p is the input for lines = p | ReadFromText(known_args.input), and lines is in turn the input for:
    counts = (
        lines
        | 'Split' >> (
            beam.FlatMap(lambda x: re.findall(r'[A-Za-z\']+', x))
            .with_output_types(unicode))
        | 'PairWithOne' >> beam.Map(lambda x: (x, 1))
        | 'GroupAndSum' >> beam.CombinePerKey(sum))
>> attaches a name to a transform, so that the step is easy to identify in the UI. In your example, 'GroupAndSum' >> beam.CombinePerKey(sum) names the combine step GroupAndSum, and 'Split' and 'PairWithOne' name their steps in the same way.
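To see why this syntax is valid Python at all, here is a minimal sketch of the operator overloading involved. ToyTransform and ToyCollection are hypothetical stand-ins invented for this illustration, not Beam classes, and the helper functions are made up too. The sketch only assumes that | is handled by an __or__ method on the left operand and that 'label' >> transform is handled by an __rrshift__ method on the transform. Because >> binds more tightly than | in Python, each name is attached to its transform before the transform is applied.

```python
class ToyTransform:
    """Hypothetical stand-in for a transform: wraps a function over a list."""

    def __init__(self, fn, label=None):
        self.fn = fn
        self.label = label or fn.__name__

    def __rrshift__(self, label):
        # Handles 'SomeLabel' >> transform. str defines no __rshift__,
        # so Python falls back to the right operand's __rrshift__.
        return ToyTransform(self.fn, label)


class ToyCollection:
    """Hypothetical stand-in for a collection: holds items and applied labels."""

    def __init__(self, items, applied=()):
        self.items = list(items)
        self.applied = list(applied)

    def __or__(self, transform):
        # Handles collection | transform and returns a new collection.
        return ToyCollection(transform.fn(self.items),
                             self.applied + [transform.label])


def split_words(lines):
    return [word for line in lines for word in line.split()]


def pair_with_one(words):
    return [(word, 1) for word in words]


def sum_per_key(pairs):
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


lines = ToyCollection(["to be or not to be"])
counts = (
    lines
    | 'Split' >> ToyTransform(split_words)
    | 'PairWithOne' >> ToyTransform(pair_with_one)
    | 'GroupAndSum' >> ToyTransform(sum_per_key))

print(counts.applied)  # ['Split', 'PairWithOne', 'GroupAndSum']
print(counts.items)    # [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```

Unlike this toy version, a real Beam pipeline is built lazily and only executes when it is run, so treat the sketch as an explanation of the syntax and not of Beam's execution model.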
See the documentation linked by @Klaus D. in the comments for more detail.