Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

apache kafka streams - What are the differences between KTable vs GlobalKTable and leftJoin() vs outerJoin()?

In the Kafka Streams library, what is the difference between KTable and GlobalKTable?

Also, the KStream class has two methods, leftJoin() and outerJoin(). What is the difference between them?

I read up on KStream.leftJoin, but could not pin down the exact difference.

1 Reply

KTable VS GlobalKTable

A KTable shards the data across all running Kafka Streams instances, while a GlobalKTable keeps a full copy of all data on each instance. The disadvantage of a GlobalKTable is that it needs more memory. The advantage is that a KStream-GlobalKTable join can use a non-key attribute of the stream record as the join attribute. A KStream-KTable join on a non-key stream attribute is only possible by extracting the join attribute and setting it as the key before the join, which adds a repartitioning step for the stream before the join can be computed.

Note, though, that there is also a semantic difference. For a stream-table join, Kafka Streams aligns record processing based on record timestamps, so updates to the table are aligned with the records of your stream. For a GlobalKTable there is no time synchronization, so updates to the GlobalKTable are completely decoupled from the processing of the stream records (thus, you get weaker semantics).

For further details, see KIP-99: Add Global Tables to Kafka Streams.
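
To make the sharding point concrete, here is a minimal sketch in plain Java. It is a toy model, not the Kafka Streams API: the class `TableShardingSketch`, the two-instance setup, the hash-based `instanceFor` partitioner, and the sample keys are all assumptions made for illustration. It only shows why a lookup by a non-key attribute can miss on a sharded (KTable-like) store, always succeeds on a replicated (GlobalKTable-like) store, and succeeds on the sharded store once the stream record is re-keyed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only; none of these names belong to Kafka Streams.
final class TableShardingSketch {
    static final int INSTANCES = 2; // assumed number of running instances

    // Hypothetical partitioner: decides which instance owns a key.
    static int instanceFor(String key) {
        return Math.floorMod(key.hashCode(), INSTANCES);
    }

    // KTable-like: each instance stores only the entries it owns.
    static List<Map<String, String>> shard(Map<String, String> table) {
        List<Map<String, String>> shards = new ArrayList<>();
        for (int i = 0; i < INSTANCES; i++) {
            shards.add(new HashMap<>());
        }
        table.forEach((k, v) -> shards.get(instanceFor(k)).put(k, v));
        return shards;
    }

    // GlobalKTable-like: every instance stores a full copy.
    static List<Map<String, String>> replicate(Map<String, String> table) {
        List<Map<String, String>> copies = new ArrayList<>();
        for (int i = 0; i < INSTANCES; i++) {
            copies.add(new HashMap<>(table));
        }
        return copies;
    }

    // A stream record keyed by streamKey is processed on instanceFor(streamKey);
    // the join can only see that instance's local store.
    static String lookup(List<Map<String, String>> stores, String streamKey, String lookupKey) {
        return stores.get(instanceFor(streamKey)).get(lookupKey);
    }

    public static void main(String[] args) {
        Map<String, String> customers = Map.of("c1", "Alice", "c2", "Bob");
        List<Map<String, String>> shards = shard(customers);
        List<Map<String, String>> copies = replicate(customers);

        // Order "o1" carries customer id "c2" as a non-key attribute.
        System.out.println(lookup(shards, "o1", "c2")); // miss: "c2" lives on the other instance
        System.out.println(lookup(copies, "o1", "c2")); // hit: full copy on every instance
        System.out.println(lookup(shards, "c2", "c2")); // hit after re-keying the record by "c2"
    }
}
```

The third lookup corresponds to the repartitioning step: once the record's key is the join attribute, it is routed to the instance that owns the matching table entry.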

leftJoin() VS outerJoin()

Left and outer joins work like a left outer join and a full outer join in a database, respectively.

For a left outer join, you might "lose" data from your right input stream if a right-side record has no match on the left-hand side.

For a (full) outer join, no data is dropped, and each input record of both streams appears in the result stream.
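
The difference in which records survive can be sketched in plain Java. This is a simplified model and not the Kafka Streams API: real KStream-KStream joins are windowed and emit results record by record, while this sketch joins two small key-value maps in one pass. The class `JoinSemanticsSketch`, the `"+"` value joiner, and the sample data are assumptions for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Simplified model of join semantics; not the Kafka Streams API.
final class JoinSemanticsSketch {

    // Left join: one result per left record; a missing right side shows up as null.
    // Right records without a left match never appear in the result.
    static Map<String, String> leftJoin(Map<String, String> left, Map<String, String> right) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            result.put(e.getKey(), e.getValue() + "+" + right.get(e.getKey()));
        }
        return result;
    }

    // Outer join: one result per key seen on either side; the missing side is null.
    static Map<String, String> outerJoin(Map<String, String> left, Map<String, String> right) {
        TreeSet<String> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());
        Map<String, String> result = new TreeMap<>();
        for (String k : keys) {
            result.put(k, left.get(k) + "+" + right.get(k));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> left = Map.of("a", "L1", "b", "L2");
        Map<String, String> right = Map.of("b", "R2", "c", "R3");

        System.out.println(leftJoin(left, right));  // {a=L1+null, b=L2+R2}
        System.out.println(outerJoin(left, right)); // {a=L1+null, b=L2+R2, c=null+R3}
    }
}
```

Key "c" exists only on the right side, so it is absent from the left join result and present (with a null left value) in the outer join result.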

