In this all too fabulous talk we will be addressing the wonderful and new wonders of KSQL vs. KStreams. If you are new-ish to Kafka… you may ask yourself, “What is a large Kafka deployment?” And you may tell yourself, “This is not my beautiful KSQL use case!” And you may tell yourself, “This is not my beautiful KStreams use case!” And you may ask yourself, “What is a beautiful Kafka use case?” And you may ask yourself, “Where does that stream process go to?” And you may ask yourself, “Am I right about this architecture? Am I wrong?” And you may say to yourself, “My God! What have I done?”
In this talk, we will discuss the following concepts:
1. KSQL Architecture
2. KSQL Use Cases
3. Performance Considerations
4. When to KSQL and When to Not
5. Introduce KStreams
What this talk is: You will understand the architecture and the power of the KSQL continuous query engine and when to use it successfully.
What this talk is not: An intensive KStreams talk – but you will get enough under your belt to go forth and learn more about Stream Processing overall.
When to KSQL & When to Live the KStream (Dani Traphagen, Confluent) Kafka Summit London 2019
1. Who am I?
User ID: Dani
Alias: DTrapezoid
Background: Apache projects, distributed systems, real-time event-driven systems, slaying legacy
How did we get here?
What are we doing?
1. KSQL architecture
2. KSQL use cases
3. KSQL performance considerations
4. When to use KSQL, when to not
5. Introduce KStreams
5. Why does it matter to think about a query before you execute it?
Lots of lessons
6. We don't always have control over stuff
7. Why use Kafka?
Initial use-case thinking: DB integration (RDBMS), message bus, pub/sub
Benefits: buffering, back pressure, decouples systems, distributed, highly available, low latency, data governance
But wait, there's more: Kafka as a streaming platform
Kafka Streams API
STREAM ALL THE THINGS
8. Event-driven systems
Transform once, use many
Stream processing
KSQL KSTREAMS
ice creams
9. Stream processing with Kafka
KSQL abstracts KSTREAMS
Let's talk about what KSQL is under the hood
10. What happens when you KSQL?
CREATE STREAM AS SELECT
CREATE TABLE AS SELECT
Output topic
Name: defaults to the same name as the stream or table; customize with the KAFKA_TOPIC property in the WITH clause
Partitions: customize carefully with the PARTITIONS property in the WITH clause
Replication factor: defaults to 1; customize with the REPLICAS property in the WITH clause
Aggregations leverage an embedded storage engine to locally manage state
A compacted changelog topic persists aggregation state
Compacted changelog topics have the same number of partitions as the input stream; default 1 replica
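A minimal sketch of the properties on this slide, assuming a purchases stream like the one registered later in this deck. The derived stream, table, and topic names and the sizing values are hypothetical, picked only for illustration:

```sql
-- Hypothetical example: names and sizing values are assumptions, not from the talk.
-- CSAS writes to a new output topic; the WITH clause overrides the defaults.
CREATE STREAM purchases_portland
  WITH (KAFKA_TOPIC='purchases_portland', PARTITIONS=4, REPLICAS=3) AS
  SELECT order_id, product, town
  FROM purchases
  WHERE town = 'Portland';

-- CTAS with an aggregation: state is kept in the local embedded store
-- and backed by a compacted changelog topic.
CREATE TABLE orders_per_country AS
  SELECT country, COUNT(*) AS order_count
  FROM purchases
  GROUP BY country;
```

Note that REPLICAS=3 can only be satisfied on a cluster with at least three brokers, which is one reason to customize these values carefully.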
11. Is KSQL like SQL?
[Diagram: KSQL aggregate, join, and filter operations between Kafka Connect source and sink connectors (Elastic, S3, Cassandra)]
"KSQL is a DBA's worst nightmare. It's a query that never stops."
Jeremy Custenborder, Principal Systems Engineer, Confluent Inc.
12. Why would someone want to use KSQL then?
To write CONTINUOUS stream processing programs, simply
Run SQL queries against data, agnostic of programming language
Works continuously in real time
Queries won't quit until the messages do
Fault tolerant
Scales horizontally and vertically
Distributed
13. What does the architecture of KSQL look like?
[Diagram: KSQL clients, REST API, and KSQL server with its SQL engine processing KSQL statements and queries]
Add KSQL servers without restarting apps
15.
{
"order_id": 1,
"customer_name": "Maryanna Andryszczak",
"date_of_birth": "1922-06-06T02:21:59Z",
"product": "Nut - Walnut, Pieces",
"order_total_usd": "1.65",
"town": "Portland",
"country": "United States"
}
Create the PURCHASES stream:
ksql> CREATE STREAM purchases
  (order_id INT, customer_name VARCHAR, date_of_birth VARCHAR,
   product VARCHAR, order_total_usd VARCHAR, town VARCHAR, country VARCHAR)
  WITH (KAFKA_TOPIC='purchases', VALUE_FORMAT='JSON');
Message
----------------
Stream created
----------------
Let's validate the first few messages:
SELECT * FROM PURCHASES LIMIT 5;
Now let's filter by the country Germany:
SELECT ORDER_ID, PRODUCT, TOWN, COUNTRY FROM PURCHASES WHERE COUNTRY='Germany';
16. Let's just get the German orders on one topic/table:
CREATE STREAM PUCHASES_GERMANY AS SELECT * FROM PURCHASES WHERE COUNTRY='Germany';
ksql> LIST TOPICS;
 Kafka Topic        | Registered | Partitions | Partition Replicas | Consumers | ConsumerGroups
------------------------------------------------------------------------------------------------
 _confluent-metrics | false      | 12         | 1                  | 0         | 0
 PUCHASES_GERMANY   | true       | 4          | 1                  | 0         | 0
 purchases          | true       | 1          | 1                  | 1         | 1
------------------------------------------------------------------------------------------------
ksql>
Cool! Now we have all the purchases from Germany in a separate topic.
Try it: https://www.confluent.io/stream-processing-cookbook/ksql-recipes/data-filtering
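Since a CREATE STREAM AS SELECT keeps running, it can help to see how such a query is inspected and stopped. A rough sketch, keeping the slide's spelling of PUCHASES_GERMANY; the query ID shown is an assumed example, so copy the real one from the SHOW QUERIES output:

```sql
-- List the running persistent queries and their IDs.
SHOW QUERIES;

-- Inspect the derived stream, including the query that writes to it.
DESCRIBE EXTENDED PUCHASES_GERMANY;

-- Stop the persistent query. This ID is an assumption for illustration.
TERMINATE CSAS_PUCHASES_GERMANY_0;

-- Once the query is terminated, the stream can be dropped.
DROP STREAM PUCHASES_GERMANY;
```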
18. Sizing considerations
You may need to add more brokers
KSQL queries consume from and produce to topics
Repartitioning
Stateful queries mean changelog topics
Throughput
Decreases relative to query count, due to message size/complexity
Query types
Project: SELECT ... FROM
Filter: WHERE
Joins
Aggregations: SUM, COUNT, etc.
2X CPU
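To make the sizing points concrete, here is a hedged sketch of a repartitioning query and a stateful aggregation over the purchases stream from earlier in the deck. The derived stream and table names and the one-hour window are hypothetical:

```sql
-- Hypothetical names; purchases is the stream registered earlier in the deck.
-- Repartitioning: re-key the stream by country, which writes a new topic.
CREATE STREAM purchases_by_country AS
  SELECT * FROM purchases
  PARTITION BY country;

-- Stateful aggregation: hourly revenue per country, backed by a changelog topic.
-- order_total_usd was declared VARCHAR, so it is cast before summing.
CREATE TABLE revenue_per_country_hourly AS
  SELECT country, SUM(CAST(order_total_usd AS DOUBLE)) AS total_usd
  FROM purchases_by_country
  WINDOW TUMBLING (SIZE 1 HOUR)
  GROUP BY country;
```

Each of these statements adds topics and broker load beyond the original purchases topic, which is the reason the slide suggests budgeting for more brokers.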
19. What about headless deployment?
Tip: test in [illegible]
JVM
<path-to-confluent>/bin/ksql-node query-file path/to/myquery.sql
Create/drop streams
Start/stop queries
Start server nodes
Version control your workflow
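For illustration only, a query file for a headless deployment might look like the sketch below. The contents are an assumption, assembled from the statements shown earlier in this deck rather than taken from the speaker's own file:

```sql
-- myquery.sql (hypothetical contents assembled from statements in this deck)
-- Register the source stream over the existing topic.
CREATE STREAM purchases
  (order_id INT, customer_name VARCHAR, date_of_birth VARCHAR,
   product VARCHAR, order_total_usd VARCHAR, town VARCHAR, country VARCHAR)
  WITH (KAFKA_TOPIC='purchases', VALUE_FORMAT='JSON');

-- Persistent query that the headless server keeps running.
CREATE STREAM PUCHASES_GERMANY AS
  SELECT * FROM purchases
  WHERE country = 'Germany';
```

Keeping a file like this in version control gives every server node the same set of queries at startup.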
20. When to use KSQL, when to not
Where is your data?
Are you a maintainer?
Java or .NET teams?
Are you new to Kafka?
21. Kafka Streams
Processor API: low level, imperative, customizable
Streams API: built-in abstractions, functional
KSTREAM
KTABLE
GLOBAL KTABLE
Stateless and stateful transformations
22. "I don't care how, I want Kafka Streams for every language on the planet!"
Your team
Consume stream, process, output
Satisfy all yer devs