Check out these resources:
Dean’s book
Webinars
etc.
Fast Data Architectures for Streaming Applications
Getting Answers Now from Data Sets that Never End
By Dean Wampler, Ph.D., VP of Fast Data Engineering
LIGHTBEND.COM/LEARN
[Figure: Streaming architecture, from the book. Inputs: logs, sockets, REST. Kafka Cluster and ZooKeeper Cluster. Microservices: RP, Go, Node.js, … Mini-batch: Spark Streaming. Batch: Spark, … Low latency: Flink, Kafka Streams, Akka Streams, Beam, … Persistence: S3, HDFS, disk, SQL/NoSQL, search. Everything runs on Mesos, YARN, Cloud, … Numbered callouts 1 to 11 mark the components.]
Today’s focus:
• Kafka - the data backplane
• Akka Streams and Kafka Streams - streaming microservices
[Figure: Before. Producers (Service 1, log and other files, Internet services) are wired directly to consumers (Service 2, Service 3, other services): N * M links.]
Why Kafka?
[Figure: After. The same producers (Service 1, log and other files, Internet services) and consumers (Service 2, Service 3, other services) are connected only through Kafka: N + M links.]
Why Kafka?
Kafka:
• Simplify dependencies between services
• Reduce data loss when a service crashes
• M producers, N consumers
• Simplicity of one “API” for communication
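The "N * M links" versus "N + M links" comparison in the two figures can be checked with a little arithmetic. The following is a minimal sketch; the object and method names are made up for illustration, and it only counts connections, it does not talk to Kafka.

```scala
object LinkCount {
  /** Point-to-point wiring: every producer connects to every consumer. */
  def direct(producers: Int, consumers: Int): Int = producers * consumers

  /** Wiring through a shared broker: each service connects only to the broker. */
  def viaBroker(producers: Int, consumers: Int): Int = producers + consumers

  def main(args: Array[String]): Unit = {
    for ((m, n) <- Seq((3, 3), (10, 20), (50, 100))) {
      println(s"M=$m producers, N=$n consumers: " +
        s"direct=${direct(m, n)}, via broker=${viaBroker(m, n)}")
    }
  }
}
```

With 10 producers and 20 consumers, direct wiring needs 200 links while the broker layout needs 30, and the gap widens as services are added.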
Four Streaming Engines:
• Spark, Flink - services to which you submit work. Large scale, automatic data partitioning.
• Akka Streams, Kafka Streams - libraries for “data-centric microservices”. Smaller scale, but great flexibility.
A Spectrum of Microservices
[Figure: a spectrum running from events (left) to records (right). Left: event-driven μ-services behind a REST API Gateway (Browse, Account, Orders, Shopping Cart, Inventory, …). Right: “record-centric” μ-services (Data, Model Training, Model Serving, Other Logic, storage).]
Akka emerged from the left-hand side of the spectrum, the world of highly Reactive microservices. Akka Streams pushes to the right, more data-centric.
Kafka emerged from the right-hand side of the spectrum. Kafka Streams pushes to the left, supporting many event-processing scenarios.
Kafka Streams
• Low overhead
• Parallelism <=> Kafka partitions
• Read/write Kafka topics
• Java and Scala (coming) API
• Feels similar to Spark, Scala collections
• SQL interface
• Query in-memory state
Kafka Streams
• Ideal for
• ETL (“KStreams”)
• Aggregations (“KTables”)
• Joins
• “Effectively once” requirements
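The difference between stream-style ETL and table-style aggregation can be sketched with plain Scala collections. This is only an analogy under stated assumptions: it does not use the Kafka Streams API, and the `Event` type and the `etl` and `totals` functions are hypothetical names chosen for illustration.

```scala
object StreamVsTable {
  // Hypothetical record type for illustration: a key (user) and a value (clicks).
  final case class Event(user: String, clicks: Int)

  /** Stream-style ETL: filter and transform each record independently. */
  def etl(events: Seq[Event]): Seq[Event] =
    events.filter(_.clicks > 0).map(e => e.copy(user = e.user.toLowerCase))

  /** Table-style aggregation: fold the stream into one running value per key. */
  def totals(events: Seq[Event]): Map[String, Int] =
    events.foldLeft(Map.empty[String, Int]) { (table, e) =>
      table.updated(e.user, table.getOrElse(e.user, 0) + e.clicks)
    }

  def main(args: Array[String]): Unit = {
    val raw = Seq(Event("Ann", 2), Event("bob", 0), Event("ann", 3), Event("Bob", 1))
    val cleaned = etl(raw)   // three records survive, with lower-cased keys
    println(cleaned)
    println(totals(cleaned)) // one entry per key, holding the latest total
  }
}
```

The stream keeps every record, while the table keeps only the current value for each key, which is why aggregations and joins are naturally expressed against tables.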
Kafka Streams Example
[Figure: model-serving pipeline. Components: Data, Model Training, Model Params, Raw Data, Model Serving, Scored Records, Other Logic, Final Records.]
A new Scala Kafka Streams API:
• https://github.com/lightbend/kafka-streams-scala
• Adheres very closely to the semantics of the Java API
• Lightbend is contributing this API to Apache Kafka
• Developed by Debasish Ghosh, Boris Lublinsky, Sean Glover, with contributions from other Fast Data Platform team members
val builder = new StreamsBuilderS // New Scala Wrapper API.
val data = builder.stream[Array[Byte], Array[Byte]](rawDataTopic)
val model = builder.stream[Array[Byte], Array[Byte]](modelTopic)
val modelProcessor = new ModelProcessor
val scorer = new Scorer(modelProcessor)

// Model topic: parse each update and hand valid models to the processor.
model.mapValues(bytes => Model.parseBytes(bytes)) // array => record
  .filter((key, model) => model.valid) // Successful?
  .mapValues(model => ModelImpl.findModel(model))
  .process(() => modelProcessor, …) // Set up actual model

// Data topic: parse each record, score it, and write the result out.
data.mapValues(bytes => DataRecord.parseBytes(bytes))
  .filter((key, record) => record.valid)
  .mapValues(record => new ScoredRecord(scorer.score(record), record))
  .to(scoredRecordsTopic)

val streams = new KafkaStreams(
  builder.build, streamsConfiguration)
streams.start()
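The core idea of the example, two inputs where one updates the model and the other is scored with whatever model is current, can be sketched without Kafka at all. The following is a simplified illustration using plain Scala; `ModelUpdate`, `Data`, `Scored`, and `serve` are hypothetical stand-ins, not the `Model`, `DataRecord`, or `Scorer` classes from the slides, and a "model" here is just a single weight.

```scala
object ModelServingSketch {
  // Hypothetical message types standing in for the two topics.
  sealed trait Message
  final case class ModelUpdate(weight: Double) extends Message // from the model topic
  final case class Data(value: Double) extends Message         // from the raw data topic

  final case class Scored(score: Option[Double], record: Data)

  /** Process messages in arrival order, scoring each record with the latest model. */
  def serve(messages: Seq[Message]): Seq[Scored] = {
    val start = (Option.empty[ModelUpdate], Vector.empty[Scored])
    val (_, out) = messages.foldLeft(start) {
      case ((_, scored), m: ModelUpdate) =>
        (Some(m), scored) // swap in the new model; nothing is emitted
      case ((model, scored), d: Data) =>
        // The score is None until the first model has arrived.
        (model, scored :+ Scored(model.map(_.weight * d.value), d))
    }
    out
  }

  def main(args: Array[String]): Unit = {
    val in = Seq(Data(1.0), ModelUpdate(2.0), Data(3.0), ModelUpdate(0.5), Data(4.0))
    serve(in).foreach(println)
  }
}
```

Records that arrive before any model are emitted without a score rather than dropped, which is one possible design choice; a real service might buffer or discard them instead.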
What’s Missing?
Kafka Streams is a powerful library, but you’ll need to provide the rest of the microservice support tooling through other means. (Some is provided if you run the support services for the SQL interface.)
You would embed your Kafka Streams code in microservices written with more comprehensive toolkits, such as the Lightbend Reactive Platform!
We’ll return to this point…
Akka Streams
• Low latency, mid-volume
• Graphs of processing nodes
• Based on Akka Actors:
• Complex event processing
• Efficient, per-event processing
Akka Streams
• Rich ecosystem
• Alpakka - connectivity for DBs, Kafka, etc. (like Camel)
• Akka Cluster - scale across nodes
• Akka Persistence - Akka state durability
Akka Streams Example
[Figure: the same model-serving pipeline (Data, Model Training, Model Params, Raw Data, Model Serving, Scored Records, Other Logic, Final Records), with Model Serving and Other Logic running in an Akka Cluster and Alpakka providing the connections, including to storage.]
implicit val system = ActorSystem("ModelServing")
implicit val materializer = ActorMaterializer()
implicit val executionContext = system.dispatcher

val modelProcessor = new ModelProcessor
val scorer = new Scorer(modelProcessor)

// Model updates, consumed from the model topic.
val modelStream: Source[ModelImpl, Consumer.Control] =
  Consumer.atMostOnceSource(modelConsumerSettings,
      Subscriptions.topics(modelTopic))
    .map(input => Model.parseBytes(input.value()))
    .filter(model => model.valid).map(_.get)
    .map(model => ModelImpl.findModel(model))
    .filter(model => model.valid).map(_.get)

// Records to score, consumed from the raw data topic.
val dataStream: Source[Record, Consumer.Control] =
  Consumer.atMostOnceSource(dataConsumerSettings,
      Subscriptions.topics(rawDataTopic))
    .map(input => DataRecord.parseBytes(input.value()))
    .filter(record => record.valid).map(_.get)
val model = ModelStage(modelProcessor)

// Keep only the third materialized value (the model stage's).
def keepModelMaterializedValue[M1, M2, M3](m1: M1, m2: M2, m3: M3): M3 = m3

val modelPredictions: Source[Option[Double], ReadableModelStateStore] =
  Source.fromGraph {
    GraphDSL.create(dataStream, modelStream, model)(
        keepModelMaterializedValue) { implicit builder => (d, m, w) =>
      import GraphDSL.Implicits._
      // Wire the input streams with the model stage (2 in, 1 out)
      // dataStream --> |       |
      //                | model | -> predictions
      // modelStream -> |       |
      d ~> w.dataRecordIn
      m ~> w.modelRecordIn
      SourceShape(w.scoringResultOut)
    }
  }
case class ModelStage(modelProcessor: …) extends
    GraphStageWithMaterializedValue[…, …] {
  val scorer = new Scorer(modelProcessor)
  val dataRecordIn = Inlet[Record]("dataRecordIn")
  val modelRecordIn = Inlet[ModelImpl]("modelRecordIn")
  val scoringResultOut = Outlet[ScoredRecord]("scoringOut")
  …
  // Score each incoming data record and emit the result downstream.
  setHandler(dataRecordIn, new InHandler {
    override def onPush(): Unit = {
      val record = grab(dataRecordIn)
      val newRecord = new ScoredRecord(scorer.score(record), record)
      push(scoringResultOut, newRecord) // matches the Outlet[ScoredRecord] declared above
      pull(dataRecordIn)
    }
  })
  …
}
val materializedReadableModelStateStore: ReadableModelStateStore =
  modelPredictions
    .to(Sink.ignore) // we do not read the results directly
    .run() // we run the stream, materializing the stage's StateStore
Other Concerns
• Scale scoring with workers and routers, across a cluster
• Persist actor state with Akka Persistence
• Connect to almost anything with Alpakka
• Enterprise Suite for production
[Figure: inside an Akka Cluster, Model Serving is shown as a Router in front of four Workers, and Other Logic contains Stateful Logic. Persistence writes actor state to storage. Alpakka sits at both edges.]
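The router-and-workers layout can be pictured as a simple assignment of records to workers. The following sketch uses plain Scala collections rather than Akka's routing API; `RouterSketch` and `roundRobin` are hypothetical names, and round-robin is only one of several possible routing strategies.

```scala
object RouterSketch {
  /** Assign records to workers in round-robin order; returns worker index -> its records. */
  def roundRobin[A](records: Seq[A], workers: Int): Map[Int, Seq[A]] = {
    require(workers > 0, "need at least one worker")
    records.zipWithIndex
      .groupBy { case (_, index) => index % workers }
      .map { case (worker, assigned) => worker -> assigned.map(_._1) }
  }

  def main(args: Array[String]): Unit = {
    val assignment = roundRobin(Seq("r1", "r2", "r3", "r4", "r5"), workers = 2)
    assignment.toSeq.sortBy(_._1).foreach { case (w, rs) => println(s"worker $w: $rs") }
  }
}
```

Because each worker sees only its share of the records, scoring throughput can grow with the number of workers, provided every worker has access to the current model.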
Go Direct or Through Kafka?
[Figure: Model Serving connected to Other Logic directly inside an Akka Cluster (with Alpakka at the edges), vs. Model Serving connected to Other Logic through a Scored Records topic.]
Direct:
• Extremely low latency
• Minimal I/O and memory overhead
• Reactive Streams backpressure
• M producers, N consumers, but directly connected (sort of)
• Use Akka Persistence for durable state
Through Kafka:
• Higher latency (including queue depth)
• Higher I/O overhead
• Very large buffer (disk size)
• M producers, N consumers, completely disconnected
• Automatic durability (topics on disk)
When to go direct:
• Use for smaller, faster messaging between “components”.
• Watch for consumer “backup”
• Use Akka Persistence for important state!
When to go through Kafka:
• Use for larger volumes, more coarse-grained service interactions
• Plan partitioning and replication carefully
Lightbend Fast Data Platform
lightbend.com/fast-data-platform
Serving Machine Learning Models
A Guide to Architecture, Stream Processing Engines, and Frameworks
By Boris Lublinsky, Fast Data Platform Architect
LIGHTBEND.COM/LEARN
For even more information:
Tutorial - Building Streaming Applications with Kafka:
• Software Architecture Conference New York
• Strata Data Conference San Jose
• Strata Data Conference London
My talk:
• Strata Data Conference San Jose
Streaming Microservices With Akka Streams And Kafka Streams

Weitere ähnliche Inhalte

Was ist angesagt?

Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...Lightbend
 
Akka Streams - From Zero to Kafka
Akka Streams - From Zero to KafkaAkka Streams - From Zero to Kafka
Akka Streams - From Zero to KafkaMark Harrison
 
Developing Secure Scala Applications With Fortify For Scala
Developing Secure Scala Applications With Fortify For ScalaDeveloping Secure Scala Applications With Fortify For Scala
Developing Secure Scala Applications With Fortify For ScalaLightbend
 
Pakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google Cloud
Pakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google CloudPakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google Cloud
Pakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google CloudLightbend
 
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lightbend
 
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...confluent
 
Understanding Akka Streams, Back Pressure, and Asynchronous Architectures
Understanding Akka Streams, Back Pressure, and Asynchronous ArchitecturesUnderstanding Akka Streams, Back Pressure, and Asynchronous Architectures
Understanding Akka Streams, Back Pressure, and Asynchronous ArchitecturesLightbend
 
What's new in Confluent 3.2 and Apache Kafka 0.10.2
What's new in Confluent 3.2 and Apache Kafka 0.10.2 What's new in Confluent 3.2 and Apache Kafka 0.10.2
What's new in Confluent 3.2 and Apache Kafka 0.10.2 confluent
 
Akka Streams And Kafka Streams: Where Microservices Meet Fast Data
Akka Streams And Kafka Streams: Where Microservices Meet Fast DataAkka Streams And Kafka Streams: Where Microservices Meet Fast Data
Akka Streams And Kafka Streams: Where Microservices Meet Fast DataLightbend
 
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and more
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and moreTypesafe Reactive Platform: Monitoring 1.0, Commercial features and more
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and moreLegacy Typesafe (now Lightbend)
 
Scala usergroup stockholm - reactive integrations with akka streams
Scala usergroup stockholm - reactive integrations with akka streamsScala usergroup stockholm - reactive integrations with akka streams
Scala usergroup stockholm - reactive integrations with akka streamsJohan Andrén
 
Kafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processingKafka Streams: the easiest way to start with stream processing
Kafka Streams: the easiest way to start with stream processingYaroslav Tkachenko
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
Making Scala Faster: 3 Expert Tips For Busy Development Teams
Making Scala Faster: 3 Expert Tips For Busy Development TeamsMaking Scala Faster: 3 Expert Tips For Busy Development Teams
Making Scala Faster: 3 Expert Tips For Busy Development TeamsLightbend
 
Streaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLStreaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLNick Dearden
 
Do's and don'ts when deploying akka in production
Do's and don'ts when deploying akka in productionDo's and don'ts when deploying akka in production
Do's and don'ts when deploying akka in productionjglobal
 
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Michael Noll
 
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...confluent
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Michael Noll
 
London Apache Kafka Meetup (Jan 2017)
London Apache Kafka Meetup (Jan 2017)London Apache Kafka Meetup (Jan 2017)
London Apache Kafka Meetup (Jan 2017)Landoop Ltd
 

Was ist angesagt? (20)

Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
 
Akka Streams - From Zero to Kafka
Akka Streams - From Zero to KafkaAkka Streams - From Zero to Kafka
Akka Streams - From Zero to Kafka
 
Developing Secure Scala Applications With Fortify For Scala
Developing Secure Scala Applications With Fortify For ScalaDeveloping Secure Scala Applications With Fortify For Scala
Developing Secure Scala Applications With Fortify For Scala
 
Pakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google Cloud
Pakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google CloudPakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google Cloud
Pakk Your Alpakka: Reactive Streams Integrations For AWS, Azure, & Google Cloud
 
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
 
Streaming Microservices With Akka Streams And Kafka Streams

  • 1.
  • 2. Check out these resources: Dean’s book, webinars, etc. Fast Data Architectures for Streaming Applications: Getting Answers Now from Data Sets that Never End, by Dean Wampler, Ph.D., VP of Fast Data Engineering. LIGHTBEND.COM/LEARN
  • 3. [Diagram: streaming architecture, from the book. Sources: Logs, Sockets, REST. ZooKeeper cluster; Kafka cluster; Microservices (RP, Go, Node.js, …). Mini-batch: Spark Streaming. Batch: Spark, … Low latency: Flink, Kafka Streams, Akka Streams, Beam, … Persistence: S3, HDFS, disk, SQL/NoSQL, Search. Runs on Mesos, YARN, Cloud, …]
  • 4. [Diagram: the same streaming architecture.] Today’s focus: Kafka - the data backplane; Akka Streams and Kafka Streams - streaming microservices.
  • 5. Why Kafka? Before: producers (Service 1, log and other files, internet services) are linked directly to consumers (Service 2, Service 3, other services): N * M links.
  • 6. Why Kafka? Before: N * M links between producers and consumers. After: N + M links between producers and consumers.
  • 7. Why Kafka? After: N + M links. Kafka:
      • Simplify dependencies between services
      • Reduce data loss when a service crashes
      • M producers, N consumers
      • Simplicity of one “API” for communication
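The difference between N * M and N + M links can be made concrete with a little arithmetic. The following is a minimal sketch; the object and method names are hypothetical, and only the two formulas come from the slides.

```scala
// Hypothetical helper names; only the N * M and N + M formulas come from the slides.
object LinkCount {
  /** Point-to-point wiring: every producer connects to every consumer. */
  def directLinks(producers: Int, consumers: Int): Int = producers * consumers

  /** Wiring through a shared log: each service connects only to the log. */
  def brokeredLinks(producers: Int, consumers: Int): Int = producers + consumers

  def main(args: Array[String]): Unit = {
    // Print both link counts for a few illustrative service counts.
    for ((m, n) <- Seq((3, 3), (10, 20), (50, 100)))
      println(s"M=$m N=$n direct=${directLinks(m, n)} viaKafka=${brokeredLinks(m, n)}")
  }
}
```

With 10 producers and 20 consumers, direct wiring needs 200 links while wiring through a shared log needs 30, and the gap widens as services are added.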
  • 9. Four Streaming Engines: Spark, Flink - services to which you submit work. Large scale, automatic data partitioning.
  • 10. [Architecture diagram, cropped.] Four Streaming Engines:
      • Spark, Flink - services to which you submit work. Large scale, automatic data partitioning.
      • Akka Streams, Kafka Streams - libraries for “data-centric microservices”. Smaller scale, but great flexibility.
  • 11. A Spectrum of Microservices, from Events to Records. Event-driven μ-services: Browse, REST, Account, Orders, Shopping Cart, API Gateway, Inventory, … “Record-centric” μ-services: storage, Data, Model Training, Model Serving, Other Logic.
  • 12. A Spectrum of Microservices (event-driven side). Akka emerged from the left-hand side of the spectrum, the world of highly Reactive microservices. Akka Streams pushes to the right, more data-centric.
  • 13. A Spectrum of Microservices (“record-centric” side). Kafka emerged from the right-hand side. Kafka Streams pushes to the left, supporting many event-processing scenarios.
  • 14. [Architecture diagram, cropped.] Kafka Streams:
      • Low overhead
      • Parallelism <=> Kafka partitions
      • Read/write Kafka topics
      • Java and Scala (coming) API
      • Feels similar to Spark, Scala collections
      • SQL interface
      • Query in-memory state
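The "Parallelism <=> Kafka partitions" point can be illustrated without any Kafka dependency. The sketch below spreads partitions over workers round-robin; this is an assumption chosen for illustration, not the assignment strategy Kafka Streams actually uses, and all names are hypothetical. It shows only the consequence that workers beyond the partition count have nothing to do.

```scala
// Illustrative only: round-robin is an assumed strategy, not Kafka's real assignor.
object PartitionParallelism {
  /** Map each worker index to the partition indices it would process. */
  def assign(partitions: Int, workers: Int): Map[Int, Seq[Int]] =
    (0 until workers).map { w =>
      w -> (0 until partitions).filter(_ % workers == w)
    }.toMap

  /** Number of workers that received at least one partition. */
  def busyWorkers(partitions: Int, workers: Int): Int =
    assign(partitions, workers).count { case (_, ps) => ps.nonEmpty }
}
```

For a topic with 4 partitions, 2 workers each take 2 partitions, while 6 workers leave 2 of them idle, so adding workers past the partition count does not add parallelism.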
  • 17. Kafka Streams Example. [Diagram labels: Data, Model Training, Model Serving, Other Logic, Raw Data, Model Params, Scored Records, Final Records] A new Scala Kafka Streams API:
      • https://github.com/lightbend/kafka-streams-scala
      • Adheres very closely to the semantics of the Java API
      • Lightbend is contributing this API to Apache Kafka
      • Developed by Debasish Ghosh, Boris Lublinsky, Sean Glover, with contributions from other Fast Data Platform team members
  • 18. [Diagram labels: Data, Model Training, Model Serving, Raw Data, Model Params, Scored Records]

        val builder = new StreamsBuilderS  // New Scala Wrapper API.
        val data  = builder.stream[Array[Byte], Array[Byte]](rawDataTopic)
        val model = builder.stream[Array[Byte], Array[Byte]](modelTopic)
        val modelProcessor = new ModelProcessor
        val scorer = new Scorer(modelProcessor)

        model.mapValues(bytes => Model.parseBytes(bytes))  // array => record
          .filter((key, model) => model.valid)             // Successful?
          .mapValues(model => ModelImpl.findModel(model))
          .process(() => modelProcessor, …)                // Set up actual model

        data.mapValues(bytes => DataRecord.parseBytes(bytes))
          .filter((key, record) => record.valid)
          .mapValues(record => new ScoredRecord(scorer.score(record), record))
          .to(scoredRecordsTopic)

        val streams = new KafkaStreams(builder.build, streamsConfiguration)
        streams.start()
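To show why this API "feels similar to Scala collections", here is the same parse, filter, score shape written against an in-memory `Seq` instead of a Kafka topic. This is a sketch under stated assumptions: `DataRecord`, `ScoredRecord`, the `"id,value"` line format, and the toy scoring rule are all hypothetical stand-ins, not the types used in the talk.

```scala
object CollectionsAnalogue {
  // Hypothetical stand-ins for the record types named on the slide.
  final case class DataRecord(id: Int, value: Double)
  final case class ScoredRecord(score: Double, record: DataRecord)

  // Parse an assumed "id,value" line; None plays the role of an invalid record.
  def parse(line: String): Option[DataRecord] = line.split(",") match {
    case Array(id, v) =>
      scala.util.Try(DataRecord(id.trim.toInt, v.trim.toDouble)).toOption
    case _ => None
  }

  // Toy model (an assumption): the score is twice the value.
  def score(r: DataRecord): Double = 2.0 * r.value

  // Same shape as the stream: parse, keep the valid records, score them.
  def pipeline(raw: Seq[String]): Seq[ScoredRecord] =
    raw.map(parse)
      .filter(_.isDefined)
      .map(_.get)
      .map(r => ScoredRecord(score(r), r))
}
```

The difference in the streaming version is that the input never ends and results are written to another topic rather than returned as a collection.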
  • 26. What’s Missing? [Diagram labels: Data, Model Training, Model Serving, Other Logic, Raw Data, Model Params, Scored Records, Final Records] Kafka Streams is a powerful library, but you’ll need to provide the rest of the microservice support through other means. (Some are provided if you run the support services for the SQL interface.) You would embed your KS code in microservices written with more comprehensive toolkits, such as the Lightbend Reactive Platform! We’ll return to this point…
  • 27. [Architecture diagram, cropped.] Akka Streams:
      • Low latency, mid-volume
      • Graphs of processing nodes
      • Based on Akka Actors:
          • Complex event processing
          • Efficient, per-event processing
  • 28. [Architecture diagram, cropped.] Akka Streams:
      • Rich ecosystem
      • Alpakka - connectivity for DBs, Kafka, etc. (like Camel)
      • Akka Cluster - scale across nodes
      • Akka Persistence - Akka state durability
  • 29. Akka Streams Example. [Diagram labels: Akka Cluster, Alpakka, storage; Data, Model Training, Model Serving, Other Logic; Raw Data, Model Params, Scored Records, Final Records]
  • 30. [Diagram labels: Akka Cluster, Alpakka; Data, Model Training, Model Serving; Raw Data, Model Params]

        implicit val system = ActorSystem("ModelServing")
        implicit val materializer = ActorMaterializer()
        implicit val executionContext = system.dispatcher

        val modelProcessor = new ModelProcessor
        val scorer = new Scorer(modelProcessor)

        val modelStream: Source[ModelImpl, Consumer.Control] =
          Consumer.atMostOnceSource(modelConsumerSettings, Subscriptions.topics(modelTopic))
            .map(input => Model.parseBytes(input.value()))
            .filter(model => model.valid).map(_.get)
            .map(model => ModelImpl.findModel(model))
            .filter(model => model.valid).map(_.get)

        val dataStream: Source[Record, Consumer.Control] =
          Consumer.atMostOnceSource(dataConsumerSettings, Subscriptions.topics(rawDataTopic))
            .map(input => DataRecord.parseBytes(input.value()))
            .filter(record => record.valid).map(_.get)
  • 35. [Diagram labels: Akka Cluster, Alpakka; Data, Model Training, Model Serving; Raw Data, Model Params]

        val model = ModelStage(modelProcessor)

        def keepModelMaterializedValue[M1, M2, M3](m1: M1, m2: M2, m3: M3): M3 = m3

        val modelPredictions: Source[Option[Double], ReadableModelStateStore] =
          Source.fromGraph {
            GraphDSL.create(dataStream, modelStream, model)(keepModelMaterializedValue) {
              implicit builder => (d, m, w) =>
                import GraphDSL.Implicits._
                // Wire the input streams with the model stage (2 in, 1 out)
                // dataStream  --> |       |
                //                 | model | -> predictions
                // modelStream --> |       |
                d ~> w.dataRecordIn
                m ~> w.modelRecordIn
                SourceShape(w.scoringResultOut)
            }
          }
  • 36. [Same graph wiring code as slide 35, with the ModelStage definition overlaid:]

        case class ModelStage(modelProcessor: …)
            extends GraphStageWithMaterializedValue[…, …] {
          val scorer = new Scorer(modelProcessor)
          val dataRecordIn = Inlet[Record]("dataRecordIn")
          val modelRecordIn = Inlet[ModelImpl]("modelRecordIn")
          val scoringResultOut = Outlet[ScoredRecord]("scoringOut")
          …
          setHandler(dataRecordIn, new InHandler {
            override def onPush(): Unit = {
              val record = grab(dataRecordIn)   // take the incoming data record
              val newRecord = new ScoredRecord(scorer.score(record), record)
              push(scoringResultOut, Some(newRecord))
              pull(dataRecordIn)                // signal demand for the next record
            }
          })
          …
        }
  • 40. [Continues the code from slide 35, after the closing braces of the graph wiring. Diagram labels: Akka Cluster, Alpakka; Data, Model Training, Model Serving; Raw Data, Model Params]

        val materializedReadableModelStateStore: ReadableModelStateStore =
          modelPredictions
            .to(Sink.ignore)  // we do not read the results directly
            .run()            // we run the stream, materializing the stage's StateStore
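The core idea of the two-input model stage (model updates on one input, data records on the other, each record scored with the most recent model) can be sketched in plain Scala without Akka. Everything here is a hypothetical simplification: the `Event` types, the single-weight "model", and the use of `None` for records that arrive before any model, which is an assumption suggested by the `Option[Double]` element type on the slide.

```scala
object TwoInputScoring {
  // Hypothetical event types standing in for the two input streams.
  sealed trait Event
  final case class ModelUpdate(weight: Double) extends Event
  final case class Data(value: Double) extends Event

  /** Score each data record with the latest model seen so far, in arrival order. */
  def run(events: Seq[Event]): Seq[Option[Double]] = {
    val start = (Option.empty[Double], Vector.empty[Option[Double]])
    val (_, out) = events.foldLeft(start) {
      // A model update replaces the current model and emits nothing.
      case ((_, acc), ModelUpdate(w)) => (Some(w), acc)
      // A data record emits a score, or None if no model has arrived yet.
      case ((model, acc), Data(v))    => (model, acc :+ model.map(_ * v))
    }
    out
  }
}
```

In the Akka Streams version the two inputs arrive concurrently and the stage keeps the current model as internal state; the fold above shows only the sequential logic of that state update.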
  • 41. Other Concerns:
      • Scale scoring with workers and routers, across a cluster
      • Persist actor state with Akka Persistence
      • Connect to almost anything with Alpakka
      • Enterprise Suite for production
    [Diagram labels: Akka Cluster, Alpakka, Router, Workers, Stateful Logic, Persistence, actor state storage; Data, Model Training, Model Serving, Other Logic; Raw Data, Model Params, Final Records]
  • 42. Go Direct or Through Kafka? [Diagram: Model Serving connected to Other Logic directly inside an Akka Cluster, vs. through a Scored Records topic]
    Direct:
      • Extremely low latency
      • Minimal I/O and memory overhead
      • Reactive Streams backpressure
      • M producers, N consumers, but directly connected (sort of)
      • Use Akka Persistence for durable state
    Through Kafka:
      • Higher latency (including queue depth)
      • Higher I/O overhead
      • Very large buffer (disk size)
      • M producers, N consumers, completely disconnected
      • Automatic durability (topics on disk)
  • 43. Go Direct or Through Kafka? [Same diagram as slide 42]
    Direct:
      • Use for smaller, faster messaging between “components”
      • Watch for consumer “backup”
      • Use Akka Persistence for important state!
    Through Kafka:
      • Use for larger volumes, more coarse-grained service interactions
      • Plan partitioning and replication carefully
  • 47. Serving Machine Learning Models: A Guide to Architecture, Stream Processing Engines, and Frameworks. By Boris Lublinsky, Fast Data Platform Architect. LIGHTBEND.COM/LEARN
  • 48. For even more information:
      • Tutorial - Building Streaming Applications with Kafka: Software Architecture Conference New York, Strata Data Conference San Jose, Strata Data Conference London
      • My talk: Strata Data Conference San Jose