SlideShare ist ein Scribd-Unternehmen logo
1 von 75
Downloaden Sie, um offline zu lesen
Real-Time Streaming in Azure
Timothy Spann | Developer Advocate
Tim Spann
Developer Advocate
● https://www.datainmotion.dev/
● https://github.com/tspannhw/SpeakerProfile
● https://dev.to/tspannhw
● https://sessionize.com/tspann/
DZone Zone Leader and Big Data
MVB Data DJay
Founded by the original developers of
Apache Pulsar and Apache BookKeeper,
StreamNative builds a cloud-native event
streaming platform that enables
enterprises to easily access data as
real-time event streams.
Apache Pulsar
Apache is an open source, cloud-native
distributed messaging and streaming platform.
What are the Benefits of Pulsar?
Data Durability
Scalability Geo-Replication
Multi-Tenancy
Unified Messaging
Model
Apache Pulsar
A Unified Messaging Platform
Message Queuing
Data Streaming
Messaging
Messaging systems are ideal for work
queues that do not require tasks to be
performed in a particular order—for
example, sending one email message
to many recipients.
RabbitMQ and Amazon SQS are
examples of popular queue-based
message systems.
Streaming
Streaming works best in situations
where the order of messages is
important—for example, data
ingestion.
Kafka and Amazon Kinesis are
examples of messaging systems that
use streaming semantics for
consuming messages.
The Difference Between Messaging &
Streaming
Unified Messaging
Model
Streaming
Messaging
Producer 1
Producer 2
Pulsar
Topic/Partition
m0
m1
m2
m3
m4
Consumer D-1
Consumer D-2
Consumer D-3
Subscription D
<
k
2
,
v
1
>
<
k
2
,
v
3
>
<k3,v2>
<
k
1
,
v
0
>
<
k
1
,
v
4
>
Key-Shared
Consumer C-1
Consumer C-2
Consumer C-3
Subscription C
m1
m2
m3
m4
m0
Shared
Failover
Consumer B-1
Consumer B-0
Subscription B
m1
m2
m3
m4
m0
In case of failure in
Consumer B-0
Consumer A-1
Consumer A-0
Subscription A
m1
m2
m3
m4
m0
Exclusive
X
Pulsar is Cloud Native
● Rebalance free architecture via separate compute and storage for
easy scalability
● Built for Kubernetes, with production ready helm charts and
Kubernetes operators
● Infinite streams by leveraging cost effective cloud-storage like S3,
GCS, etc
● Supports a broad range of workloads and teams through unified
messaging model
Pulsar is built for easy scale-out.
*Illustrations by Jack
Vanlightly
Pulsar Global Adoption
Key Milestones
2012 2016 2017 2018 2019 2020
Originally developed
inside Yahoo! as “Cloud
Messaging Service”
Pulsar is
committed to
Open Source
Pulsar is accepted into
the Apache Software
Foundation
Pulsar
becomes a
Top-Level
Project
● StreamNative is founded and
seed round raised.
● Tencent adopts Pulsar for
payment processing platform.
● BestPay adopts Pulsar for
payment processing.
● Pulsar hits 200 contributors.
● 2 global Pulsar conferences, 80+ speakers, 1,500+ attendees
● Pulsar hits 340 contributors
● StreamNative and OVHCloud launch Kafka on Pulsar (KoP)
● StreamNative + China Mobile launch AMQP on Pulsar (AoP)
● Pulsar Ecosystem expands - StreamNative Hub launches
● StreamNative Cloud launches on GCP and Alibaba Cloud
● StreamNative customer adoption continues - new
customers include Flipkart and Applied Materials
● Pulsar 2.7 + Transactions
● Pulsar Flink Connector 2.7
Major increase in adoption following
TLP designation in 2018
2021
● 3 global Pulsar conferences
● StreamNative hits 400
contributors (June).
● Pulsar surpasses Kafka in
monthly active contributors.
● Pulsar 2.8 + Exactly-Once
semantics
● StreamNative Platform launches
Apache Pulsar Overview
Enable Geo-Replicated Messaging
● Pub-Sub
● Geo-Replication
● Pulsar Functions
● Horizontal Scalability
● Multi-tenancy
● Tiered Persistent Storage
● Pulsar Connectors
● REST API
● CLI
● Many clients available
● Four Different Subscription Types
● Multi-Protocol Support
○ MQTT
○ AMQP
○ JMS
○ Kafka
○ ...
What is the Pulsar Ecosystem?
● Functions and Connectors
○ Functions: Lightweight stream processing
○ Connectors: Part of “Pulsar IO”, includes “Source” and “Sink”
APIs
■ Files, Databases, Data tools, Cloud Services, etc
● Protocol Handlers
○ Allows Pulsar to handle additional protocols by an extendable
API running in the broker
■ AoP (AMQP), KoP (Kafka), MoP (MQTT)
What is the Pulsar Ecosystem? (cont’d)
● Processing Engines
○ Supports modern processing engines
■ Flink and Spark, as well as Pulsar SQL (Presto/Trino)
● Offloaders
○ Allows data to be offloaded to cloud storage and used with
existing Pulsar APIs
■ S3, GCP Cloud Storage, HDFS, File (NFS), Azure Blob Storage
(in Pulsar 2.7.0)
Pulsar Functions
Provides a simple API to:
● Receive a message (consume)
● Process the message using your own code
● Send a message (produce)
Takes care of the boilerplate code so there is no need to create
producers and consumers.
Moving Data In and Out of Pulsar
IO/Connectors are a simple way to integrate with external systems and move data
in and out of Pulsar.
● Built on top of Pulsar Functions
● Built-in connectors - hub.streamnative.io
Source Sink
Use Azure BlobStore offloader with
Pulsar
https://pulsar.apache.org/docs/en/tiered-storage-azure/
Pulsar SQL
Presto/Trino workers
can read segments
directly from bookies
(or offloaded storage)
in parallel.
Bookie
1
Segment 1
Producer Consumer
Broker 1
Topic1-Part1
Broker 2
Topic1-Part2
Broker 3
Topic1-Part3
Segment 2 Segment 3 Segment 4 Segment X
Segment 1
Segment 1 Segment 1
Segment 3 Segment 3
Segment 3
Segment 2
Segment 2
Segment 2
Segment 4
Segment 4
Segment 4
Segment X
Segment X
Segment X
Bookie
2
Bookie
3
Query
Coordinator
...
...
SQL Worker SQL Worker SQL Worker
SQL Worker
Query
Topic
Metadata
Query Your Topics with Pulsar SQL (Trino)
Apache Flink
● Unified computing engine
● Batch processing is a special case of stream processing
● Stateful processing
● Massive Scalability
● Flink SQL for queries, inserts against Pulsar Topics
● Streaming Analytics
● Continuous SQL
● Continuous ETL
● Complex Event Processing
● Standard SQL Powered by Apache Calcite
Why Apache Flink?
Apache Flink
● Apache Flink is a distributed stream
processing system.
● It is capable of providing high
throughput, near real-time
processing of streams from Pulsar.
● It is ideal for ambitious Stream
Processing compared to Pulsar’s
model of lightweight Stream
Processing.
Some Apache Flink Users
● Stream: Sequence of data which is made available over time
● All computation processes chunks of data over time producing
results over time → Stream processing
○ E.g. reading data from disks is done in streaming fashion
● Events can be of various forms
● Decisive difference: Is my stream bounded or not?
Everything Is a Stream
How Flink Works
Flink Provides an API for handling events and intermediate state and
periodic snapshots to ensure consistency
How Flink Works
Flink apps are composed of a graph of tasks which is executed in a
distributed runtime
How Flink Works
Flink provides multiple APIs at different layers of abstraction
SQL / Table API: Running The Same Query On
Streams
SQL Query
Incremental query
execution
SELECT
room,
TUMBLE_END(rowtime, INTERVAL ‘1’ HOUR),
AVG(temperature)
FROM
sensors
GROUP BY
TUMBLE(rowtime, INTERVAL ‘1’ HOUR), room
Interpret stream as
table
StreamNative
Cloud
Powered by Apache Pulsar, StreamNative provides a cloud-native,
real-time messaging and streaming platform to support multi-cloud
and hybrid cloud strategies.
Built for Containers
Cloud Native
StreamNative Cloud
Flink SQL
Powered by Apache Pulsar, StreamNative provides a
cloud-native, unified messaging and streaming platform to
support multi-cloud and hybrid cloud strategies.
Built for Containers
Cloud Native
StreamNative
APP Layer
The StreamNative Offering
Application Messaging Data Pipelines Real-time Contextual Analytics
Micro
Service
Notification Dashboard Risk Control Auditing
Payment ETL
StreamNative Solution
Application Messaging Data Pipelines Real-time Contextual Analytics
Tiered Storage
APP Layer
Computing
Layer
Storage
Layer
StreamNative
Platform
IaaS Layer
Micro
Service
Notification Dashboard Risk Control Auditing
Payment ETL
Flink + Pulsar
https://flink.apache.org/2019/05/03/pulsar-flink.html
https://github.com/streamnative/pulsar-flink
https://streamnative.io/en/blog/release/2021-04-20-flink-sql-o
n-streamnative-cloud
Apache NiFi
Why Apache NiFi?
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Supports push and pull
models
• Hundreds of processors
• Visual command and
control
• Over a sixty sources
• Flow templates
• Pluggable/multi-role
security
• Designed for extension
• Clustering
• Version Control
Architecture
https://nifi.apache.org/docs/nifi-docs/html/overview.html
Record Processors
https://www.datainmotion.dev/2019/03/advanced-xml-processing-with-apache.html
● XML, CSV, JSON, AVRO and more
● Schemas or Inferred Schemas
● Easily convert between them
● Support SQL with Apache Calcite
Provenance
https://www.datainmotion.dev/2021/01/automating-starting-services-in-apache.html
Backpressure & Prioritizers
https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
System Diagnostics
Flow File
https://nifi.apache.org/docs/nifi-docs/html/overview.html
Flow Files are content and key/value pairs for attributes that are each
event/message/file that has been introduced into NiFi.
Version Control (Github and Beyond)
Repositories
https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#repositories
No More Spaghetti Flows - DO NOT
https://dev.to/tspannhw/no-more-spaghetti-flows-2emd
Do Not
● Do not Put 1,000 Flows on one workspace.
● If your flow has hundreds of steps, this is a Flow Smell. Investigate why.
● Do not Use ExecuteProcess, ExecuteScripts or a lot of Groovy scripts
as a default, look for existing processors
● Do not Use Random Custom Processors you find that have no
documentation or are unknown.
● Do not forget to upgrade, if you are running anything before Apache NiFi
1.14, upgrade now!
● Do not run on default 512M RAM.
● Do not run one node and think you have a highly available cluster.
● Do not split a file with millions of records to individual records in one
shot without checking available space/memory and back pressure.
● Use Split processors only as an absolute last resort. Many processors
are designed to work on FlowFiles that contain many records or many
lines of text. Keeping the FlowFiles together instead of splitting them
apart can often yield performance that is improved by 1-2 orders of
magnitude.
No More Spaghetti Flows - DO
https://dev.to/tspannhw/no-more-spaghetti-flows-2emd
Do
● Reduce, Reuse, Recycle. Use Parameters to reuse common modules.
● Put flows, reusable chunks (write to Slack, Database, Kafka) into
separate Process Groups.
● Write custom processors if you need new or specialized features
● Use Record Processors everywhere
● Read the Docs!
● Use the NiFi Registry for version control.
● Use NiFi CLI and DevOps for Migrations.
● Walk through your flow and make sure you understand every step and
it’s easy to read and follow. Is every processor used? Are there dead
ends?
● Do run Zookeeper on different nodes from Apache NiFi.
● Use routing based on content and attributes to allow one flow to
handle multiple nearly identical flows is better than deploying the same
flow many times with tweaks to parameters in same cluster.
● Use the correct driver for your database. There's usually a couple
different JDBC drivers.
Apache MXNet Native Processor through DJL.AI for Apache NiFi
This processor uses the DJL.AI Java Interface
https://github.com/tspannhw/nifi-djl-processor
https://dev.to/tspannhw/easy-deep-learning-in-apache-nifi-with-djl-2d79
Use Cases
USE CASE
IoT Ingestion: High-volume streaming sources, multiple message formats,
diverse protocols and multi-vendor devices creates data ingestion challenges.
I Can Haz Data?
REST and Websocket JSON “stonks”
{"symbol":"MSFT",
"uuid":"10640832-f139-4b82-8780-e3ad37b3d0
ce",
"ts":1618529574078,
"dt":1612098900000,
"datetime":"2021/01/31 08:15:00",
"open":"122.24500",
"close":"122.25500",
"high":"122.25500",
"volume":"12353",
"low":"12.24500"}
Best Practice
Architectures
All Data - Anytime - Anywhere -
Multi-Cloud - Multi-Protocol
Multi-
inges
t
Multi-
inges
t
Multi-ingest Merge
Priority
StreamNative Hub
StreamNative Cloud
Unified Batch and Stream COMPUTING
Batch
(Batch + Stream)
Unified Batch and Stream STORAGE
Offload
(Queuing + Streaming)
Apache Flink - Apache Pulsar - Apache NiFi <-> Events <-> Azure Data Stores
Tiered Storage
Pulsar
---
KoP
---
MoP
---
Websocket
---
HTTP
Pulsar
Sink
Pulsar
Sink
Streaming
Edge Gateway
Protocols
End-to-End Streaming FLiP(N) Apps
StreamNative Hub
StreamNative Cloud
Unified Batch and Stream COMPUTING
Batch
(Batch + Stream)
Unified Batch and Stream STORAGE
Offload
(Queuing + Streaming)
End-to-End Streaming Applications with Flink
With Pulsar and Flink, StreamNative offers both stream storage and stream compute for a
complete streaming solution.
Application
Application
Database
Database
DWH
Tiered Storage
Search
Index
Pulsar
---
KoP
---
AoP
---
Websocket
---
HTTP
StreamNative Hub
Pulsar
Source
Pulsar
Source
Pulsar Sink
Pulsar Sink
Apps
Streaming
Pub/Sub
Example: E-Commerce with Pulsar
● Unified storage with
access to underlying
data
● Native tiered storage
● Single system to
exchange data
● Teams share toolset
Edge AI to Cloud Streaming Pipeline
Device Data
Sensors
Energy Logs
Weather
Sensors
Aggregates
Energy
SQL
Analytics
MiNiFi
Agent
Deep Learning
Classification
Edge Private
Cloud
Multi-Public
Cloud
Demo
Demo Walkthrough
{"entriesAddedCounter":1,"numberOfEntries":1,"totalSize":651,"curre
ntLedgerEntries":1,"currentLedgerSize":651,"lastLedgerCreatedTim
estamp":"2021-09-13T16:13:06.6-04:00","waitingCursorsCount":0,"pe
ndingAddEntriesCount":0,"lastConfirmedEntry":"7076:0","state":"L
edgerOpened","ledgers":[{"ledgerId":7076,"entries":0,"size":0,"offloa
ded":false,"underReplicated":false}],"cursors":{},"schemaLedgers":[],
"compactedLedger":{"ledgerId":-1,"entries":-1,"size":-1,"offloaded":fal
se,"underReplicated":false}}
Real-Time Cloud Streaming Pipeline
TRANSCOM
Traffic
Sensors
Aggregates
Status
SQL
Analytics
APIs
REST
Wrap-Up
Connect with the Community & Stay Up-To-Date
● Join the Pulsar Slack channel - Apache-Pulsar.slack.com
● Follow @streamnativeio and @apache_pulsar on Twitter
● Subscribe to Monthly Pulsar Newsletter for major news, events,
project updates, and resources in the Pulsar community
● https://www.datainmotion.dev/2020/04/building-search-indexes-with-apache.html
● https://github.com/tspannhw/nifi-solr-example
● https://github.com/streamnative/pulsar-flink
● https://www.linkedin.com/pulse/2021-schedule-tim-spann/
● https://github.com/tspannhw/SpeakerProfile/blob/main/2021/talks/20210729_HailHydr
ate!FromStreamtoLake_TimSpann.pdf
● https://streamnative.io/en/blog/release/2021-04-20-flink-sql-on-streamnative-cloud
● https://docs.streamnative.io/cloud/stable/compute/flink-sql
Deeper Content
@PaasDev
https://www.pulsardeveloper.com/
timothyspann
Interested In Learning More?
Flink SQL Cookbook
The Github Source for Flink
SQL Demo
The GitHub Source for Demo
Manning's Apache Pulsar in
Action
O’Reilly Book
[10/21] Trino Summit
Resources Free eBooks Upcoming Events
71
Pulsar Summit Asia
November 20-21, 2021
Contact us at partners@pulsar-summit.org to become a sponsor or partner
Let’s Keep
in Touch!
Tim Spann
Developer Advocate
@PassDev
https://www.linkedin.com/in/timothyspann
https://github.com/tspannhw
Questions

Weitere ähnliche Inhalte

Was ist angesagt?

Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...Timothy Spann
 
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...Timothy Spann
 
ApacheCon 2021 Apache Deep Learning 302
ApacheCon 2021   Apache Deep Learning 302ApacheCon 2021   Apache Deep Learning 302
ApacheCon 2021 Apache Deep Learning 302Timothy Spann
 
Codeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flinkCodeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flinkTimothy Spann
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lakeTimothy Spann
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeTimothy Spann
 
Matt Franklin - Apache Software (Geekfest)
Matt Franklin - Apache Software (Geekfest)Matt Franklin - Apache Software (Geekfest)
Matt Franklin - Apache Software (Geekfest)W2O Group
 
ApacheCon 2021: Apache NiFi 101- introduction and best practices
ApacheCon 2021:   Apache NiFi 101- introduction and best practicesApacheCon 2021:   Apache NiFi 101- introduction and best practices
ApacheCon 2021: Apache NiFi 101- introduction and best practicesTimothy Spann
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsTimothy Spann
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Timothy Spann
 
ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300Timothy Spann
 
Cloud streaming presentation
Cloud streaming presentationCloud streaming presentation
Cloud streaming presentationedmandt
 
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)Timothy Spann
 
Axway amplify api management platform
Axway amplify api management platformAxway amplify api management platform
Axway amplify api management platformSmartWave
 
Streaming from the cloud
Streaming from the cloudStreaming from the cloud
Streaming from the clouddmulford
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaTimothy Spann
 
fluentd -- the missing log collector
fluentd -- the missing log collectorfluentd -- the missing log collector
fluentd -- the missing log collectorMuga Nishizawa
 
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Matt Stubbs
 

Was ist angesagt? (20)

Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...
 
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
 
ApacheCon 2021 Apache Deep Learning 302
ApacheCon 2021   Apache Deep Learning 302ApacheCon 2021   Apache Deep Learning 302
ApacheCon 2021 Apache Deep Learning 302
 
Codeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flinkCodeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flink
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lake
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
Matt Franklin - Apache Software (Geekfest)
Matt Franklin - Apache Software (Geekfest)Matt Franklin - Apache Software (Geekfest)
Matt Franklin - Apache Software (Geekfest)
 
Architecting for Scale
Architecting for ScaleArchitecting for Scale
Architecting for Scale
 
ApacheCon 2021: Apache NiFi 101- introduction and best practices
ApacheCon 2021:   Apache NiFi 101- introduction and best practicesApacheCon 2021:   Apache NiFi 101- introduction and best practices
ApacheCon 2021: Apache NiFi 101- introduction and best practices
 
FLiP Into Trino
FLiP Into TrinoFLiP Into Trino
FLiP Into Trino
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
 
ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300
 
Cloud streaming presentation
Cloud streaming presentationCloud streaming presentation
Cloud streaming presentation
 
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)
 
Axway amplify api management platform
Axway amplify api management platformAxway amplify api management platform
Axway amplify api management platform
 
Streaming from the cloud
Streaming from the cloudStreaming from the cloud
Streaming from the cloud
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
 
fluentd -- the missing log collector
fluentd -- the missing log collectorfluentd -- the missing log collector
fluentd -- the missing log collector
 
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
 

Ähnlich wie Cloud lunch and learn real-time streaming in azure

Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Timothy Spann
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computingTimothy Spann
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Timothy Spann
 
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...StreamNative
 
Serverless Event Streaming Applications as Functionson K8
Serverless Event Streaming Applications as Functionson K8Serverless Event Streaming Applications as Functionson K8
Serverless Event Streaming Applications as Functionson K8Timothy Spann
 
OSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsOSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsTimothy Spann
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache KafkaJoe Stein
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event StreamingTimothy Spann
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Timothy Spann
 
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021StreamNative
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...Timothy Spann
 
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...InfluxData
 
Using FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at ScaleUsing FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at ScaleTimothy Spann
 
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteAdvanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteStreamNative
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampTimothy Spann
 
Streaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaStreaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaAttunity
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarScyllaDB
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarTimothy Spann
 
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016Monal Daxini
 

Ähnlich wie Cloud lunch and learn real-time streaming in azure (20)

Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
 
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Sum...
 
Serverless Event Streaming Applications as Functionson K8
Serverless Event Streaming Applications as Functionson K8Serverless Event Streaming Applications as Functionson K8
Serverless Event Streaming Applications as Functionson K8
 
OSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsOSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming Apps
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache Kafka
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
 
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
 
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
 
Using FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at ScaleUsing FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at Scale
 
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteAdvanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
 
Streaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaStreaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache Kafka
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
 
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016
 

Mehr von Timothy Spann

April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...Timothy Spann
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-PipelinesTimothy Spann
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTimothy Spann
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-ProfitsTimothy Spann
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...Timothy Spann
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsTimothy Spann
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Timothy Spann
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI PipelinesTimothy Spann
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkTimothy Spann
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...Timothy Spann
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesTimothy Spann
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel AlertsTimothy Spann
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data PipelinesTimothy Spann
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoTimothy Spann
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101Timothy Spann
 
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC MeetupTimothy Spann
 

Mehr von Timothy Spann (20)

April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI Pipelines
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101
 
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
 

Kürzlich hochgeladen

eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdfPros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdfkalichargn70th171
 
Advantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptxAdvantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptxRTS corp
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...kalichargn70th171
 
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxUnderstanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxSasikiranMarri
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?Alexandre Beguel
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 

Kürzlich hochgeladen (20)

eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdfPros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
 
Advantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptxAdvantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptx
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
 
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxUnderstanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 

Cloud lunch and learn real-time streaming in azure

  • 1. Real-Time Streaming in Azure Timothy Spann | Developer Advocate
  • 2.
  • 3. Tim Spann Developer Advocate ● https://www.datainmotion.dev/ ● https://github.com/tspannhw/SpeakerProfile ● https://dev.to/tspannhw ● https://sessionize.com/tspann/ DZone Zone Leader and Big Data MVB Data DJay
  • 4. Founded by the original developers of Apache Pulsar and Apache BookKeeper, StreamNative builds a cloud-native event streaming platform that enables enterprises to easily access data as real-time event streams.
  • 6. Apache is an open source, cloud-native distributed messaging and streaming platform.
  • 7. What are the Benefits of Pulsar? Data Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model
  • 9. A Unified Messaging Platform Message Queuing Data Streaming
  • 10. Messaging Messaging systems are ideal for work queues that do not require tasks to be performed in a particular order—for example, sending one email message to many recipients. RabbitMQ and Amazon SQS are examples of popular queue-based message systems. Streaming Streaming works best in situations where the order of messages is important—for example, data ingestion. Kafka and Amazon Kinesis are examples of messaging systems that use streaming semantics for consuming messages. The Difference Between Messaging & Streaming
  • 11. Unified Messaging Model Streaming Messaging Producer 1 Producer 2 Pulsar Topic/Partition m0 m1 m2 m3 m4 Consumer D-1 Consumer D-2 Consumer D-3 Subscription D < k 2 , v 1 > < k 2 , v 3 > <k3,v2> < k 1 , v 0 > < k 1 , v 4 > Key-Shared Consumer C-1 Consumer C-2 Consumer C-3 Subscription C m1 m2 m3 m4 m0 Shared Failover Consumer B-1 Consumer B-0 Subscription B m1 m2 m3 m4 m0 In case of failure in Consumer B-0 Consumer A-1 Consumer A-0 Subscription A m1 m2 m3 m4 m0 Exclusive X
  • 12. Pulsar is Cloud Native ● Rebalance free architecture via separate compute and storage for easy scalability ● Built for Kubernetes, with production ready helm charts and Kubernetes operators ● Infinite streams by leveraging cost effective cloud-storage like S3, GCS, etc ● Supports a broad range of workloads and teams through unified messaging model
  • 13. Pulsar is built for easy scale-out. *Illustrations by Jack Vanlightly
  • 15. Key Milestones 2012 2016 2017 2018 2019 2020 Originally developed inside Yahoo! as “Cloud Messaging Service” Pulsar is committed to Open Source Pulsar is accepted into the Apache Software Foundation Pulsar becomes a Top-Level Project ● StreamNative is founded and seed round raised. ● Tencent adopts Pulsar for payment processing platform. ● BestPay adopts Pulsar for payment processing. ● Pulsar hits 200 contributors. ● 2 global Pulsar conferences, 80+ speakers, 1,500+ attendees ● Pulsar hits 340 contributors ● StreamNative and OVHCloud launch Kafka on Pulsar (KoP) ● StreamNative + China Mobile launch AMQP on Pulsar (AoP) ● Pulsar Ecosystem expands - StreamNative Hub launches ● StreamNative Cloud launches on GCP and Alibaba Cloud ● StreamNative customer adoption continues - new customers include Flipkart and Applied Materials ● Pulsar 2.7 + Transactions ● Pulsar Flink Connector 2.7 Major increase in adoption following TLP designation in 2018 2021 ● 3 global Pulsar conferences ● StreamNative hits 400 contributors (June). ● Pulsar surpasses Kafka in monthly active contributors. ● Pulsar 2.8 + Exactly-Once semantics ● StreamNative Platform launches
  • 16. Apache Pulsar Overview Enable Geo-Replicated Messaging ● Pub-Sub ● Geo-Replication ● Pulsar Functions ● Horizontal Scalability ● Multi-tenancy ● Tiered Persistent Storage ● Pulsar Connectors ● REST API ● CLI ● Many clients available ● Four Different Subscription Types ● Multi-Protocol Support ○ MQTT ○ AMQP ○ JMS ○ Kafka ○ ...
  • 17. What is the Pulsar Ecosystem? ● Functions and Connectors ○ Functions: Lightweight stream processing ○ Connectors: Part of “Pulsar IO”, includes “Source” and “Sink” APIs ■ Files, Databases, Data tools, Cloud Services, etc ● Protocol Handlers ○ Allows Pulsar to handle additional protocols by an extendable API running in the broker ■ AoP (AMQP), KoP (Kafka), MoP (MQTT)
  • 18. What is the Pulsar Ecosystem? (cont’d) ● Processing Engines ○ Supports modern processing engines ■ Flink and Spark, as well as Pulsar SQL (Presto/Trino) ● Offloaders ○ Allows data to be offloaded to cloud storage and used with existing Pulsar APIs ■ S3, GCP Cloud Storage, HDFS, File (NFS), Azure Blob Storage (in Pulsar 2.7.0)
  • 19. Pulsar Functions Provides a simple API to: ● Receive a message (consume) ● Process the message using your own code ● Send a message (produce) Takes care of the boilerplate code so there is no need to create producers and consumers.
  • 20. Moving Data In and Out of Pulsar IO/Connectors are a simple way to integrate with external systems and move data in and out of Pulsar. ● Built on top of Pulsar Functions ● Built-in connectors - hub.streamnative.io Source Sink
  • 21. Use Azure BlobStore offloader with Pulsar https://pulsar.apache.org/docs/en/tiered-storage-azure/
  • 22. Pulsar SQL Presto/Trino workers can read segments directly from bookies (or offloaded storage) in parallel. Bookie 1 Segment 1 Producer Consumer Broker 1 Topic1-Part1 Broker 2 Topic1-Part2 Broker 3 Topic1-Part3 Segment 2 Segment 3 Segment 4 Segment X Segment 1 Segment 1 Segment 1 Segment 3 Segment 3 Segment 3 Segment 2 Segment 2 Segment 2 Segment 4 Segment 4 Segment 4 Segment X Segment X Segment X Bookie 2 Bookie 3 Query Coordinator ... ... SQL Worker SQL Worker SQL Worker SQL Worker Query Topic Metadata
  • 23. Query Your Topics with Pulsar SQL (Trino)
  • 25. ● Unified computing engine ● Batch processing is a special case of stream processing ● Stateful processing ● Massive Scalability ● Flink SQL for queries, inserts against Pulsar Topics ● Streaming Analytics ● Continuous SQL ● Continuous ETL ● Complex Event Processing ● Standard SQL Powered by Apache Calcite Why Apache Flink?
  • 26. Apache Flink ● Apache Flink is a distributed stream processing system. ● It is capable of providing high throughput, near real-time processing of streams from Pulsar. ● It is ideal for ambitious Stream Processing compared to Pulsar’s model of lightweight Stream Processing.
  • 28. ● Stream: Sequence of data which is made available over time ● All computation processes chunks of data over time producing results over time → Stream processing ○ E.g. reading data from disks is done in streaming fashion ● Events can be of various forms ● Decisive difference: Is my stream bounded or not? Everything Is a Stream
  • 29. How Flink Works Flink Provides an API for handling events and intermediate state and periodic snapshots to ensure consistency
  • 30. How Flink Works Flink apps are composed of a graph of tasks which is executed in a distributed runtime
  • 31. How Flink Works Flink provides multiple APIs at different layers of abstraction
  • 32. SQL / Table API: Running The Same Query On Streams SQL Query Incremental query execution SELECT room, TUMBLE_END(rowtime, INTERVAL ‘1’ HOUR), AVG(temperature) FROM sensors GROUP BY TUMBLE(rowtime, INTERVAL ‘1’ HOUR), room Interpret stream as table
  • 34. Powered by Apache Pulsar, StreamNative provides a cloud-native, real-time messaging and streaming platform to support multi-cloud and hybrid cloud strategies. Built for Containers Cloud Native StreamNative Cloud Flink SQL
  • 35. Powered by Apache Pulsar, StreamNative provides a cloud-native, unified messaging and streaming platform to support multi-cloud and hybrid cloud strategies. Built for Containers Cloud Native StreamNative
  • 36. APP Layer The StreamNative Offering Application Messaging Data Pipelines Real-time Contextual Analytics Micro Service Notification Dashboard Risk Control Auditing Payment ETL
  • 37. StreamNative Solution Application Messaging Data Pipelines Real-time Contextual Analytics Tiered Storage APP Layer Computing Layer Storage Layer StreamNative Platform IaaS Layer Micro Service Notification Dashboard Risk Control Auditing Payment ETL
  • 38.
  • 39.
  • 42. Why Apache NiFi? • Guaranteed delivery • Data buffering - Backpressure - Pressure release • Prioritized queuing • Flow specific QoS - Latency vs. throughput - Loss tolerance • Data provenance • Supports push and pull models • Hundreds of processors • Visual command and control • Over a sixty sources • Flow templates • Pluggable/multi-role security • Designed for extension • Clustering • Version Control
  • 44. Record Processors https://www.datainmotion.dev/2019/03/advanced-xml-processing-with-apache.html ● XML, CSV, JSON, AVRO and more ● Schemas or Inferred Schemas ● Easily convert between them ● Support SQL with Apache Calcite
  • 48. Flow File https://nifi.apache.org/docs/nifi-docs/html/overview.html Flow Files are content and key/value pairs for attributes that are each event/message/file that has been introduced into NiFi.
  • 49. Version Control (Github and Beyond)
  • 51. No More Spaghetti Flows - DO NOT https://dev.to/tspannhw/no-more-spaghetti-flows-2emd Do Not ● Do not Put 1,000 Flows on one workspace. ● If your flow has hundreds of steps, this is a Flow Smell. Investigate why. ● Do not Use ExecuteProcess, ExecuteScripts or a lot of Groovy scripts as a default, look for existing processors ● Do not Use Random Custom Processors you find that have no documentation or are unknown. ● Do not forget to upgrade, if you are running anything before Apache NiFi 1.14, upgrade now! ● Do not run on default 512M RAM. ● Do not run one node and think you have a highly available cluster. ● Do not split a file with millions of records to individual records in one shot without checking available space/memory and back pressure. ● Use Split processors only as an absolute last resort. Many processors are designed to work on FlowFiles that contain many records or many lines of text. Keeping the FlowFiles together instead of splitting them apart can often yield performance that is improved by 1-2 orders of magnitude.
  • 52. No More Spaghetti Flows - DO https://dev.to/tspannhw/no-more-spaghetti-flows-2emd Do ● Reduce, Reuse, Recycle. Use Parameters to reuse common modules. ● Put flows, reusable chunks (write to Slack, Database, Kafka) into separate Process Groups. ● Write custom processors if you need new or specialized features ● Use Record Processors everywhere ● Read the Docs! ● Use the NiFi Registry for version control. ● Use NiFi CLI and DevOps for Migrations. ● Walk through your flow and make sure you understand every step and it’s easy to read and follow. Is every processor used? Are there dead ends? ● Do run Zookeeper on different nodes from Apache NiFi. ● Use routing based on content and attributes to allow one flow to handle multiple nearly identical flows is better than deploying the same flow many times with tweaks to parameters in same cluster. ● Use the correct driver for your database. There's usually a couple different JDBC drivers.
  • 53.
  • 54. Apache MXNet Native Processor through DJL.AI for Apache NiFi This processor uses the DJL.AI Java Interface https://github.com/tspannhw/nifi-djl-processor https://dev.to/tspannhw/easy-deep-learning-in-apache-nifi-with-djl-2d79
  • 56. USE CASE IoT Ingestion: High-volume streaming sources, multiple message formats, diverse protocols and multi-vendor devices creates data ingestion challenges.
  • 57. I Can Haz Data? REST and Websocket JSON “stonks” {"symbol":"MSFT", "uuid":"10640832-f139-4b82-8780-e3ad37b3d0 ce", "ts":1618529574078, "dt":1612098900000, "datetime":"2021/01/31 08:15:00", "open":"122.24500", "close":"122.25500", "high":"122.25500", "volume":"12353", "low":"12.24500"}
  • 59. All Data - Anytime - Anywhere - Multi-Cloud - Multi-Protocol Multi- inges t Multi- inges t Multi-ingest Merge Priority
  • 60. StreamNative Hub StreamNative Cloud Unified Batch and Stream COMPUTING Batch (Batch + Stream) Unified Batch and Stream STORAGE Offload (Queuing + Streaming) Apache Flink - Apache Pulsar - Apache NiFi <-> Events <-> Azure Data Stores Tiered Storage Pulsar --- KoP --- MoP --- Websocket --- HTTP Pulsar Sink Pulsar Sink Streaming Edge Gateway Protocols End-to-End Streaming FLiP(N) Apps
  • 61. StreamNative Hub StreamNative Cloud Unified Batch and Stream COMPUTING Batch (Batch + Stream) Unified Batch and Stream STORAGE Offload (Queuing + Streaming) End-to-End Streaming Applications with Flink With Pulsar and Flink, StreamNative offers both stream storage and stream compute for a complete streaming solution. Application Application Database Database DWH Tiered Storage Search Index Pulsar --- KoP --- AoP --- Websocket --- HTTP StreamNative Hub Pulsar Source Pulsar Source Pulsar Sink Pulsar Sink Apps Streaming Pub/Sub
  • 62. Example: E-Commerce with Pulsar ● Unified storage with access to underlying data ● Native tiered storage ● Single system to exchange data ● Teams share toolset
  • 63. Edge AI to Cloud Streaming Pipeline Device Data Sensors Energy Logs Weather Sensors Aggregates Energy SQL Analytics MiNiFi Agent Deep Learning Classification Edge Private Cloud Multi-Public Cloud
  • 64. Demo
  • 66. Real-Time Cloud Streaming Pipeline TRANSCOM Traffic Sensors Aggregates Status SQL Analytics APIs REST
  • 68. Connect with the Community & Stay Up-To-Date ● Join the Pulsar Slack channel - Apache-Pulsar.slack.com ● Follow @streamnativeio and @apache_pulsar on Twitter ● Subscribe to Monthly Pulsar Newsletter for major news, events, project updates, and resources in the Pulsar community
  • 69. ● https://www.datainmotion.dev/2020/04/building-search-indexes-with-apache.html ● https://github.com/tspannhw/nifi-solr-example ● https://github.com/streamnative/pulsar-flink ● https://www.linkedin.com/pulse/2021-schedule-tim-spann/ ● https://github.com/tspannhw/SpeakerProfile/blob/main/2021/talks/20210729_HailHydr ate!FromStreamtoLake_TimSpann.pdf ● https://streamnative.io/en/blog/release/2021-04-20-flink-sql-on-streamnative-cloud ● https://docs.streamnative.io/cloud/stable/compute/flink-sql Deeper Content @PaasDev https://www.pulsardeveloper.com/ timothyspann
  • 70. Interested In Learning More? Flink SQL Cookbook The Github Source for Flink SQL Demo The GitHub Source for Demo Manning's Apache Pulsar in Action O’Reilly Book [10/21] Trino Summit Resources Free eBooks Upcoming Events
  • 71. 71 Pulsar Summit Asia November 20-21, 2021 Contact us at partners@pulsar-summit.org to become a sponsor or partner
  • 72.
  • 73.
  • 74. Let’s Keep in Touch! Tim Spann Developer Advocate @PassDev https://www.linkedin.com/in/timothyspann https://github.com/tspannhw