Pulsar Virtual Summit Europe 2021
Interactive Analytics on
Pulsar with Pulsar SQL
Axel Sirota
AI and Cloud Consultant
@AxelSirota
Who am I?
QR to my Pluralsight courses
QR to my O’Reilly trainings
–Microsoft Certified Trainer
–Author, Instructor and Editor at Pluralsight, O’Reilly Media, and Develop Intelligence
–AI and Cloud Consultant
Catalogue
• A Simple Scenario
• Inspecting and Debugging Topics with Pulsar SQL
• Interactive Analytics
Catalogue
• A Simple Scenario
[Diagram: Application Instance (sample record "Anna,28,$50") → File Source → Ingress topic → Pulsar Function → Processed topic, inside the Pulsar Deployment]
Some issues appear…
1. You check the status of the Pulsar Function and there are some exceptions
2. And you haven’t set a log topic for each Pulsar Function (at least it happened to us)
3. You don’t want downtime to debug locally
What can you do?
Catalogue
• Inspecting and Debugging Topics with Pulsar SQL
Introducing… Pulsar SQL
Pulsar SQL enhances the Pulsar Presto connector to query topics interactively
You can check every message that passed through the topic easily and safely
It is lightweight, simple, enables highly concurrent access, and you can reuse existing Presto clusters
[Diagram: Pulsar Broker and BookKeeper (Bookie 1, Bookie 2, Bookie 3), queried by Presto through the Presto Connector]
Configuration file
Specify where the ZooKeeper nodes and brokers are
connector.name=pulsar
pulsar.broker-service-url=https://my-pulsar-deployment.com
pulsar.zookeeper-uri=my-pulsar-deployment.com:2181
Put in conf/presto/catalog/pulsar.properties
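A catalog file like this can be sanity-checked before the worker is started. The following is only a sketch: `parse_properties`, `check_catalog`, `REQUIRED_KEYS`, and `SAMPLE` are hypothetical names introduced for illustration, the key list simply mirrors the three properties on the slide (other Pulsar releases may name the broker property differently), and the only rule it enforces is that a ZooKeeper connect string is a plain host:port pair rather than a URL.

```python
# Keys taken from the slide; verify them against your Pulsar version (assumption).
REQUIRED_KEYS = (
    "connector.name",
    "pulsar.broker-service-url",
    "pulsar.zookeeper-uri",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_catalog(props):
    """Return a list of readable problems; an empty list means no issue found."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not props.get(key)]
    zk = props.get("pulsar.zookeeper-uri", "")
    # ZooKeeper connect strings are host:port pairs, not URLs with a scheme.
    if "://" in zk:
        problems.append("pulsar.zookeeper-uri should be host:port, without a scheme")
    return problems


# Hypothetical sample content, using the host name from the slide.
SAMPLE = """
connector.name=pulsar
pulsar.broker-service-url=https://my-pulsar-deployment.com
pulsar.zookeeper-uri=my-pulsar-deployment.com:2181
"""

if __name__ == "__main__":
    issues = check_catalog(parse_properties(SAMPLE))
    print(issues or "catalog file looks consistent")
```

In practice the text would come from reading conf/presto/catalog/pulsar.properties instead of the inline sample.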
Two commands and magic
Start the worker inside the Presto cluster
->./bin/pulsar sql-worker start
Running in 6896
Start the console
->./bin/pulsar sql
presto>
So simple, yet so powerful!
The Full Architecture
But… why should we use it?
What can you do?
1. Validate schemas in a readable SQL format
2. Easily debug bad messages that make Pulsar Functions fail unexpectedly
3. Leverage SQL tools and queries for analytics
Catalogue
• Interactive Analytics
Equivalence
Pulsar               | Presto
---------------------+-----------
Namespaces           | Schemas
Topics               | Tables
Fields               | Columns
Unserialized message | __value__
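The mapping can be turned into a small helper that builds the fully qualified table name for a topic. This is a sketch, not part of Pulsar: `presto_table` is a hypothetical function, and it assumes the catalog is called `pulsar` and that the Presto schema is written as `tenant/namespace`, as in the `pulsar."public/default"."voo"` query in this deck.

```python
def presto_table(tenant, namespace, topic, catalog="pulsar"):
    """Build catalog."tenant/namespace"."topic" for use in a Pulsar SQL query."""

    def quote(identifier):
        # Quoted identifiers escape an embedded double quote by doubling it.
        return '"' + identifier.replace('"', '""') + '"'

    return f"{catalog}.{quote(tenant + '/' + namespace)}.{quote(topic)}"


if __name__ == "__main__":
    print(presto_table("public", "default", "voo"))
    # prints: pulsar."public/default"."voo"
```

Quoting is needed because the schema name contains a slash, which is not valid in an unquoted SQL identifier.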
presto> show columns from pulsar."public/default"."voo";
Column | Type | Extra | Comment
-------------------+-----------+-------+-----------------------------------------------------------------------------
__value__ | varchar | | The value of the message with primitive type schema
__partition__ | integer | | The partition number which the message belongs to
__event_time__ | timestamp | | Application defined timestamp in milliseconds of when the event occurred
__publish_time__ | timestamp | | The timestamp in milliseconds of when event as published
__message_id__ | varchar | | The message ID of the message used to generate this row
__sequence_id__ | bigint | | The sequence ID of the message used to generate this row
__producer_name__ | varchar | | The name of the producer that publish the message used to generate this row
__key__ | varchar | | The partition key for the topic
__properties__ | varchar | | User defined properties
(9 rows)
metrics topic without Schema in public/pulsar-summit
Messages: 2021-09-13,12 | 2021-09-14,9 | 2021-09-15,15
SELECT * FROM pulsar."public/pulsar-summit".metrics
__value__
2021-09-13,12
2021-09-14,9
2021-09-15,15
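Without a schema, every row is a single opaque `__value__` string, so any structure has to be recovered on the client side. A minimal sketch of that extra step, assuming the comma-separated "date,metric" layout of the three sample messages (`parse_metric` and `RAW_VALUES` are hypothetical names):

```python
from datetime import date

# The three __value__ strings from the example result.
RAW_VALUES = ["2021-09-13,12", "2021-09-14,9", "2021-09-15,15"]


def parse_metric(raw):
    """Split one raw 'YYYY-MM-DD,metric' value into a (date, int) pair."""
    day, metric = raw.split(",", 1)
    return date.fromisoformat(day.strip()), int(metric.strip())


rows = [parse_metric(value) for value in RAW_VALUES]

if __name__ == "__main__":
    for day, metric in rows:
        print(day.isoformat(), metric)
```

With a schema registered on the topic, this parsing is unnecessary because the fields arrive as typed columns.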
metrics topic with Schema in public/pulsar-summit (Date, Metric)
Messages: 2021-09-13,12 | 2021-09-14,9 | 2021-09-15,15
SELECT * FROM pulsar."public/pulsar-summit".metrics
Date       | Metric
-----------+-------
2021-09-13 | 12
2021-09-14 | 9
2021-09-15 | 15
metrics topic with Schema in public/pulsar-summit (Date, Metric)
Messages: 2021-09-13,12 | 2021-09-14,9 | 2021-09-15,15 | 2021-10-15,120
SELECT count(1) FROM pulsar."public/pulsar-summit".metrics WHERE Metric > 10
Count
3
metrics topic with Schema in public/pulsar-summit (Date, Metric)
Messages: 2021-09-13,12 | 2021-09-14,9 | 2021-09-15,15 | 2021-10-15,120
SELECT month(Date) AS month, SUM(Metric) AS agg_metric
FROM pulsar."public/pulsar-summit".metrics
GROUP BY 1
ORDER BY 2 DESC
Month | agg_metric
------+-----------
10    | 120
9     | 36
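The count and the monthly totals can be checked offline against the four sample messages. The sketch below swaps Presto for an in-memory SQLite database, purely to verify the arithmetic: SQLite has no `month()` function, so `strftime('%m', ...)` stands in for it, and the table is a local stand-in rather than a Pulsar topic.

```python
import sqlite3

# The four sample messages shown on the slides (date, metric).
MESSAGES = [
    ("2021-09-13", 12),
    ("2021-09-14", 9),
    ("2021-09-15", 15),
    ("2021-10-15", 120),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (Date TEXT, Metric INTEGER)")
conn.executemany("INSERT INTO metrics VALUES (?, ?)", MESSAGES)

# Equivalent of: SELECT count(1) ... WHERE Metric > 10
count_over_10 = conn.execute(
    "SELECT count(1) FROM metrics WHERE Metric > 10"
).fetchone()[0]

# Equivalent of the monthly aggregation, largest total first.
monthly = conn.execute(
    """
    SELECT CAST(strftime('%m', Date) AS INTEGER) AS month,
           SUM(Metric) AS agg_metric
    FROM metrics
    GROUP BY month
    ORDER BY agg_metric DESC
    """
).fetchall()

if __name__ == "__main__":
    print(count_over_10)  # 3
    print(monthly)        # [(10, 120), (9, 36)]
```

September sums to 12 + 9 + 15 = 36 and October holds the single value 120, which matches the result table.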
If you need to…
1. Interactively debug topics without opening subscriptions
2. Audit who sent each message, when, where, what it sent, and how long it took
3. Do analytics on the messages flowing through Pulsar
Then Pulsar SQL is what you are looking for!
And all of this without affecting production performance
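For the auditing case, the metadata columns that Pulsar SQL exposes on every topic (`__producer_name__`, `__publish_time__`, `__event_time__`, `__message_id__`, `__key__`) are the natural starting point. The helper below only assembles a query string to paste into the `presto>` console; `audit_query` and its parameters are hypothetical, and the catalog name `pulsar` is an assumption carried over from the earlier examples.

```python
# Metadata columns listed by SHOW COLUMNS earlier in this deck.
AUDIT_COLUMNS = (
    "__producer_name__",
    "__publish_time__",
    "__event_time__",
    "__message_id__",
    "__key__",
)


def audit_query(schema, table, limit=20, extra_columns=(), catalog="pulsar"):
    """Return SQL that lists who published each message and when, newest first."""
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    columns = ", ".join(AUDIT_COLUMNS + tuple(extra_columns))
    return (
        f"SELECT {columns} "
        f'FROM {catalog}."{schema}"."{table}" '
        f"ORDER BY __publish_time__ DESC LIMIT {limit}"
    )


if __name__ == "__main__":
    # For a topic without a schema, add __value__ to see the payload as well.
    print(audit_query("public/default", "voo", limit=5, extra_columns=("__value__",)))
```

Running a query like this does not create a subscription on the topic, which is what makes it suitable for inspecting production traffic.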
Thanks!!
Questions?
Axel Sirota
AI and Cloud Consultant
@AxelSirota
