Build Full Stack Monitoring and Notification with Prometheus
Jazz Yao-Tsung Wang
Initiator of Taiwan Data Engineering Association
Co-Founder of Taiwan Hadoop User Group
Shared at 2018-02-10 <TDEA Workshop 2018 Q1>
Hello!
I am Jazz Wang
Co-Founder of Hadoop.TW
Initiator of Taiwan Data Engineering Association (TDEA)
Hadoop Evangelist since 2008.
Open Source Promoter. System Admin (Ops).
- 11 years (2002/08 ~ 2014/02) Researcher in HPC field.
- 2 years (2014/03 ~ 2016/04) Assistant Vice President (AVP),
Product Management of ‘Big Data Platform Management Product’
- 1.8 years (2016/04 ~ Now) Data Architect of Real-Time Bidding
You can find me at @jazzwang_tw or
https://fb.com/groups/dataengineering.tw
https://slideshare.net/jazzwang
1. Why do I need Full Stack Monitoring and Notification?
Let’s start with Jazz’s Jobs / Pains / Gains
[Figure: VM, AWS, Azure, GCP, Hybrid …]
[Figure: roles: NetAdmin, Research, Developer, Security, Cloud Ops, SysAdmin, Data Engineer]
[Figure: tools: Cacti, NewRelic Server, OpsCenter, Kafka Manager, NewRelic Synthetic / APM, Status Cake, DataDog]
Pain
▷ Data Fragments
▷ Data Retention
▷ 7
▷ Black Box
▷ (Metrics)
▷ Metrics
▷ Vendor Lock-in
Gain
▷ Centralized Time-series Database
▷ Support Alert Notification
▷ Slack, E-mail, SMS …
▷ Self-defined Data Retention Rate
▷ White Box
▷ Metrics = (Metrics)
▷ Self-defined Dashboard
▷ Ex. Data Pipeline
Inspired by Outlyer: https://www.outlyer.com/
2. Introduction to Prometheus Ecosystem
Features / Pain Relievers / Gain Creators
Concepts
Common Building Blocks
[Diagram: Target, Exporter, Collector, Time-Series Database, Rule, Dashboard, Alert Message, Annotation; metrics move by Push or Pull]
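The building blocks can be sketched as a toy pull-based pipeline. This is only an illustration in plain Python, not how Prometheus or any of the compared systems is implemented; the names `Target`, `TimeSeriesStore`, `scrape`, `evaluate_rule` and the metric `queue_depth` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout: this only illustrates the roles of
# target, exporter, collector, time-series store, rule and alert message.

@dataclass
class Target:
    """A monitored system whose exporter returns current metric values."""
    name: str
    exporter: Callable[[], Dict[str, float]]

@dataclass
class TimeSeriesStore:
    """Keeps (timestamp, value) samples per (target, metric) series."""
    series: Dict[Tuple[str, str], List[Tuple[int, float]]] = field(default_factory=dict)

    def append(self, target: str, metric: str, ts: int, value: float) -> None:
        self.series.setdefault((target, metric), []).append((ts, value))

    def latest(self, target: str, metric: str) -> float:
        return self.series[(target, metric)][-1][1]

def scrape(targets: List[Target], store: TimeSeriesStore, ts: int) -> None:
    """Pull model: the collector asks every target's exporter for metrics."""
    for target in targets:
        for metric, value in target.exporter().items():
            store.append(target.name, metric, ts, value)

def evaluate_rule(store: TimeSeriesStore, metric: str, threshold: float) -> List[str]:
    """Rule: one alert message per series whose latest value exceeds the threshold."""
    alerts = []
    for (target, name), samples in sorted(store.series.items()):
        if name == metric and samples[-1][1] > threshold:
            alerts.append(f"ALERT {name} on {target} = {samples[-1][1]} > {threshold}")
    return alerts

store = TimeSeriesStore()
targets = [
    Target("kafka-1", lambda: {"queue_depth": 12.0}),
    Target("kafka-2", lambda: {"queue_depth": 250.0}),
]
scrape(targets, store, ts=1)
alerts = evaluate_rule(store, "queue_depth", threshold=100.0)
print(alerts)
```

In a push model the targets would call `store.append` themselves instead of waiting to be scraped; the store, rule and alert steps stay the same.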
Ranking of Time Series DBMS
https://db-engines.com/en/ranking/time+series+dbms
Comparison of Common Monitoring and Notification Systems
Target / Exporter | Transport | Collector | DBMS | Dashboard | Alert
snmpd | Pull | Cacti - Device (snmpwalk) | RRDTool | Cacti - Graph | Plugin*
gmond | Pull | Ganglia gmetad | RRDTool | Ganglia | Nagios
newrelic-agent | Push (?) | NewRelic | ?? | NewRelic | NewRelic Alert
statsD | Push | Carbon / whisper | Graphite | Grafana | Grafana
Telegraf | Push | Telegraf | InfluxDB | Grafana | Grafana
snmp_exporter, node_exporter, jmx_exporter … | Pull, Push* | Prometheus | Prometheus | Grafana | AlertManager
About Prometheus
▷ https://prometheus.io/
▷ Started at SoundCloud in 2012/11
▷ Written in Go, Apache 2.0 license
▷ Joined the Cloud Native Computing Foundation in 2016, following Kubernetes (K8S)
▷ v1.0.0 / 2016-07-18, v2.0.0 / 2017-11-08
▷ PromQL
▷ Grafana
▷ AlertManager
▷ v2.0
Components of Prometheus
[Diagram: Prometheus components; arrow labels: Push, Pull, Query]
Comparison of Time-Series DBMS
[Table: surviving labels: Prometheus HA, Prometheus Data Model]
Client Libraries
▷ Official Prometheus client library
○ Go
○ Java or Scala
○ Python
○ Ruby
▷ Unofficial 3rd-party client library
○ Bash
○ C++
○ Common Lisp
○ Elixir
○ Erlang
○ Haskell
○ Lua for Nginx
○ Lua for Tarantool
○ .NET / C#
○ Node.js
○ PHP
○ Rust
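Whichever client library is used, the application ends up exposing its metrics as plain text for Prometheus to scrape. The sketch below renders that text format with the standard library only, to show its shape; it is not the official Python client, and `render_metric` and the metric name `pipeline_records_total` are hypothetical. Escaping of special characters in label values is left out.

```python
from typing import Dict, List, Tuple

def render_metric(name: str, help_text: str, metric_type: str,
                  samples: List[Tuple[Dict[str, str], float]]) -> str:
    """Render one metric family in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            # Labels become key="value" pairs, sorted for stable output.
            # Assumption: label values contain no quotes, backslashes or newlines.
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

text = render_metric(
    "pipeline_records_total",
    "Records processed by the pipeline.",
    "counter",
    [({"stage": "ingest"}, 42.0), ({"stage": "sink"}, 40.0)],
)
print(text)
```

In practice one of the client libraries listed above would handle registration, escaping and the HTTP endpoint; the point here is only what a scrape returns.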
3. Docker Compose Full Stack
Show me the source code!!
○ https://github.com/jazzwang/prometheus-labs
○ Docker Compose
Data Pipeline
Fluentd (in_dummy → out_kafka) → Kafka → Fluentd (in_kafka_group → out_file)
Network Layer
▷ snmp_exporter
○ https://github.com/prometheus/snmp_exporter
○ Turns SNMP data into Metrics
○ Relies on MIB / OID
○ The snmp_exporter generator produces snmp.yml
▷ blackbox_exporter
○ https://github.com/prometheus/blackbox_exporter
○ Probes over HTTP, HTTPS, DNS, TCP and ICMP
○ Web Service, SSH, DNS and Ping checks can all go through blackbox_exporter
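A black-box probe only looks at a service from the outside: can it be reached, and how long did that take. The sketch below does a TCP connect check with the standard library to illustrate the idea; it is not blackbox_exporter's code. The function name `tcp_probe` is hypothetical, and the result keys are modeled on the `probe_success` and `probe_duration_seconds` metrics.

```python
import socket
import time
from typing import Dict

def tcp_probe(host: str, port: int, timeout: float = 2.0) -> Dict[str, float]:
    """Try a TCP connect and report success (1.0 / 0.0) plus elapsed seconds."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            success = 1.0
    except OSError:
        # Refused, unreachable and timed-out connections all count as failures.
        success = 0.0
    return {
        "probe_success": success,
        "probe_duration_seconds": time.monotonic() - start,
    }

# Demo against a throwaway local listener so the sketch needs no network.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]

up = tcp_probe("127.0.0.1", open_port)
listener.close()
down = tcp_probe("127.0.0.1", open_port, timeout=0.5)  # listener is gone now
print(up["probe_success"], down["probe_success"])
```

The same pattern extends to SSH or a web service by pointing the probe at port 22 or 80; HTTP, DNS and ICMP checks need protocol-specific logic that this sketch does not attempt.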
System Layer
▷ node_exporter
○ https://github.com/prometheus/node_exporter
○ OS Level Metrics
Middleware Layer
▷ jmx_exporter
○ https://github.com/prometheus/jmx_exporter
○ Java; a YAML config turns JMX data into Prometheus Metrics
○ Covers, for example:
■ Apache Kafka
■ Apache Cassandra
■ Apache Flink
■ Apache Spark
■ Apache Tomcat
■ Apache ZooKeeper
■ Apache ActiveMQ Artemis 2.x
■ WebLogic
■ WildFly 10
Kafka
▷ `jmx_exporter` for Kafka and Cassandra
○ Docker - https://github.com/RobustPerception/docker_examples
▷ kafka_topic_exporter
○ Java, Jetty
○ https://github.com/ogibayashi/kafka-topic-exporter
▷ kafka_zookeeper_exporter
○ ZK, topic_partition
○ https://github.com/cloudflare/kafka_zookeeper_exporter
▷ prometheus-kafka-consumer-group-exporter
○ Python; Metrics: consumer_group_offset and topic_highwater, for working out Lag
○ https://github.com/braedon/prometheus-kafka-consumer-group-exporter
▷ burrow_exporter
○ Exports Lag from Burrow, LinkedIn's Kafka Lag checker (Go, sliding window)
○ https://github.com/jirwin/burrow_exporter
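Consumer lag follows from the two kinds of values named above: per partition, it is the topic's high-water mark minus the offset the consumer group has committed. A minimal sketch of that arithmetic follows; the function `consumer_lag` and the sample numbers are hypothetical, and no exporter or Kafka client is involved.

```python
from typing import Dict

def consumer_lag(highwater: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: topic high-water mark minus the group's committed offset."""
    lag = {}
    for partition, high in highwater.items():
        # Assumption: a partition with no committed offset counts as fully unread.
        offset = committed.get(partition, 0)
        # Clamp at zero in case the two values were sampled at different moments.
        lag[partition] = max(high - offset, 0)
    return lag

# Hypothetical sample values, keyed by partition number.
topic_highwater = {0: 1500, 1: 980, 2: 400}
consumer_group_offset = {0: 1500, 1: 900}

lag = consumer_lag(topic_highwater, consumer_group_offset)
total_lag = sum(lag.values())
print(lag, total_lag)
```

With metrics like these already in Prometheus, the same subtraction would normally be written as a query rather than in application code.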
Kafka
▷ kafka-consumer-group-exporter
○ Go, kafka-consumer-groups.sh
○ https://github.com/kawamuray/prometheus-kafka-consumer-group-exporter
▷ kafka-prometheus-exporter
○ Go, consumergoup_lag metrics
○ Kafka 0.8 (ZK)
○ https://github.com/ogibayashi/kafka-topic-exporter
▷ kafka_exporter
○ Go, Metrics
○ Kafka 0.9 (KF)
○ https://github.com/danielqsj/kafka_exporter
Fluentd
▷ fluent-agent-lite_exporter
○ Exporter for tagomoris's fluent-agent-lite [1]
○ https://github.com/matsumana/fluent-agent-lite_exporter
○ [1] https://github.com/tagomoris/fluent-agent-lite
▷ fluent-plugin-prometheus
○ fluentd → monitor_agent → fluent-plugin-prometheus
○ http://prometheus:9090/metrics → `fluent-plugin-prometheus` → fluentd
○ https://github.com/fluent/fluent-plugin-prometheus
▷ fluentd_exporter
○ Release,
○ https://github.com/wyukawa/fluentd_exporter
▷ fluentd_exporter
○ http://fluentd:9224/metrics → `fluentd_exporter` (by V3ckt0r) → prometheus
○ https://github.com/wyukawa/fluentd_exporter
Application Layer
▷ https://prometheus.io/docs/instrumenting/clientlibs/
▷ http://metrics.dropwizard.io/4.0.0/
4. Lesson Learned
▷ Lesson #1: Prometheus
▷ Lesson #2: Metrics exporter
○ exporter: https://prometheus.io/docs/instrumenting/exporters/
○ Port: https://github.com/prometheus/prometheus/wiki/Default-port-allocations
○ exporter Metrics
Lesson Learned
▷
○ github
○ exporter Metrics
○ http://prometheus:9090/graph
○ Grafana Dashboard
○ Grafana Alert
Thanks!
Any questions?
You can find me at @jazzwang_tw or
https://fb.com/groups/dataengineering.tw
https://slideshare.net/jazzwang
https://github.com/jazzwang
Github *^__^*

Full Stack Monitoring with Prometheus and Grafana

  • 1. Build Full Stack Monitoring and Notification 
 with Prometheus 1 Jazz Yao-Tsung Wang Initiator of Taiwan Data Engineering Association Co-Founder of Taiwan Hadoop User Group Shared at 2018-02-10 <TDEA Workshop 2018 Q1>
  • 2. Hello! I am Jazz Wang Co-Founder of Hadoop.TW Initiator of Taiwan Data Engineering Association (TDEA) Hadoop Evangelist since 2008. Open Source Promoter. System Admin (Ops). - 11 years (2002/08 ~ 2014/02) Researcher in HPC field. - 2 years (2014/03 ~ 2016/04) Assistant Vice President (AVP), Product Management of ‘Big Data Platform Management Product’ - 1.8 years (2016/04 ~ Now) Data Architect of Real-Time Bidding You can find me at @jazzwang_tw or
 https://fb.com/groups/dataengineering.tw 
 https://slideshare.net/jazzwang 2
  • 3. 1. / / Why do I need Full Stack Monitoring and Notification ? Let’s start with Jazz’s Jobs / Pains / Gains 3
  • 7. Pain ▷ Data Fragments ▷ ▷ ▷ Data Retention ▷ 7 ▷ Black Box ▷ (Metrics) ▷ Metrics ▷ Vendor Lock-in ▷ 7
  • 8. Gain — ▷ Centralized Time-serious Database ▷ ▷ Support Alert Notification ▷ Slack, E-mail, SMS … ▷ Self-defined Data Retention Rate ▷ ▷ White Box ▷ Metrics = (Metrics) ▷ Self-defined Dashboard ▷ Ex. Data Pipeline 8
  • 9. ( ) …. Inspired by Outlier … https://www.outlyer.com/ ~~ ~~ 9
  • 10. 2. / / Introduction to Prometheus Ecosystem Features / Pain Relievers / Gain Creators 10
  • 12. Common Building Blocks 12 Target Collector Exporter Time-Series Database Rule Dashboard Alert Message Collector Exporter Exporter Dashboard Dashboard TargetTarget Rule Rule Alert Message Annotation Push Pull
  • 13. Ranking of Time Series DBMS 13https://db-engines.com/en/ranking/time+series+dbms
  • 14. Comparison of Common Monitoring and Notification Systems (columns: Target / Exporter, DBMS, Dashboard, Alert) ▷ snmpd (Pull): Cacti Device (snmpwalk), RRDTool, Cacti Graph, Plugin* ▷ gmond (Pull): Ganglia gmetad, RRDTool, Ganglia, Nagios ▷ newrelic-agent (Push ?): NewRelic, ??, NewRelic, NewRelic Alert ▷ statsD (Push): Carbon / whisper, Graphite, Grafana, Grafana ▷ Telegraf (Push): Telegraf, InfluxDB, Grafana, Grafana ▷ snmp_exporter, node_exporter, jmx_exporter … (Pull, Push*): Prometheus, Grafana, AlertManager
  • 15. About Prometheus ▷ https://prometheus.io/ ▷ Started in November 2012 at SoundCloud ▷ Written in Go, Apache 2.0 license ▷ Joined the Cloud Native Computing Foundation in 2016, the second hosted project after Kubernetes (K8S) ▷ v1.0.0 / 2016-07-18, v2.0.0 / 2017-11-08 ▷ PromQL ▷ Grafana ▷ AlertManager ▷ v2.0
  • 17. Comparison of Time-Series DBMS: Prometheus HA, Prometheus Data Model
  • 18. Client Libraries ▷ Official Prometheus client libraries ▷ Go ▷ Java or Scala ▷ Python ▷ Ruby ▷ Unofficial 3rd-party client libraries ▷ Bash ▷ C++ ▷ Common Lisp ▷ Elixir ▷ Erlang ▷ Haskell ▷ Lua for Nginx ▷ Lua for Tarantool ▷ .NET / C# ▷ Node.js ▷ PHP ▷ Rust
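
As a rough illustration of what these client libraries produce, the sketch below renders a counter in the Prometheus text exposition format using only the Python standard library. It is not the official Python client: the `Counter` class here and the metric name `demo_requests_total` are hypothetical, and label values are not escaped.

```python
class Counter:
    """Hypothetical, minimal stand-in for a client-library counter."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.values = {}  # sorted tuple of (label, value) pairs -> float

    def inc(self, amount=1.0, **labels):
        # Counters are monotonic, so negative increments are rejected.
        if amount < 0:
            raise ValueError("counters can only go up")
        key = tuple(sorted(labels.items()))
        self.values[key] = self.values.get(key, 0.0) + amount

    def render(self):
        # One HELP line, one TYPE line, then one sample line per label set.
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self.values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = "{" + label_str + "}" if label_str else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


# Assumed demo metric; a real application would serve this text over HTTP.
requests_total = Counter("demo_requests_total", "Requests handled by the demo app.")
requests_total.inc(method="get")
requests_total.inc(method="get")
requests_total.inc(method="post")
exposition = requests_total.render()
```

In practice one of the client libraries listed above would handle escaping, concurrency, and the HTTP endpoint; the sketch only shows the shape of the text a scrape returns.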
  • 20. Show me the source code!! ○ https://github.com/jazzwang/prometheus-labs ○ Docker Compose
  • 21. Data Pipeline: Fluentd (in_dummy → out_kafka) → Kafka → Fluentd (in_kafka_group → out_file)
  • 22. Network Layer ▷ snmp_exporter ○ https://github.com/prometheus/snmp_exporter ○ Exposes SNMP data as Metrics ○ Based on MIB OIDs ○ The snmp_exporter generator produces snmp.yml ▷ blackbox_exporter ○ https://github.com/prometheus/blackbox_exporter ○ Probes over HTTP, HTTPS, DNS, TCP, and ICMP ○ Web Service, SSH, DNS, and Ping checks can all go through blackbox_exporter
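
To make the black-box probing idea concrete, here is a minimal sketch of a TCP check in standard-library Python. It is not how blackbox_exporter is implemented (that tool is written in Go); the helper names `tcp_probe` and `render_probe` are hypothetical, and only the output metric names `probe_success` and `probe_duration_seconds` follow the ones blackbox_exporter exposes.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Return (success, seconds) for a single TCP connect attempt."""
    start = time.monotonic()
    try:
        # A completed three-way handshake counts as success.
        with socket.create_connection((host, port), timeout=timeout):
            success = 1
    except OSError:
        # Refused, unreachable, timed out, or unresolvable host.
        success = 0
    return success, time.monotonic() - start


def render_probe(host, port, timeout=2.0):
    """Format one probe result as exposition-style text."""
    success, duration = tcp_probe(host, port, timeout)
    return (
        f"probe_success {success}\n"
        f"probe_duration_seconds {duration:.6f}\n"
    )
```

Calling `render_probe("example.org", 22)` (host and port are placeholders) would report whether an SSH port accepts connections, which is the kind of check the slide assigns to blackbox_exporter.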
  • 23. System Layer ▷ node_exporter ○ https://github.com/prometheus/node_exporter ○ OS Level Metrics
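
As a small illustration of "OS level metrics", the sketch below reads the system load averages and prints them as gauges. It assumes a Unix-like host (`os.getloadavg` is not available on Windows), the function name `render_load_metrics` is hypothetical, and the metric names mirror node_exporter's `node_load1`, `node_load5`, and `node_load15` only for familiarity.

```python
import os


def render_load_metrics():
    """Render 1, 5 and 15 minute load averages as gauge samples."""
    load1, load5, load15 = os.getloadavg()  # Unix-only standard-library call
    rows = [("node_load1", load1), ("node_load5", load5), ("node_load15", load15)]
    lines = []
    for name, value in rows:
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


load_text = render_load_metrics()
```

node_exporter itself covers far more (CPU, memory, disk, network), so this only hints at the kind of data it collects.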
  • 24. Middleware Layer ▷ jmx_exporter ○ https://github.com/prometheus/jmx_exporter ○ Runs in Java; a YAML configuration maps JMX data to Prometheus Metrics ○ Example configurations for: ■ Apache Kafka ■ Apache Cassandra ■ Apache Flink ■ Apache Spark ■ Apache Tomcat ■ Apache ZooKeeper ■ Apache ActiveMQ Artemis 2.x ■ WebLogic ■ WildFly 10
  • 25. Kafka ▷ `jmx_exporter` for Kafka and Cassandra ○ Docker examples: https://github.com/RobustPerception/docker_examples ▷ kafka_topic_exporter ○ Java, Jetty ○ https://github.com/ogibayashi/kafka-topic-exporter ▷ kafka_zookeeper_exporter ○ ZK, topic_partition ○ https://github.com/cloudflare/kafka_zookeeper_exporter ▷ prometheus-kafka-consumer-group-exporter ○ Python; Metrics: consumer_group_offset, topic_highwater, Lag ○ https://github.com/braedon/prometheus-kafka-consumer-group-exporter ▷ burrow_exporter ○ Kafka Lag from LinkedIn Burrow (Go, sliding window) ○ https://github.com/jirwin/burrow_exporter
  • 26. Kafka ▷ kafka-consumer-group-exporter ○ Go; kafka-consumer-groups.sh ○ https://github.com/kawamuray/prometheus-kafka-consumer-group-exporter ▷ kafka-prometheus-exporter ○ Go; consumergoup_lag metrics ○ Kafka 0.8 (ZK) ○ https://github.com/ogibayashi/kafka-topic-exporter ▷ kafka_exporter ○ Go; Metrics ○ Kafka 0.9 (KF) ○ https://github.com/danielqsj/kafka_exporter
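
Several of the Kafka exporters above report consumer lag. A common way to derive it, sketched here under the assumption that lag is the topic highwater offset minus the consumer group's committed offset per partition, looks like this. The function name `consumer_lag` and the sample offsets are hypothetical and not taken from any of the listed exporters.

```python
def consumer_lag(highwater, committed):
    """Per-partition lag: highwater (log-end) offset minus committed offset."""
    lag = {}
    for partition, end_offset in highwater.items():
        offset = committed.get(partition)
        if offset is None:
            continue  # no committed offset yet, so lag is undefined here
        # Clamp at zero: offsets sampled at different moments can briefly disagree.
        lag[partition] = max(end_offset - offset, 0)
    return lag


# Made-up sample data, keyed by (topic, partition).
highwater = {("events", 0): 1500, ("events", 1): 900}
committed = {("events", 0): 1200, ("events", 1): 900}
lag = consumer_lag(highwater, committed)
total_lag = sum(lag.values())
```

With these sample numbers partition 0 is 300 messages behind and partition 1 is caught up. Burrow's sliding-window evaluation is a different, status-based approach that this sketch does not attempt to reproduce.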
  • 27. Fluentd ▷ fluent-agent-lite_exporter ○ For tagomoris's fluent-agent-lite [1] ○ https://github.com/matsumana/fluent-agent-lite_exporter ○ [1] https://github.com/tagomoris/fluent-agent-lite ▷ fluent-plugin-prometheus ○ fluentd → monitor_agent → fluent-plugin-prometheus ○ http://prometheus:9090/metrics → `fluent-plugin-prometheus` → fluentd ○ https://github.com/fluent/fluent-plugin-prometheus ▷ fluentd_exporter ○ Release ○ https://github.com/wyukawa/fluentd_exporter ▷ fluentd_exporter ○ http://fluentd:9224/metrics → `fluentd_exporter` (by V3ckt0r) → prometheus ○ https://github.com/wyukawa/fluentd_exporter
  • 31. Lesson Learned ▷ Lesson #1: Prometheus ▷ Lesson #2: Metrics exporter ○ exporter: https://prometheus.io/docs/instrumenting/exporters/ ○ Port: https://github.com/prometheus/prometheus/wiki/Default-port-allocations ○ exporter Metrics
  • 32. Lesson Learned ○ github ○ exporter Metrics ○ http://prometheus:9090/graph ○ Grafana Dashboard ○ Grafana Alert
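
Besides the graph page at http://prometheus:9090/graph, exporter metrics can be checked through the Prometheus HTTP API. The sketch below only builds the URL for an instant query against the `/api/v1/query` endpoint and does not send it. The helper name `instant_query_url` and the `job="node"` label are assumptions for illustration; `up` is the built-in per-target health metric.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instant_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    # urlencode percent-escapes braces, quotes and spaces in the expression.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical check: which targets of an assumed "node" job are down?
query_url = instant_query_url("http://prometheus:9090", 'up{job="node"} == 0')

# Decoding the query string recovers the original PromQL expression.
decoded = parse_qs(urlsplit(query_url).query)["query"][0]
```

The same expression can be pasted into the graph page or reused as the basis for a Grafana alert condition.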
  • 33. Thanks! Any questions? You can find me at @jazzwang_tw or
 https://fb.com/groups/dataengineering.tw
 https://slideshare.net/jazzwang
 GitHub: https://github.com/jazzwang *^__^*