Jeeves Grows Up: An AI Chatbot for Performance and Quality

1
Shivnath Babu
CTO/Cofounder @ Unravel
Adjunct Professor @ Duke University

2
About the speaker
Shivnath Babu
• Cofounder/CTO at Unravel
• Adjunct Professor of Computer Science at Duke University
• Focusing on manageability of data pipelines and the modern data stack
• Recipient of US National Science Foundation CAREER Award, IBM Faculty Award, HP Labs Innovation Research Award

3
Unravel radically simplifies DataOps & has strong adoption across platforms & industries
• Brings together information about all your apps, clusters, resource utilization, users, & datasets in a single place
• Creates end-to-end view of data pipelines to easily track & understand issues
• Tracks & reports on usage across environments
• Checks for & alerts on anomalous behavior
• Uses AI/ML to troubleshoot & optimize apps to meet desired performance & cost needs
• Spots & fixes inefficient usage
• Ensures efficiency, quality, & performance of all apps in development & production

4
Chatbot
A program that conducts a
conversation via text or voice

5
The happy Spark user

6
The UNhappy Spark user
• “I have no clue which cloud instance type to pick for my workload”
• “My cloud costs are getting out of control. Help!”
• “I have no idea why my app is slow”
• “My app failed and I don’t know why!”

7
Typical app failure in Spark
• Many levels of dependent stack traces
• Identifying the root cause is hard and time-consuming

8
Spark User: “My app failed and I don’t know why!”
Chatbot: “I know that sucks! Let me take a look here …”
Chatbot: “I see the problem. Executors are running out of memory”
Chatbot: “Setting spark.executor.memory to 12g fixes the problem. I have verified it. See this run here”
Spark User: “Wow. Thanks. You are awesome!”
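The exchange above could be approximated, in a very simplified form, by a rule that scans a failed app's log for memory errors and proposes a larger spark.executor.memory. The sketch below is only an illustration and not how the chatbot in the talk works: the rule table, the `diagnose` function, and the "double the current value" heuristic are all hypothetical, and it assumes the setting is written as a whole number of gigabytes such as "6g".

```python
import re

# Hypothetical rule table: (log pattern, diagnosis, Spark setting to adjust).
# The patterns match messages commonly seen when executors exhaust memory.
RULES = [
    (re.compile(r"java\.lang\.OutOfMemoryError"),
     "Executors are running out of memory", "spark.executor.memory"),
    (re.compile(r"exceeding memory limits"),
     "Executors are running out of memory", "spark.executor.memory"),
]


def diagnose(log_text, conf):
    """Return a diagnosis and a suggested setting, or None if no rule matches.

    Assumes memory values look like '<N>g'. Doubling the current value is an
    illustrative heuristic, not a verified fix.
    """
    for pattern, diagnosis, key in RULES:
        if pattern.search(log_text):
            current_gb = int(conf.get(key, "1g").rstrip("g"))
            return {
                "diagnosis": diagnosis,
                "setting": key,
                "suggested_value": f"{current_gb * 2}g",
            }
    return None


log = "ERROR Executor: Exception in task 3.0\njava.lang.OutOfMemoryError: Java heap space"
result = diagnose(log, {"spark.executor.memory": "6g"})
if result:
    print(f"I see the problem. {result['diagnosis']}.")
    print(f"Try setting {result['setting']} to {result['suggested_value']}.")
```

A real assistant would also need to verify the suggestion by rerunning the app, as the chatbot on the slide claims to have done; this sketch stops at the recommendation.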

11
Now every company is a data company
Powered by
Data, ML and AI

12
Most companies have 10+
mission-critical Data Pipelines
Data Pipelines

13
DATA PIPELINE: DATA SOURCES → CAPTURE → STORE → TRANSFORM → PUBLISH → CONSUME → DATA PRODUCTS
• CAPTURE: Batch Ingest, Stream Ingest
• STORE: Data Lake, Data Warehouse
• TRANSFORM: Batch Processing, Orchestrate Tasks, Machine Learning, Stream Processing
• PUBLISH: Real-time Store, Data Catalog, Feature Store
• CONSUME: Real-time Apps, BI, Advanced Analytics

14
Data Pipelines
Data Stack for these pipelines
is multi-system & complex
Data Stack

20
Data Pipelines
Data Stack for these pipelines
is multi-system & complex
Data Stack
33% of data teams, and growing, follow a DataOps practice
DataOps

21
We asked 200+ companies how they manage their data pipelines
• SLA misses are creating problems
• We only detect the fire after it starts!
• Our pipeline schedules are all messed up!
• We need CI/CD for our pipelines
• Fixing problems takes weeks
• Users are always complaining
• I am wasting most of my time with bad data
• Do devs ever #!$ test their pipelines?
• Two failed attempts to migrate to cloud
• Cost reduction is our #1 priority

22
Effective DataOps practice is required to
solve these problems with data pipelines

23
We created Unravel’s Pipeline Observer to simplify DataOps
[Architecture diagram; components shown:]
• Real-time Store, Root Cause Analysis Service, Baselining Service, Correlation Services, Pipeline Observer UI/API
• Logs, Metrics, Traces, Metadata, Conf, Events
• Chatbot, SLA Tracking UI, Pipeline Capacity Planning, Proactive Alerting, Usage / Cost Chargeback UI
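As a rough illustration of what a baselining service feeding proactive alerting might do, the sketch below builds a baseline from a pipeline's past run durations and flags a new run that deviates strongly from it. This is a generic z-score check under stated assumptions, not Unravel's actual method: the function name, the threshold of 3 standard deviations, and the sample durations are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, z_threshold=3.0):
    """Flag `latest` if it lies more than `z_threshold` sample standard
    deviations from the mean of `history` (past run durations, in minutes).

    With fewer than two past runs there is no baseline, so nothing is flagged.
    """
    if len(history) < 2:
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # All past runs were identical; any change counts as a deviation.
        return latest != baseline
    return abs(latest - baseline) / spread > z_threshold


past_runs_min = [60, 62, 58, 61, 59]  # hypothetical durations of earlier runs
print(is_anomalous(past_runs_min, 61))  # close to the baseline
print(is_anomalous(past_runs_min, 95))  # far above the baseline
```

A production baseline would likely account for trends, seasonality, and input data size; a fixed z-score is only the simplest starting point.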

24
Demo Stack
Modern Data Stack composed of:
1. Databricks (Advanced Analytics with Spark)
2. Azure Data Lake Storage (Data Lake)
3. Airflow (Orchestration)
4. dbt (Data Transformation)
5. Great Expectations (Data Quality/Validation)
6. Slack (Chatbot, Team Comm., & Alerting)
7. Unravel (End-to-end Observability)

25
Demo Scenarios
1. Pipeline in danger of missing performance SLA
2. Pipeline in danger of cost overrun
3. Pipeline in danger of breaking due to data quality problems
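The three scenarios can be pictured as three checks over a pipeline's projected status, with any hits packaged as a chat message. The sketch below is a minimal illustration under assumptions: the `PipelineStatus` fields, the thresholds, and the sample values are hypothetical, and the talk does not describe how the demo computes them. The only external convention relied on is that a Slack incoming webhook accepts a JSON body with a `text` field; the sketch builds that body but does not send it.

```python
import json
from dataclasses import dataclass


@dataclass
class PipelineStatus:
    # All fields are hypothetical inputs that some monitoring layer would supply.
    name: str
    projected_runtime_min: float
    sla_min: float
    projected_cost_usd: float
    budget_usd: float
    failed_quality_checks: int


def danger_alerts(status):
    """Return one message per demo scenario that the pipeline currently hits."""
    alerts = []
    if status.projected_runtime_min > status.sla_min:
        alerts.append(f"{status.name}: in danger of missing performance SLA")
    if status.projected_cost_usd > status.budget_usd:
        alerts.append(f"{status.name}: in danger of cost overrun")
    if status.failed_quality_checks > 0:
        alerts.append(f"{status.name}: in danger of breaking due to data quality problems")
    return alerts


def slack_payload(alerts):
    """Build the JSON body for a Slack incoming webhook (not sent here)."""
    return json.dumps({"text": "\n".join(alerts)})


status = PipelineStatus("daily_sales", 75, 60, 40.0, 50.0, 2)
print(slack_payload(danger_alerts(status)))
```

Keeping the checks separate from the message formatting means the same alerts could be routed to a UI or an API response instead of chat.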

27
In summary
AI-driven DataOps to manage Data Pipelines for the New Data Stack
• Develop & manage data pipelines with ease
• Save time & money
Sign up for a free trial; we value your feedback!
https://unraveldata.com/saas-free-trial
We are hiring
shivnath@unraveldata.com

