Database Research at TU Berlin
Today's Talks:
Jonas Traub, Sebastian Breß, Martin Kiefer, Andreas Kunft
Optimized On-Demand Data Streaming from Sensor Nodes. ACM Symposium on Cloud Computing (SoCC), 2017.
Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models. Proceedings of the VLDB Endowment (PVLDB), 2017.
Generating Custom Code for Efficient Query Execution on Heterogeneous Processors. The VLDB Journal, 27(6), 2018.
BlockJoin: Efficient Matrix Partitioning Through Joins. Proceedings of the VLDB Endowment (PVLDB), 2017.
Database Systems and Information Management Group (DIMA) of Volker Markl
Traub et al., Optimized On-Demand Data Streaming from Sensor Nodes, SoCC ‘17
Optimized On-Demand Data
Streaming from Sensor Nodes
Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
ACM Symposium on Cloud Computing (SoCC), 2017
The Sensor Cloud
Real-time insights
Billions of sensor nodes form a sensor cloud and provide data streams to analysis systems.
The Sensor Cloud – Problems
Streaming all data from billions of sensors to all applications with maximal frequencies is impossible.
Increasing data rates require expensive system scale-out.
The Sensor Cloud – Solutions
Tailor Data Streams to the Demand of Applications
• User-Defined Sampling Functions (UDSFs): Provide an abstraction to define the data demand of applications.
• Read-Time Optimization: Optimize communication costs while maintaining the result accuracy.
• Multi-Query / Multi-User Optimization: Share sensor reads and data transfer among users and queries.
Architecture Overview
Sensor Read Scheduling
User-Defined Sampling Functions
Input: Sensor read time and value
Output: Next Sensor Read Request
Enable adaptive sampling techniques to reduce data transmission, e.g., Adam [Trihinas ’15], FAST [Fan ’14], L-SIP [Gaura ’13]
Sensor Read Fusion
1) Minimize Sensor Reads and Data Transfer:
Latest possible read time
2) Optimize Sensor Read Times:
● Check the paper for all details on the read time optimizer!
Read Execution
Local Filtering
● Enable adaptive filtering in combination with adaptive sampling
● Enable model-driven data acquisition
Increasing the Number of Concurrent Queries
• On-demand scheduling reduces sensor reads and data transfer by up to 87%.
• The number of reads and transfers increases sub-linearly with the number of queries.
(Figure: independent queries vs. on-demand scheduling)
Further Publications on Data Streams and Sensor Data:
Optimized On-Demand Data Streaming from Sensor Nodes. Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl. ACM Symposium on Cloud Computing (SoCC), 2017
Efficient Window Aggregation with General Stream Slicing. EDBT 2019
I²: Interactive Real-Time Visualization for Streaming Data. EDBT 2017
Resense: Transparent Record and Replay of Sensor Data in the Internet of Things. EDBT 2019
Generating Custom Code for Efficient Query
Execution on Heterogeneous Processors
Sebastian Breß, Bastian Köcher, Henning Funke, Steffen Zeuch, Tilmann Rabl, Volker Markl
VLDB Journal, 27(6), 797-822, 2018
Heterogeneous Processors
S. Breß et al.: Generating Custom Code for Efficient Query Execution on Heterogeneous Processors. In The VLDB Journal, 27(6), 797-822, 2018
CPUs, MICs, GPUs
Goal: Enable databases to automatically exploit heterogeneous processors
Problem and Key Ideas
Problem: Writing efficient code for different processors is costly and error-prone
Key Idea 1: Generate custom code for each query and processor
Key Idea 2: Identify efficient code modifications and parameters automatically
Challenges
Intermediate Representation: Represent code modifications in query plan
Variant Optimization: Select efficient parameters and code modifications
Code Generation: Generate hardware-tailored code
Hawk Code Generator
Alternative Execution Engine: No changes to SQL parser and optimizer
Multi-Processor Support: Execute queries on CPUs/GPUs/MICs
Automatic Performance Optimization: Tunes code and parameters to processors
Step 1: Query Segmentation
(Figure: SQL query plan split into segments)
Step 2: Select Processor-Specific Code Variants
Pipeline program → Variant Optimizer → Optimized Pipeline Programs
Step 3: Generate Target Code
Optimized Pipeline Programs → Code Generator → Target Code
Code Generator Details
Pipeline Program IR
SQL Query:
SELECT id, age
FROM person
WHERE age < 25;
Pipeline Program:
LOOP(person)
FILTER(age<25)
HASH_PUT(id)
PROJECT(id, age)
Pipeline Program IR: Modifications
LOOP(table)
FILTER(age<25)
HASH_PUT(id)
PROJECT(id, age)
Modifications:
Memory Access Pattern
Predication Mode
Hash Table Implementation
Parallelization Strategy
Pipeline Program IR: Modifications (2)
LOOP(table, sequential)
FILTER(age<25, branched)
HASH_PUT(id, linear_probing)
PROJECT(id, age, single-pass)
Generating Code: Sequential Memory Access
int thread_id = get_thread_id();
int start = start_idx(thread_id, num_rows);
int end = end_idx(thread_id, num_rows);
/* Each thread scans its own contiguous range of rows. */
for (int tid = start; tid < end; tid += 1) {
  if (age[tid] < 25) {
    OUTPUT(id[tid], age[tid]);
  }
}
Memory Access Patterns
Pipeline Program IR: Rewrite
LOOP(table, sequential) → LOOP(table, coalesced)
FILTER(age<25, branched)
HASH_PUT(id, linear_probing)
PROJECT(id, age, single-pass)
Generating Code: Coalesced Memory Access
int thread_id = get_thread_id();
int num_threads = get_num_threads();
/* Neighboring threads read neighboring rows; each thread strides by num_threads. */
for (int tid = thread_id; tid < num_rows; tid += num_threads) {
  if (age[tid] < 25) {
    OUTPUT(id[tid], age[tid]);
  }
}
Pipeline programs provide fine-grained control over generated code
Performance: Memory Access Patterns
Code Variant Optimization
Terminology
Modification: Change to a pipeline program that preserves the semantics but changes the code
Variant configuration: Provides a value for each supported modification and thereby defines the generated code
Code variant: Compilation result of a pipeline program
Variant Optimization
Derive an efficient code variant for each processor
Perform an offline calibration phase on a test workload
Explore the impact of each code modification separately
Variant Optimization - Algorithm
(Figure: the search moves through the variant space from slow to fast variants: Initial Variant, Variant 1, Variant 2)
Search Algorithm
Finds an efficient variant with linear run-time in the number of dimensions
Code modifications are not strictly orthogonal (space not convex)
Perform multiple iterations of the algorithm to find best code variant
Optimizing Search Time
Early Termination: Terminate the search if no faster variant is found during an iteration
Feature Ordering: Explore the parameter values of the most critical modifications first
Nested Modifications: Only include code modifications that change the code
Evaluation of Search Time
Variant exploration times for SSB Q4.1 on SF1
Our strategy outperforms backtracking by up to two orders of magnitude
Handling Query Dependencies
Reuse Variant Configurations: Variant configuration of a processor serves as starting point for further tuning
Heuristic-Based Rewrites: Set a query-dependent modification to another parameter value when we expect a performance improvement
Example: Software Predication: Switch to software predication in FILTER when selectivity is 50%
Query Compilation Times
Compilation times of OpenCL are on the order of hundreds of milliseconds
Compilation times grow linearly with the number of pipelines in a query
Evaluation Results
Performance difference among variants up to two orders of magnitude
Hawk reliably identifies efficient code variants for CPUs, GPUs, MICs
Best code depends on query characteristics
Conclusion
Hawk: A hardware-tailored code generator
Code Variant Generation: Produce custom code variants for each processor
Variant Optimization: No manual tuning for a specific processor
https://github.com/TU-Berlin-DIMA/Hawk-VLDBJ
Further Publications on Data Management on Modern Hardware:
Generating Custom Code for Efficient Query Execution on Heterogeneous Processors. Sebastian Breß, Bastian Köcher, Henning Funke, Steffen Zeuch, Tilmann Rabl, Volker Markl. VLDB Journal, 27(6), 797-822, 2018
Pipelined Query Processing in Coprocessor Environments. SIGMOD 2018
Efficient and Scalable k-Means on GPUs. Datenbank-Spektrum 2018
Analyzing Efficient Stream Processing on Modern Hardware. PVLDB 2019
GPU-Accelerated Join Selectivity Estimation using KDE Models
Paper:
Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models,
Martin Kiefer, Max Heimel, Sebastian Breß, Volker Markl
PVLDB, Volume 10 Issue 13, September 2017
GPU-Accelerated Kernel Density Estimation for Join Selectivity Estimation
Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models, Martin Kiefer et al. PVLDB, 2017
(Figure labels: Query, Query Optimizer, Plan, Database Engine, Statistical Coprocessor, Parameters, Estimates)
Background: Kernel Density Estimators
Dataset → Sample $S$ → Kernels $K_H$ → Estimate $\hat{P}_H$
$$\hat{P}_H(\vec{x}) = \frac{1}{|S|} \sum_{i=1}^{|S|} K_H(s_i, \vec{x})$$
The estimate averages (factor $1/|S|$ and sum) over the kernel contributions $K_H(s_i, \vec{x})$.
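The estimator above can be sketched in a few lines. This is a minimal illustration, assuming a one-dimensional Gaussian kernel with a scalar bandwidth `h` in place of the bandwidth matrix $H$; the sample values and function names are hypothetical and not taken from the paper.

```python
import math

def gaussian_kernel(s, x, h):
    """One-dimensional Gaussian kernel K_h(s, x) centered at sample point s."""
    z = (x - s) / h
    return math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))

def kde_density(sample, x, h):
    """Average the kernel contributions of all sample points at position x."""
    return sum(gaussian_kernel(s, x, h) for s in sample) / len(sample)

sample = [1.0, 2.0, 2.5, 4.0]           # hypothetical sample S
density = kde_density(sample, 2.0, h=0.5)
```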
Background: Kernel Density Estimators
The selectivity of a region $\Omega$ averages the kernel contributions integrated over $\Omega$:
$$\mathrm{sel}(\Omega) = \frac{1}{|S|} \sum_{i=1}^{|S|} \int_{\Omega} K_H(s_i, \vec{x}) \, d\vec{x}$$
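For a Gaussian kernel the integral has a closed form, so a range selectivity can be computed without numerical integration. The sketch below assumes one dimension, a scalar bandwidth `h`, and an interval `[lo, hi]` as the region $\Omega$; all names and values are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_mass(s, lo, hi, h):
    """Integral of a Gaussian kernel centered at s over the interval [lo, hi]."""
    return normal_cdf((hi - s) / h) - normal_cdf((lo - s) / h)

def range_selectivity(sample, lo, hi, h):
    """Average the integrated kernel contributions of all sample points."""
    return sum(kernel_mass(s, lo, hi, h) for s in sample) / len(sample)

sample = [1.0, 2.0, 2.5, 4.0]           # hypothetical sample S
sel = range_selectivity(sample, 1.5, 3.0, h=0.5)
```

With a very small bandwidth the result approaches plain sample evaluation: two of the four points lie in `[1.5, 3.0]`, so the estimate tends to 0.5.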
Background: Kernel Density Estimators for Multi-Dimensional Selectivity Estimation [1]
[Figure: three fits of a sample, labeled Good fit, Overfit, and Underfit]
The bandwidth matrix $H$ controls the smoothing applied to the sample
• Range selections over base tables
• Bandwidth optimization based on the estimation error
• Easy model maintenance
[1] Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation, SIGMOD’15
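Bandwidth optimization based on the estimation error can be illustrated with a simple search. The sketch below is an assumption-heavy stand-in: it uses a one-dimensional Gaussian kernel, a squared-error objective, and a grid search over a few candidate bandwidths, whereas the slides do not state which optimizer or error metric the authors use. Data, queries, and candidates are hypothetical.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def range_selectivity(sample, lo, hi, h):
    """KDE range estimate with a Gaussian kernel and scalar bandwidth h."""
    total = sum(normal_cdf((hi - s) / h) - normal_cdf((lo - s) / h) for s in sample)
    return total / len(sample)

def true_selectivity(data, lo, hi):
    """Exact fraction of values inside [lo, hi], used as training feedback."""
    return sum(1 for v in data if lo <= v <= hi) / len(data)

def pick_bandwidth(sample, data, queries, candidates):
    """Return the candidate bandwidth with the smallest summed squared error."""
    def error(h):
        return sum(
            (range_selectivity(sample, lo, hi, h) - true_selectivity(data, lo, hi)) ** 2
            for lo, hi in queries
        )
    return min(candidates, key=error)

data = [i / 10.0 for i in range(100)]    # hypothetical column: 0.0, 0.1, ..., 9.9
sample = data[::10]                      # every tenth value
queries = [(0.0, 2.45), (3.2, 4.1), (5.05, 9.0), (1.5, 1.9)]
best_h = pick_bandwidth(sample, data, queries, [0.01, 0.3, 1.0, 100.0])
```

A very large bandwidth smooths the sample almost flat (underfit), so it loses against the smaller candidates on this workload.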
The Problem:
Multi-Dimensional Join Selectivity Estimation
• Includes generalization to multiple joins
• Databases: Independence Assumption
  • Often violated
  • Introduces large errors, potentially bad query plans
• Research: Various Methods (e.g. Sampling, Sketches)
• Our Approach: Kernel Density Estimators
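A small example shows how the independence assumption goes wrong on correlated data. The table below is hypothetical: column `b` always equals column `a`, so the true selectivity of the conjunction is ten times larger than the product of the single-column selectivities.

```python
# Hypothetical table with two perfectly correlated columns (a, b).
rows = [(i % 10, i % 10) for i in range(1000)]

def selectivity(rows, pred):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if pred(r)) / len(rows)

sel_a = selectivity(rows, lambda r: r[0] == 3)
sel_b = selectivity(rows, lambda r: r[1] == 3)

independent = sel_a * sel_b              # estimate under the independence assumption
actual = selectivity(rows, lambda r: r[0] == 3 and r[1] == 3)
```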
Why KDEs for Join Selectivities?
• Multivariate Estimator
• No independence assumption
• Hybrid between samples and histograms
• Small bandwidth: Sample evaluation
• Increasing bandwidth: More smoothing, increasing bucket sizes
• Bandwidth optimization selects proper bandwidth
The Approach: Join and Base Table Models
Join KDE Model ($P$): sample from $R_1 \bowtie_{R_1.A_1 = R_2.A_1} R_2$, bandwidth $H$
Compute: $P(c_1 \wedge c_2)$
• Easy to evaluate, better estimates
Base Table KDE Models ($P_1$, $P_2$): sample from $R_1$ and sample from $R_2$, each with its own bandwidth $H$
Compute: $\sum_{v \in A} P_1(A_1 = v \wedge c_1) \cdot P_2(A_2 = v \wedge c_2)$
• Support for base table and join selectivities
• Easy to construct and to maintain
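The sum over join values for the base table models can be sketched as follows. This is a simplified illustration, not the authors' implementation: each sample point is a hypothetical pair `(join_value, x)`, the selection predicate is a range on `x` smoothed with a Gaussian kernel, and the join attribute is matched exactly without any kernel. The result is a fraction of the cross product of the two relations.

```python
import math
from collections import defaultdict

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def per_value_mass(sample, lo, hi, h):
    """Estimate P(A = v AND lo <= X <= hi) for every join value v in the sample."""
    mass = defaultdict(float)
    for a, x in sample:
        mass[a] += (normal_cdf((hi - x) / h) - normal_cdf((lo - x) / h)) / len(sample)
    return mass

def join_selectivity(sample1, c1, sample2, c2, h):
    """Sum P1(A1 = v AND c1) * P2(A2 = v AND c2) over all shared join values v."""
    p1 = per_value_mass(sample1, c1[0], c1[1], h)
    p2 = per_value_mass(sample2, c2[0], c2[1], h)
    return sum(p1[v] * p2[v] for v in p1 if v in p2)

sample1 = [(1, 0.0), (2, 0.0)]           # hypothetical sample from R1
sample2 = [(1, 0.0), (1, 0.0)]           # hypothetical sample from R2
everything = (-1e9, 1e9)                 # predicate that selects all tuples
sel = join_selectivity(sample1, everything, sample2, everything, h=1.0)
```

In the example, two of the four pairs in the cross product share a join value, which gives 0.5.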
Table Model: Computation Components
Components of the selectivity formula:
• Sum over the cross product of the two samples
• Invariant Contributions: Contribution of sample points with respect to the selection predicate
• Cross Contribution: Distance function on the join attributes of the sample points
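The three components can be sketched as a nested loop. The code is illustrative only: sample points are hypothetical pairs `(join_value, x)`, the invariant contribution is the Gaussian kernel mass of `x` inside the selection range, and the cross contribution defaults to an exact match on the join attribute. A smoothed distance function could be passed as `cross` instead; the slides do not specify the one used by the authors.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def invariant_contributions(sample, lo, hi, h):
    """One value per sample point: kernel mass inside the selection range."""
    return [normal_cdf((hi - x) / h) - normal_cdf((lo - x) / h) for _, x in sample]

def exact_match(a1, a2):
    """Assumed cross contribution: 1 for equal join values, 0 otherwise."""
    return 1.0 if a1 == a2 else 0.0

def table_model_selectivity(sample1, c1, sample2, c2, h, cross=exact_match):
    # Invariant contributions depend on one sample only and are computed once.
    p1 = invariant_contributions(sample1, c1[0], c1[1], h)
    p2 = invariant_contributions(sample2, c2[0], c2[1], h)
    total = 0.0
    # Sum over the cross product of the two samples.
    for (a1, _), w1 in zip(sample1, p1):
        for (a2, _), w2 in zip(sample2, p2):
            total += w1 * w2 * cross(a1, a2)
    return total / (len(sample1) * len(sample2))

sample1 = [(1, 0.0), (2, 0.0)]           # hypothetical sample from R1
sample2 = [(1, 0.0), (1, 0.0)]           # hypothetical sample from R2
everything = (-1e9, 1e9)
sel = table_model_selectivity(sample1, everything, sample2, everything, h=1.0)
```

Separating the invariant part from the cross part means the per-point work is linear in the sample size and only the cheap cross term is evaluated quadratically.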
Table Model: Sample Pruning
• Sample 1: points $t_1^{(1)}, \dots, t_1^{(5)}$
• Compute: each point $t_1^{(i)}$ gets its contribution $p_1^{(i)}$
• Filter by contribution: in the example, only $(t_1^{(1)}, p_1^{(1)})$ and $(t_1^{(4)}, p_1^{(4)})$ remain
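Sample pruning amounts to a filter on the precomputed contributions. The sketch below assumes a fixed cutoff `threshold`, which is a hypothetical parameter; the slides only say that points are filtered by contribution. The contribution values are made up so that points 1 and 4 survive, as in the slide example.

```python
def prune_sample(points, contributions, threshold):
    """Keep only (point, contribution) pairs whose contribution reaches the threshold."""
    return [(t, p) for t, p in zip(points, contributions) if p >= threshold]

points = ["t1", "t2", "t3", "t4", "t5"]
contributions = [0.9, 0.0, 1e-9, 0.4, 0.0]   # hypothetical invariant contributions
kept = prune_sample(points, contributions, threshold=1e-6)
```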
Table Model: Cross Pruning
• Sample 1: points $t_1^{(1)}, \dots, t_1^{(5)}$ with contributions $p_1^{(1)}, \dots, p_1^{(5)}$
• Sample 2 (sorted on join attribute): points $t_2^{(1)}, \dots, t_2^{(5)}$ with contributions $p_2^{(1)}, \dots, p_2^{(5)}$
• Pruning condition: $|t_1^{(i)}.A - t_2^{(j)}.A| < \theta$
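Because the second sample is sorted on the join attribute, the partners of each point in the first sample form a contiguous window that binary search can locate. The sketch below shows this idea on plain lists of join attribute values; the function name and the example values are hypothetical, and a GPU implementation would organize the work differently.

```python
from bisect import bisect_left, bisect_right

def cross_pruned_pairs(sample1, sample2_sorted, theta):
    """Return index pairs (i, j) with |sample1[i] - sample2_sorted[j]| < theta.

    sample2_sorted must be in ascending order of the join attribute.
    """
    pairs = []
    for i, a1 in enumerate(sample1):
        lo = bisect_right(sample2_sorted, a1 - theta)   # first j with a2 > a1 - theta
        hi = bisect_left(sample2_sorted, a1 + theta)    # first j with a2 >= a1 + theta
        pairs.extend((i, j) for j in range(lo, hi))
    return pairs

sample1 = [5, 20]                        # join attribute values of sample 1
sample2_sorted = [1, 4, 6, 9, 30]        # join attribute values of sample 2, sorted
pairs = cross_pruned_pairs(sample1, sample2_sorted, theta=2)
```

Only the pairs inside the window need their cross contribution evaluated; all other pairs of the cross product are skipped.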
Evaluation: Scaling the Model Size
Dataset: DMV
Query: Q1U
[Plots on successive slides for: Postgres, Table Sample, Correlated Sample, AGMS Sketch, Join Sample, Join Sample + KDE, Table Sample + KDE]
Runtime: CPU vs GPU
Dataset: IMDB
Workload: Q1U
GPU: Tesla V100
CPU: Intel Xeon Gold 5115
TS+KDE:
4x faster
JS+KDE:
5x faster
[Figure: average estimation time in ms (axis ticks 0.1, 1, 10, 100) over sample size relative to base table size (1%, 2%, 4%, 8%, 16%) for TS+KDE (GPU), TS+KDE (CPU), JS+KDE (GPU), JS+KDE (CPU)]
Conclusion
• KDE models for join selectivity estimation
• “Getting most out of your sample”
• Based on join or base table KDE models
• Learning hybrid between histograms and samples
• GPU-acceleration possible
• Experiments, data, and code online
github.com/martinkiefer/join-kde
“Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models”, PVLDB 2017
Further Publications on GPU-Accelerated Kernel Density Estimation:
Estimating Join Selectivities using Bandwidth-
Optimized Kernel Density Models
Martin Kiefer, Max Heimel, Sebastian Breß, Volker Markl
Proceedings of the VLDB Endowment, 10(13), 2017
Demonstrating Transfer-Efficient
Sample Maintenance on Graphics
Cards
EDBT 2015
Self-Tuning, GPU-Accelerated Kernel
Density Models for Multidimensional
Selectivity Estimation
SIGMOD 2015
BlockJoin: Efficient Matrix
Partitioning Through Joins
Andreas Kunft, Asterios Katsifodimos, Sebastian Schelter, Tilmann Rabl, Volker Markl
PVLDB, Volume 10 Issue 13, September 2017
INTRODUCTION
Common pattern in end-to-end machine learning pipelines:
1. Relational operators, e.g., join and filter the input data
2. User-defined functions, e.g., feature transformation and vectorization
3. Linear algebra operators, e.g., model training and cross-validation
[Pipeline diagram: ⋈ → 𝒇 → ML]
Parallel dataflow engines implement
• Relational operators on row-partitioned datasets
• Linear algebra operators on block-partitioned matrices
>> Pipelines combining both require expensive re-partitioning (shuffle) steps
BlockJoin: Efficient Matrix Partitioning Through Joins, Andreas Kunft et al., PVLDB, 2017
STANDARD WORKFLOW
Products (PK): (P1, 1), (P2, 2), (P3, 3)
Reviews (FK): (P1, 1, 1, 1), (P2, 2, 2, 2), (P1, 3, 3, 3), (P1, 4, 4, 4)
1. Distributed Join: Products ⋈ Reviews yields the row-wise join result (P1, 1, 1, 1, 1), (P2, 2, 2, 2, 2), (P1, 1, 3, 3, 3), (P1, 1, 4, 4, 4)
2. Global row-index: the row-wise result becomes (0, 1, 1, 1, 1), (1, 2, 2, 2, 2), (2, 1, 3, 3, 3), (3, 1, 4, 4, 4)
3. Re-Partitioning: the indexed rows are split into a block-partitioned matrix of four 2×2 blocks
STANDARD WORKFLOW - PROBLEMS
Materializes the join result, just to apply a sequential row-index:
• Shuffles data for row-wise partitioning, which is split up immediately
• Puts heavy load on a few machines in case of skewed keys
• Forces early matrix block materialization
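The standard workflow can be replayed on the slide's example data in a single process. The sketch below is only an illustration of the three steps (join, global row-index, block partitioning); the function name, the data layout, and the `(row_block, col_block)` key order are assumptions, and no distribution or shuffling is modeled.

```python
def block_partition(indexed_rows, block_size):
    """Split row-indexed rows into blocks keyed by (row_block, col_block)."""
    blocks = {}
    for row_idx, values in indexed_rows:
        for start in range(0, len(values), block_size):
            key = (row_idx // block_size, start // block_size)
            blocks.setdefault(key, []).append(values[start:start + block_size])
    return blocks

# Example data from the slide: Products (PK) and Reviews (FK).
products = {"P1": [1], "P2": [2], "P3": [3]}
reviews = [("P1", [1, 1, 1]), ("P2", [2, 2, 2]), ("P1", [3, 3, 3]), ("P1", [4, 4, 4])]

# Step 1: the join materializes the full row-wise result.
joined = [products[fk] + cols for fk, cols in reviews if fk in products]
# Step 2: a sequential global row-index is assigned to the materialized rows.
indexed = list(enumerate(joined))
# Step 3: the row-wise result is immediately split up again into blocks.
blocks = block_partition(indexed, block_size=2)
```

The join result exists in full only so that step 2 can number its rows, which is the overhead the slide criticizes.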
HOW CAN WE IMPROVE?
• We propose specialized operators at the intersection of linear and relational algebra
• Here, we focus on efficient creation of block-partitioned results from normalized data
OUR APPROACH
Creates block-partitioned results from normalized data
[Figure: Local Join Kernel (local TID-join of Products on PK and Reviews on FK) followed by the Distributed Fetch Kernel; steps labeled Prune and Apply row-index lead to the block-partitioned matrix. In the example, P3 has no matching review and does not appear in the result.]
JOIN KERNEL: Local TID-Join on driver to create block-index meta-data
1. Meta-data provides mapping of TID to row-index for both relations
2. Row-index is applied independently: no materialization of join result
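The two points above can be sketched on the slide's example data. This is a single-process illustration under assumptions: the TID of a tuple is taken to be its list position, the row-index follows the order of the FK relation, and the function names are hypothetical.

```python
def tid_join(pk_keys, fk_keys):
    """Join on key columns only and return TID -> row-index maps for both relations."""
    pk_tid_of = {key: tid for tid, key in enumerate(pk_keys)}
    pk_rows, fk_rows = {}, {}
    row_index = 0
    for fk_tid, key in enumerate(fk_keys):
        pk_tid = pk_tid_of.get(key)
        if pk_tid is None:
            continue                     # FK tuple without a join partner
        pk_rows.setdefault(pk_tid, []).append(row_index)
        fk_rows.setdefault(fk_tid, []).append(row_index)
        row_index += 1
    return pk_rows, fk_rows

def apply_row_index(columns, tid_to_rows):
    """Emit (row_index, values) for one relation without looking at the other."""
    return sorted(
        (row, values)
        for tid, values in enumerate(columns)
        for row in tid_to_rows.get(tid, [])
    )

pk_rows, fk_rows = tid_join(["P1", "P2", "P3"], ["P1", "P2", "P1", "P1"])
product_part = apply_row_index([[1], [2], [3]], pk_rows)
review_part = apply_row_index([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]], fk_rows)
```

Each relation receives its row indices separately, so the combined join result never has to be materialized before the matrix blocks are built.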
FETCH KERNEL: Materialization strategy of matrix blocks based on matrix shape:
• Late materialization: Blocks are materialized on the receiver node
|PK columns| >> |FK columns|
• Early materialization: Blocks are materialized on the sender node
|PK columns| << |FK columns|
Evaluation
PK – FK JOIN
PK Table: 100k rows, scaling columns
FK Table: 1m rows, 5k columns
a. Uniformly distributed FKs: up to 2.5x speedup
b. Power-law distributed FKs: skew resistant, while the baseline fails
RECAP
BlockJoin is a logically fused operator pipeline
• Separation of matrix index creation and matrix materialization
> No materialization of join result
> Skew resistant
• Cost-based block materialization based on data shape
> Late materialization
> Early materialization
Further Publications:
BlockJoin:
Efficient Matrix Partitioning Through Joins
Andreas Kunft, Asterios Katsifodimos, Sebastian Schelter, Tilmann Rabl, and Volker Markl.
PVLDB 10.13, 2017
Bridging the gap: towards
optimization across linear
and relational algebra
BeyondMR 2016
Implicit Parallelism
through Deep Language
Embedding
SIGMOD 2015
ScootR: Scaling R
Dataframes on Dataflow
Systems
SoCC 2018

Weitere ähnliche Inhalte

Was ist angesagt?

How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...confluent
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 
IRJET - A Review on Crypto-Algorithm using Different Hardware
IRJET -  	  A Review on Crypto-Algorithm using Different HardwareIRJET -  	  A Review on Crypto-Algorithm using Different Hardware
IRJET - A Review on Crypto-Algorithm using Different HardwareIRJET Journal
 
M.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsM.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsVijay Karan
 
Enabling Efficient and Geometric Range Query with Access Control over Encrypt...
Enabling Efficient and Geometric Range Query with Access Control over Encrypt...Enabling Efficient and Geometric Range Query with Access Control over Encrypt...
Enabling Efficient and Geometric Range Query with Access Control over Encrypt...JAYAPRAKASH JPINFOTECH
 
Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...
Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...
Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...IJECEIAES
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
Vortex: The Intelligent Data Sharing Platform for the Internet of Things
Vortex: The Intelligent Data Sharing Platform for the Internet of ThingsVortex: The Intelligent Data Sharing Platform for the Internet of Things
Vortex: The Intelligent Data Sharing Platform for the Internet of ThingsAngelo Corsaro
 
32 9139 it rtl modelling for the cipher blcok ((edit lafi)
32 9139 it   rtl modelling for the cipher blcok ((edit lafi)32 9139 it   rtl modelling for the cipher blcok ((edit lafi)
32 9139 it rtl modelling for the cipher blcok ((edit lafi)IAESIJEECS
 
Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill MapR Technologies
 
Identifying Opportunities to Improve Efficiency in HPC Clusters
Identifying Opportunities to Improve Efficiency in HPC ClustersIdentifying Opportunities to Improve Efficiency in HPC Clusters
Identifying Opportunities to Improve Efficiency in HPC Clustersinside-BigData.com
 
From the Pacific Research Platform to a National Research Platform
From the Pacific Research Platform to a National Research PlatformFrom the Pacific Research Platform to a National Research Platform
From the Pacific Research Platform to a National Research PlatformLarry Smarr
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light SourcesIan Foster
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...Patrick Diehl
 
Double Server Public Key Encryption with Keyword Search for Secure Cloud Storage
Double Server Public Key Encryption with Keyword Search for Secure Cloud StorageDouble Server Public Key Encryption with Keyword Search for Secure Cloud Storage
Double Server Public Key Encryption with Keyword Search for Secure Cloud Storageijtsrd
 

Was ist angesagt? (16)

How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
IRJET - A Review on Crypto-Algorithm using Different Hardware
IRJET -  	  A Review on Crypto-Algorithm using Different HardwareIRJET -  	  A Review on Crypto-Algorithm using Different Hardware
IRJET - A Review on Crypto-Algorithm using Different Hardware
 
M.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing ProjectsM.Phil Computer Science Cloud Computing Projects
M.Phil Computer Science Cloud Computing Projects
 
Enabling Efficient and Geometric Range Query with Access Control over Encrypt...
Enabling Efficient and Geometric Range Query with Access Control over Encrypt...Enabling Efficient and Geometric Range Query with Access Control over Encrypt...
Enabling Efficient and Geometric Range Query with Access Control over Encrypt...
 
Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...
Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...
Design and implementation of proposed 320 bit RC6-cascaded encryption/decrypt...
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Vortex: The Intelligent Data Sharing Platform for the Internet of Things
Vortex: The Intelligent Data Sharing Platform for the Internet of ThingsVortex: The Intelligent Data Sharing Platform for the Internet of Things
Vortex: The Intelligent Data Sharing Platform for the Internet of Things
 
32 9139 it rtl modelling for the cipher blcok ((edit lafi)
32 9139 it   rtl modelling for the cipher blcok ((edit lafi)32 9139 it   rtl modelling for the cipher blcok ((edit lafi)
32 9139 it rtl modelling for the cipher blcok ((edit lafi)
 
Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill Berlin Hadoop Get Together Apache Drill
Berlin Hadoop Get Together Apache Drill
 
Identifying Opportunities to Improve Efficiency in HPC Clusters
Identifying Opportunities to Improve Efficiency in HPC ClustersIdentifying Opportunities to Improve Efficiency in HPC Clusters
Identifying Opportunities to Improve Efficiency in HPC Clusters
 
From the Pacific Research Platform to a National Research Platform
From the Pacific Research Platform to a National Research PlatformFrom the Pacific Research Platform to a National Research Platform
From the Pacific Research Platform to a National Research Platform
 
25
2525
25
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
 
Double Server Public Key Encryption with Keyword Search for Secure Cloud Storage
Double Server Public Key Encryption with Keyword Search for Secure Cloud StorageDouble Server Public Key Encryption with Keyword Search for Secure Cloud Storage
Double Server Public Key Encryption with Keyword Search for Secure Cloud Storage
 

Ähnlich wie Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019

UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...Jonas Traub
 
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Jonas Traub
 
Saving Human Lives with the IoT
Saving Human Lives with the IoTSaving Human Lives with the IoT
Saving Human Lives with the IoTDat Tran
 
Mehr und schneller ist nicht automatisch besser - data2day, 06.10.16
Mehr und schneller ist nicht automatisch besser - data2day, 06.10.16Mehr und schneller ist nicht automatisch besser - data2day, 06.10.16
Mehr und schneller ist nicht automatisch besser - data2day, 06.10.16Boris Adryan
 
Computing Outside The Box June 2009
Computing Outside The Box June 2009Computing Outside The Box June 2009
Computing Outside The Box June 2009Ian Foster
 
RISELab: Enabling Intelligent Real-Time Decisions keynote by Ion Stoica
RISELab: Enabling Intelligent Real-Time Decisions keynote by Ion StoicaRISELab: Enabling Intelligent Real-Time Decisions keynote by Ion Stoica
RISELab: Enabling Intelligent Real-Time Decisions keynote by Ion StoicaSpark Summit
 
RISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsRISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsJen Aman
 
Cytoscape ci chapter 1
Cytoscape ci chapter 1Cytoscape ci chapter 1
Cytoscape ci chapter 1bdemchak
 
Zühlke Meetup - Mai 2017
Zühlke Meetup - Mai 2017Zühlke Meetup - Mai 2017
Zühlke Meetup - Mai 2017Boris Adryan
 
Performance modeling and simulation for accumulo applications
Performance modeling and simulation for accumulo applicationsPerformance modeling and simulation for accumulo applications
Performance modeling and simulation for accumulo applicationsAccumulo Summit
 
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...Jonas Traub
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsClusterpoint
 
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17Phoenix Data Conference - Big Data Analytics for IoT 11/4/17
Phoenix Data Conference - Big Data Analytics for IoT 11/4/17Mark Goldstein
 
A Knowledge-based Approach for Real-Time IoT Stream Annotation and Processing
A Knowledge-based Approach for Real-Time IoT Stream Annotation and ProcessingA Knowledge-based Approach for Real-Time IoT Stream Annotation and Processing
A Knowledge-based Approach for Real-Time IoT Stream Annotation and ProcessingPayamBarnaghi
 
Edge optimized architecture for fabric defect detection in real-time
Edge optimized architecture for fabric defect detection in real-timeEdge optimized architecture for fabric defect detection in real-time
Edge optimized architecture for fabric defect detection in real-timeShuquan Huang
 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009Ian Foster
 
BDE SC3.3 Workshop - BDE Platform: Technical overview
 BDE SC3.3 Workshop -  BDE Platform: Technical overview BDE SC3.3 Workshop -  BDE Platform: Technical overview
BDE SC3.3 Workshop - BDE Platform: Technical overviewBigData_Europe
 

Ähnlich wie Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019 (20)

UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
 
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019

  • 1. Database Research at TU Berlin. Today's Talks. Speakers: Jonas Traub, Sebastian Breß, Martin Kiefer, Andreas Kunft. Talks: Optimized On-Demand Data Streaming from Sensor Nodes, ACM Symposium on Cloud Computing (SoCC), 2017; Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models, Proceedings of the VLDB Endowment (PVLDB), 2017; Generating Custom Code for Efficient Query Execution on Heterogeneous Processors, The VLDB Journal, 27(6), 2018; BlockJoin: Efficient Matrix Partitioning Through Joins, Proceedings of the VLDB Endowment (PVLDB), 2017. Database Systems and Information Management Group (DIMA) of Volker Markl.
  • 2. Traub et al., Optimized On-Demand Data Streaming from Sensor Nodes, SoCC '17. Title slide: Optimized On-Demand Data Streaming from Sensor Nodes. Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl. ACM Symposium on Cloud Computing (SoCC), 2017.
  • 3–7. The Sensor Cloud. Billions of sensor nodes form a sensor cloud and provide data streams to analysis systems. Figure label: real-time insights.
  • 8–10. The Sensor Cloud: Problems. Billions of sensor nodes form a sensor cloud and provide data streams to analysis systems. Streaming all data from billions of sensors to all applications at maximal frequency is impossible. Increasing data rates require expensive system scale-out.
  • 11. The Sensor Cloud: Solutions. Tailor data streams to the demand of applications. (1) User-Defined Sampling Functions (UDSFs): provide an abstraction to define the data demand of applications. (2) Read-Time Optimization: optimize communication costs while maintaining result accuracy. (3) Multi-Query / Multi-User Optimization: share sensor reads and data transfer among users and queries.
  • 12–16. Architecture Overview (figure).
  • 17. Sensor Read Scheduling.
  • 18–19. User-Defined Sampling Functions. Input: sensor read time and value. Output: next sensor read request.
  • 20. User-Defined Sampling Functions enable adaptive sampling techniques that reduce data transmission, e.g., Adam [Trihinas '15], FAST [Fan '14], L-SIP [Gaura '13].
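The UDSF contract on slides 18–20 (sensor read time and value in, next sensor read request out) can be illustrated with a minimal Python sketch. Everything beyond that contract is an assumption made for illustration: the `ReadRequest` fields, the tolerance window around the desired read time, and the halve/double adaptation rule are hypothetical and are not the API or the algorithm from the paper, nor any of the cited adaptive sampling techniques.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReadRequest:
    """Hypothetical read request: a desired time plus a tolerated window."""
    earliest: float
    desired: float
    latest: float


class AdaptiveSampler:
    """Toy UDSF: sample faster while the value changes, slower while it is stable."""

    def __init__(self, min_interval: float, max_interval: float,
                 threshold: float, slack: float) -> None:
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.threshold = threshold   # change in value that counts as "unstable"
        self.slack = slack           # window half-width as a fraction of the interval
        self._interval = min_interval
        self._last_value: Optional[float] = None

    def __call__(self, read_time: float, value: float) -> ReadRequest:
        # Input: sensor read time and value. Output: next sensor read request.
        if self._last_value is not None:
            if abs(value - self._last_value) > self.threshold:
                self._interval = max(self.min_interval, self._interval / 2)
            else:
                self._interval = min(self.max_interval, self._interval * 2)
        self._last_value = value
        desired = read_time + self._interval
        tolerance = self.slack * self._interval
        return ReadRequest(desired - tolerance, desired, desired + tolerance)


sampler = AdaptiveSampler(min_interval=1.0, max_interval=8.0, threshold=0.5, slack=0.25)
first = sampler(0.0, 20.0)    # no history yet: next read after the minimum interval
second = sampler(1.0, 20.1)   # stable value: the interval doubles
```

A wider tolerance window gives a scheduler more freedom to merge this request with requests from other queries, at the cost of less precise read times.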
  • 21. Sensor Read Fusion.
  • 22–23. Sensor Read Fusion. 1) Minimize sensor reads and data transfer: latest possible read time. 2) Optimize sensor read times: check the paper for all details on the read time optimizer!
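The "latest possible read time" idea on slide 22 can be sketched under one loud assumption: each read request tolerates any read time inside a window `(earliest, latest)`. The slides do not state this representation. The function below is the classic greedy interval-stabbing strategy, used here as a stand-in; it is not the paper's read time optimizer, and `fuse_reads` is a hypothetical name.

```python
def fuse_reads(windows: list[tuple[float, float]]) -> list[float]:
    """Return read times so that every (earliest, latest) window contains a read.

    Greedy rule: process windows by deadline and, whenever a window is not yet
    covered, schedule a read at its latest possible time so that the same read
    can also serve as many later windows as possible.
    """
    read_times: list[float] = []
    for earliest, latest in sorted(windows, key=lambda window: window[1]):
        # Reads are scheduled in increasing order and never after `latest`,
        # so the most recent read covers this window iff it is >= earliest.
        if not read_times or read_times[-1] < earliest:
            read_times.append(latest)
    return read_times


# Five requests from different queries are served by two physical sensor reads.
requests = [(0.0, 4.0), (2.0, 5.0), (3.0, 6.0), (7.0, 9.0), (8.0, 10.0)]
print(fuse_reads(requests))  # [4.0, 9.0]
```

Delaying each read to the deadline of the most urgent pending request is what lets overlapping requests share a single read and a single transfer.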
  • 24. Read Execution.
  • 25–26. Local Filtering. Enable adaptive filtering in combination with adaptive sampling. Enable model-driven data acquisition.
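One simple form of local filtering is a dead-band filter that runs on the sensor node and suppresses values that differ too little from the last transmitted one. The sketch below is an assumption chosen for illustration; the slides do not say which filters the system ships with, and `DeadBandFilter` and `should_send` are hypothetical names.

```python
from typing import Optional


class DeadBandFilter:
    """Transmit a value only if it moved more than `tolerance` since the last send."""

    def __init__(self, tolerance: float) -> None:
        self.tolerance = tolerance
        self._last_sent: Optional[float] = None

    def should_send(self, value: float) -> bool:
        if self._last_sent is None or abs(value - self._last_sent) > self.tolerance:
            self._last_sent = value   # the receiver now knows this value
            return True
        return False                  # the receiver's last known value is close enough


readings = [20.0, 20.1, 20.3, 21.0, 21.2, 19.9]
node_filter = DeadBandFilter(tolerance=0.5)
transmitted = [value for value in readings if node_filter.should_send(value)]
print(transmitted)  # [20.0, 21.0, 19.9]
```

Replacing "last transmitted value" with the prediction of a model shared by sensor and receiver would turn the same structure into a basic form of model-driven data acquisition.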
  • 27. Increasing the Number of Concurrent Queries. On-demand scheduling reduces sensor reads and data transfer by up to 87%. The number of reads and transfers increases sub-linearly with the number of queries. Plot series: independent queries, on-demand scheduling.
  • 28. Further Publications on Data Streams and Sensor Data: Optimized On-Demand Data Streaming from Sensor Nodes, Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl, ACM Symposium on Cloud Computing (SoCC), 2017; Efficient Window Aggregation with General Stream Slicing, EDBT 2019; I²: Interactive Real-Time Visualization for Streaming Data, EDBT 2017; Resense: Transparent Record and Replay of Sensor Data in the Internet of Things, EDBT 2019.
  • 29. Database Research at TU Berlin. Up Next. Speakers: Jonas Traub, Sebastian Breß, Martin Kiefer, Andreas Kunft. Talks: Optimized On-Demand Data Streaming from Sensor Nodes, ACM Symposium on Cloud Computing (SoCC), 2017; Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models, Proceedings of the VLDB Endowment (PVLDB), 2017; Generating Custom Code for Efficient Query Execution on Heterogeneous Processors, The VLDB Journal, 27(6), 2018; BlockJoin: Efficient Matrix Partitioning Through Joins, Proceedings of the VLDB Endowment (PVLDB), 2017. Database Systems and Information Management Group (DIMA) of Volker Markl.
  • 30. Generating Custom Code for Efficient Query Execution on Heterogeneous Processors. Sebastian Breß, Bastian Köcher, Henning Funke, Steffen Zeuch, Tilmann Rabl, Volker Markl. The VLDB Journal, 27(6), 797-822, 2018.
  • 31–35. Heterogeneous Processors: CPUs, MICs, GPUs. Goal: enable databases to automatically exploit heterogeneous processors. S. Breß et al.: Generating Custom Code for Efficient Query Execution on Heterogeneous Processors. In The VLDB Journal, 27(6), 797-822, 2018.
  • 36–38. Problem and Key Ideas. Problem: writing efficient code for different processors is costly and error-prone. Key Idea 1: generate custom code for each query and processor. Key Idea 2: identify efficient code modifications and parameters automatically.
  • 39–42. Challenges. Intermediate Representation: represent code modifications in the query plan. Variant Optimization: select efficient parameters and code modifications. Code Generation: generate hardware-tailored code.
  • 43–47. Hawk Code Generator. Alternative Execution Engine: no changes to SQL parser and optimizer. Multi-Processor Support: execute queries on CPUs/GPUs/MICs. Automatic Performance Optimization: tunes code and parameters to processors.
  • 48–50. Step 1: Query Segmentation (figure; surviving labels: SQL, CJ).
  • 51–55. Step 2: Select Processor-Specific Code Variants. The variant optimizer turns a pipeline program into optimized pipeline programs.
  • 59. Step 3: Generate Target Code [Figure: Optimized Pipeline Programs → Code Generator → Target Code]
  • 61. Pipeline Program IR. SQL query: SELECT id, age FROM person WHERE age < 25; [Figure: SQL Query → Pipeline Program]
  • 66. Pipeline Program IR (2). Pipeline program for the query: LOOP(person) FILTER(age<25) HASH_PUT(id) PROJECT(id, age)
  • 71. Pipeline Program IR: Modifications. Each operator of the pipeline program exposes one modification: LOOP(table): Memory Access Pattern; FILTER(age<25): Predication Mode; HASH_PUT(id): Hash Table Implementation; PROJECT(id, age): Parallelization Strategy
  • 76. Pipeline Program IR: Modifications (2). Each operator is annotated with a parameter value: LOOP(table) becomes LOOP(table, sequential); FILTER(age<25) becomes FILTER(age<25, branched); HASH_PUT(id) becomes HASH_PUT(id, linear_probing); PROJECT(id, age) becomes PROJECT(id, age, single-pass)
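The annotation step above can be sketched in a few lines of Python. This is only an illustration, not Hawk's internal representation: the `Op` class, the `modification` keys, and the `annotate` helper are hypothetical names; the operators and parameter values are the ones shown on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str          # operator name as shown on the slide, e.g. "LOOP"
    args: tuple        # operator arguments as shown on the slide
    modification: str  # hypothetical key for the tunable dimension


# Un-annotated pipeline program from the slides.
PIPELINE = [
    Op("LOOP", ("table",), "memory_access"),
    Op("FILTER", ("age<25",), "predication"),
    Op("HASH_PUT", ("id",), "hash_table"),
    Op("PROJECT", ("id", "age"), "parallelization"),
]

# A variant configuration: one value per supported modification.
CONFIG = {
    "memory_access": "sequential",
    "predication": "branched",
    "hash_table": "linear_probing",
    "parallelization": "single-pass",
}


def annotate(pipeline, config):
    """Append the configured parameter value to every operator."""
    annotated = []
    for op in pipeline:
        params = op.args + (config[op.modification],)
        annotated.append(f"{op.name}({', '.join(params)})")
    return annotated


for line in annotate(PIPELINE, CONFIG):
    print(line)
```

Changing a single entry of `CONFIG`, for example `memory_access` to `coalesced`, yields a different annotated program without touching the operators themselves.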
  • 77. Generating Code: Sequential Memory Access
      int thread_id = get_thread_id();
      /* each thread scans its own contiguous range of rows */
      int start = start_idx(thread_id, num_rows);
      int end = end_idx(thread_id, num_rows);
      for (int tid = start; tid < end; tid += 1) {
        if (age[tid] < 25) {
          OUTPUT(id[tid], age[tid]);
        }
      }
  • 78. Memory Access Patterns [Figure]
  • 80. Pipeline Program IR: Rewrite. LOOP(table, sequential) is rewritten to LOOP(table, coalesced); the rest of the pipeline program is unchanged: LOOP(table, coalesced) FILTER(age<25, branched) HASH_PUT(id, linear_probing) PROJECT(id, age, single-pass)
  • 82. Generating Code: Coalesced Memory Access
      int thread_id = get_thread_id();
      int num_threads = get_num_threads();
      /* neighboring threads read neighboring rows in every iteration */
      for (int tid = thread_id; tid < num_rows; tid += num_threads) {
        if (age[tid] < 25) {
          OUTPUT(id[tid], age[tid]);
        }
      }
    Pipeline programs provide fine-grained control over generated code.
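To make the difference between the two generated loops concrete, the following Python sketch lists which row indices each thread visits. The function names are hypothetical, and the even chunking used for the sequential case is an assumption, since the slides do not define `start_idx` and `end_idx`.

```python
def sequential_rows(thread_id, num_threads, num_rows):
    """Rows visited by one thread when each thread scans a contiguous chunk."""
    # Assumption: rows are split into equally sized chunks (rounded up).
    chunk = (num_rows + num_threads - 1) // num_threads
    start = min(thread_id * chunk, num_rows)
    end = min(start + chunk, num_rows)
    return list(range(start, end))


def coalesced_rows(thread_id, num_threads, num_rows):
    """Rows visited by one thread when threads stride through the table."""
    return list(range(thread_id, num_rows, num_threads))


if __name__ == "__main__":
    for t in range(4):
        print(t, sequential_rows(t, 4, 8), coalesced_rows(t, 4, 8))
```

In the strided version, threads 0 to 3 touch rows 0 to 3 in the same step, which is the access pattern the slide calls coalesced; in the chunked version, each thread walks its own contiguous range.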
  • 83. Performance: Memory Access Patterns [Figure]
  • 87. Terminology. Modification: a change to a pipeline program that preserves the semantics but changes the code. Variant configuration: provides a value for each supported modification and thereby defines the generated code. Code variant: the compilation result of a pipeline program.
  • 91. Variant Optimization. Derive an efficient code variant for each processor. Perform an offline calibration phase on a test workload. Explore the impact of each code modification separately.
  • 108. Variant Optimization - Algorithm [Animation: search through the variant space, ordered from slow to fast, moving from the Initial Variant to Variant 1 and then to Variant 2]
  • 112. Search Algorithm. Finds an efficient variant with run-time linear in the number of dimensions. Code modifications are not strictly orthogonal (the space is not convex). Perform multiple iterations of the algorithm to find the best code variant.
  • 116. Optimizing Search Time. Early Termination: terminate the search if no faster variant is found during an iteration. Feature Ordering: explore the parameter values of the most critical modifications first. Nested Modifications: only include code modifications that change the code.
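The search described on the preceding slides can be sketched as a coordinate-wise search that varies one modification at a time and repeats until an iteration brings no improvement. This is a minimal sketch under stated assumptions, not the algorithm from the paper: `search_variant`, `SPACE`, and `toy_cost` are hypothetical, the value `cuckoo` is invented for illustration, and the toy cost function stands in for run times measured during calibration.

```python
def search_variant(space, cost, initial, max_iterations=10):
    """Greedy search: change one modification at a time, keep improvements."""
    best = dict(initial)
    best_cost = cost(best)
    for _ in range(max_iterations):
        improved = False
        # The dict order models feature ordering: earlier keys are explored first.
        for dim, values in space.items():
            for value in values:
                if value == best[dim]:
                    continue
                candidate = {**best, dim: value}
                candidate_cost = cost(candidate)
                if candidate_cost < best_cost:
                    best, best_cost, improved = candidate, candidate_cost, True
        if not improved:
            break  # early termination: no faster variant in this iteration
    return best, best_cost


# Hypothetical variant space; "cuckoo" is an invented second hash table option.
SPACE = {
    "hash_table": ["linear_probing", "cuckoo"],
    "memory_access": ["sequential", "coalesced"],
    "predication": ["branched", "predicated"],
}


def toy_cost(cfg):
    """Stand-in for a measured run time, with one non-orthogonal interaction."""
    cost = 10.0
    if cfg["memory_access"] == "coalesced":
        cost -= 4.0
    if cfg["predication"] == "predicated":
        cost -= 1.0
    if cfg["hash_table"] == "cuckoo":
        # Only pays off in combination with coalesced access.
        cost += -2.0 if cfg["memory_access"] == "coalesced" else 1.0
    return cost


INITIAL = {"hash_table": "linear_probing", "memory_access": "sequential",
           "predication": "branched"}
best, best_cost = search_variant(SPACE, toy_cost, INITIAL)
print(best, best_cost)
```

In this toy setup the first iteration rejects the alternative hash table because it is slower with sequential access; only the second iteration, after the access pattern has changed, picks it up. That is the reason a single pass is not enough when modifications interact.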
  • 118. Evaluation of Search Time [Figure: variant exploration times for SSB Q4.1 on SF1]. Our strategy outperforms backtracking by up to two orders of magnitude.
  • 122. Handling Query Dependencies. Reuse Variant Configurations: the variant configuration of a processor serves as the starting point for further tuning. Heuristic-Based Rewrites: set a query-dependent modification to another parameter value when we expect a performance improvement. Example: Software Predication: switch to software predication in FILTER when selectivity is 50%.
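For readers unfamiliar with the two predication modes, the sketch below contrasts a branched filter with a software-predicated one. It only illustrates the semantics; Python will not show the performance effect, and the function names are hypothetical rather than taken from Hawk.

```python
def filter_branched(age, limit):
    """Branched filter: the write happens only inside the if-branch."""
    out = []
    for tid, value in enumerate(age):
        if value < limit:
            out.append(tid)
    return out


def filter_predicated(age, limit):
    """Software predication: always write, advance the cursor only on a match."""
    out = [0] * len(age)
    pos = 0
    for tid, value in enumerate(age):
        out[pos] = tid               # unconditional write
        pos += int(value < limit)    # the predicate becomes arithmetic
    return out[:pos]


ages = [31, 19, 24, 40, 25, 18]
print(filter_branched(ages, 25), filter_predicated(ages, 25))
```

Both functions return the same row ids. The predicated form replaces the data-dependent branch with an addition, which is the kind of trade a rewrite heuristic can make when branch outcomes are hard to predict.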
  • 126. Query Compilation Times [Figure]. Compilation times of OpenCL are on the order of hundreds of milliseconds. Compilation times grow linearly with the number of pipelines in a query.
  • 130. Evaluation Results [Figure]. Performance differences among variants reach up to two orders of magnitude. Hawk reliably identifies efficient code variants for CPUs, GPUs, and MICs. The best code depends on query characteristics.
  • 135. Conclusion. Hawk: a hardware-tailored code generator. Code Variant Generation: produces custom code variants for each processor. Variant Optimization: no manual tuning for a specific processor. Code: https://github.com/TU-Berlin-DIMA/Hawk-VLDBJ
  • 136. Further Publications on Data Management on Modern Hardware: • Generating Custom Code for Efficient Query Execution on Heterogeneous Processors. Sebastian Breß, Bastian Köcher, Henning Funke, Steffen Zeuch, Tilmann Rabl, Volker Markl. VLDB Journal, 27(6), 797-822, 2018 • Pipelined Query Processing in Coprocessor Environments. SIGMOD 2018 • Efficient and Scalable k-Means on GPUs. Datenbank-Spektrum 2018 • Analyzing Efficient Stream Processing on Modern Hardware. PVLDB 2019
  • 137. Database Research at TU Berlin. Up Next: • Jonas Traub: Optimized On-Demand Data Streaming from Sensor Nodes. ACM Symposium on Cloud Computing (SoCC), 2017. • Sebastian Breß: Generating Custom Code for Efficient Query Execution on Heterogeneous Processors. The VLDB Journal, 27(6), 2018. • Martin Kiefer: Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models. Proceedings of the VLDB Endowment (PVLDB), 2017. • Andreas Kunft: BlockJoin: Efficient Matrix Partitioning Through Joins. Proceedings of the VLDB Endowment (PVLDB), 2017. Database Systems and Information Management Group (DIMA) of Volker Markl
  • 138. GPU-Accelerated Join Selectivity Estimation using KDE Models Paper: Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models, Martin Kiefer, Max Heimel, Sebastian Breß, Volker Markl PVLDB, Volume 10 Issue 13, September 2017
  • 141. GPU-Accelerated Kernel Density Estimation for Join Selectivity Estimation [Figure: Database Engine with Query Optimizer and Query Plan; a Statistical Coprocessor exchanges Parameters and Estimates with the optimizer]. Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models, Martin Kiefer et al. PVLDB, 2017
  • 147. Background: Kernel Density Estimators. From the dataset, draw a sample $S$ and place a kernel $K_H$ on each sample point. The estimate $\hat{P}_H$ is the average over the kernel contributions: $\hat{P}_H(\vec{x}) = \frac{1}{|S|} \sum_{i=1}^{|S|} K_H(s_i, \vec{x})$
  • 148. Background: Kernel Density Estimators. The selectivity of a query region $\Omega$ is the average over the kernel contributions, each integrated over $\Omega$: $\mathrm{sel}(\Omega) = \frac{1}{|S|} \sum_{i=1}^{|S|} \int_{\Omega} K_H(s_i, \vec{x}) \, d\vec{x}$
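As a worked instance of the selectivity formula, the sketch below evaluates it for a one-dimensional range query. It assumes a Gaussian kernel with scalar bandwidth `h`, for which the integral over a range has a closed form via the normal CDF; the kernel choice and the function names are assumptions for illustration, not necessarily those of the paper.

```python
import math


def normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_selectivity(sample, h, lo, hi):
    """Average, over all sample points, of the kernel mass inside [lo, hi]."""
    total = 0.0
    for s in sample:
        # Integral of a Gaussian centered at s with bandwidth h over [lo, hi].
        total += normal_cdf((hi - s) / h) - normal_cdf((lo - s) / h)
    return total / len(sample)


sample = [1.0, 2.0, 3.0]
print(kde_selectivity(sample, 0.01, 1.5, 3.5))  # tiny bandwidth
print(kde_selectivity(sample, 1.0, 1.5, 3.5))   # more smoothing
```

With a very small bandwidth the estimate approaches plain sample evaluation (two of the three points fall into the range), while a larger bandwidth spreads each point's mass across the range boundaries.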
  • 149. Background: Kernel Density Estimators for Multi-Dimensional Selectivity Estimation [1] [Figure panels: Good fit, Overfit, Underfit]. The bandwidth matrix $H$ controls the smoothing applied to the sample • Range selections over base tables • Bandwidth optimization based on the estimation error • Easy model maintenance. [1] Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation, SIGMOD’15
  • 150. The Problem: Multi-Dimensional Join Selectivity Estimation • and generalization to multiple joins • Databases: Independence Assumption (often violated; introduces large errors and potentially bad query plans) • Research: Various Methods (e.g. Sampling, Sketches) • Our Approach: Kernel Density Estimators
  • 151. Why KDEs for Join Selectivities? • Multivariate Estimator: no independence assumption • Hybrid between samples and histograms: a small bandwidth corresponds to sample evaluation; increasing the bandwidth means more smoothing and increasing bucket sizes • Bandwidth optimization selects a proper bandwidth
  • 160. The Approach: Join and Base Table Models. Join KDE Model ($P$): bandwidth $H$ and a sample from $R_1 \bowtie_{R_1.A_1 = R_2.A_1} R_2$; compute $P(c_1 \wedge c_2)$. Easy to evaluate, better estimates. Base Table KDE Models ($P_1$, $P_2$): one bandwidth $H$ and one sample per table, from $R_1$ and from $R_2$; compute $\sum_{v \in A} P_1(A_1 = v \wedge c_1) \cdot P_2(A_2 = v \wedge c_2)$. Support for base table and join selectivities; easy to construct and to maintain.
  • 164. Table Model: Computation Components. Selectivity: [equation not captured in the transcript], a sum over the cross product of two samples. Invariant Contributions: contribution of sample points with respect to the selection predicate. Cross Contribution: distance function on the join attributes of sample points.
  • 169. Table Model: Sample Pruning. Sample 1 holds the tuples $t_1^{(1)}, \dots, t_1^{(5)}$. Compute the contributions $p_1^{(1)}, \dots, p_1^{(5)}$, then filter by contribution; in the example only $t_1^{(1)}$ and $t_1^{(4)}$ with $p_1^{(1)}$ and $p_1^{(4)}$ remain.
  • 173. Table Model: Cross Pruning. Sample 1 holds tuples $t_1^{(1)}, \dots, t_1^{(5)}$ with contributions $p_1^{(1)}, \dots, p_1^{(5)}$; Sample 2 (sorted on the join attribute) holds $t_2^{(1)}, \dots, t_2^{(5)}$ with $p_2^{(1)}, \dots, p_2^{(5)}$. Only pairs with $|t_1^{(i)}.A - t_2^{(j)}.A| < \theta$ are considered.
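The two pruning steps can be sketched as follows. The exact selectivity formula is not captured in this transcript, so the sketch assumes the estimate is the normalized sum, over all pairs of sample points, of the two invariant contributions times a cross contribution; the triangular cross-contribution kernel with support `theta` and all names are hypothetical, chosen so that pruning is exact.

```python
from bisect import bisect_left, bisect_right


def cross(a, b, theta):
    """Assumed cross contribution: triangular kernel, zero once |a - b| >= theta."""
    return max(0.0, 1.0 - abs(a - b) / theta)


def estimate_naive(s1, s2, theta):
    """Sum over the full cross product of both samples."""
    total = sum(p1 * p2 * cross(a1, a2, theta)
                for a1, p1 in s1 for a2, p2 in s2)
    return total / (len(s1) * len(s2))


def estimate_pruned(s1, s2, theta, eps=0.0):
    """Same sum, skipping pairs that cannot contribute."""
    n1, n2 = len(s1), len(s2)
    # Sample pruning: drop points whose invariant contribution is negligible.
    s1 = [(a, p) for a, p in s1 if p > eps]
    # Cross pruning: sort sample 2 on the join attribute for range lookups.
    s2 = sorted((a, p) for a, p in s2 if p > eps)
    keys = [a for a, _ in s2]
    total = 0.0
    for a1, p1 in s1:
        lo = bisect_right(keys, a1 - theta)  # first key > a1 - theta
        hi = bisect_left(keys, a1 + theta)   # first key >= a1 + theta
        for a2, p2 in s2[lo:hi]:
            total += p1 * p2 * cross(a1, a2, theta)
    return total / (n1 * n2)


# Each entry: (join attribute value, invariant contribution).
S1 = [(1.0, 0.9), (2.0, 0.0), (5.0, 0.4)]
S2 = [(5.2, 0.5), (1.1, 0.8), (9.0, 0.7)]
print(estimate_naive(S1, S2, 0.5), estimate_pruned(S1, S2, 0.5))
```

With `eps=0.0` and a kernel that vanishes outside `theta`, both functions return the same value; a positive `eps` or a kernel with unbounded support would turn the pruning into an approximation.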
  • 180. Evaluation: Scaling the Model Size. Dataset: DMV. Query: Q1U. [Figures, one estimator added per slide: Postgres, Table Sample, Correlated Sample, AGMS Sketch, Join Sample, Join Sample + KDE, Table Sample + KDE]
  • 181. Runtime: CPU vs GPU. Dataset: IMDB. Workload: Q1U. GPU: Tesla V100. CPU: Intel Xeon Gold 5115. TS+KDE: 4x faster; JS+KDE: 5x faster. [Figure: average estimation time (ms), ticks from 0.1 to 100, over sample size relative to base table size (1%, 2%, 4%, 8%, 16%), for TS+KDE and JS+KDE on GPU and CPU]
  • 182. Conclusion • KDE models for join selectivity estimation • “Getting most out of your sample” • Based on join or base table KDE models • Learning hybrid between histograms and samples • GPU-acceleration possible • Experiments, data, and code online: github.com/martinkiefer/join-kde. “Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models”, PVLDB 17
  • 183. Further Publications on GPU-Accelerated Kernel Density Estimation: • Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models. Martin Kiefer, Max Heimel, Sebastian Breß, Volker Markl. Proceedings of the VLDB Endowment, 10(13), 2017 • Demonstrating Transfer-Efficient Sample Maintenance on Graphics Cards. EDBT 2015 • Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation. SIGMOD 2015
  • 185. BlockJoin: Efficient Matrix Partitioning Through Joins Andreas Kunft, Asterios Katsifodimos, Sebastian Schelter, Tilmann Rabl, Volker Markl PVLDB, Volume 10 Issue 13, September 2017
  • 186. INTRODUCTION. Common pattern in end-to-end machine learning pipelines: 1. Relational operators, e.g., join and filter the input data 2. User-defined functions, e.g., feature transformation and vectorization 3. Linear algebra operators, e.g., model training and cross-validation. [Figure: join (⋈) → UDF ($f$) → ML] BlockJoin: Efficient Matrix Partitioning Through Joins, Andreas Kunft et al. PVLDB, 2017
  • 187. 77 Parallel Dataflow engines implement • Relational operators on row-partitioned datasets • Linear algebra operators on block-partitioned matrices INTRODUCTION ⋈ ML𝒇 BlockJoin: Efficient Matrix Partitioning Through Joins, Andreas Kunft et al. PVLDB, 2017 |
  • 188. Introduction: Parallel dataflow engines implement • Relational operators on row-partitioned datasets • Linear algebra operators on block-partitioned matrices >> Pipelines combining both require expensive re-partitioning (shuffle) steps
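A minimal sketch of why the two layouts clash, written in plain Python rather than on a dataflow engine. The function name `block_partition`, the square block size, and the sample rows are assumptions made for illustration; they are not taken from the talk.

```python
from collections import defaultdict

def block_partition(indexed_rows, block_size):
    """Split (row_index, values) pairs into square blocks.

    Returns a dict keyed by (block_row, block_col); each block maps the
    cell position inside the block to the cell value.
    """
    blocks = defaultdict(dict)
    for row_idx, values in indexed_rows:
        for col_idx, value in enumerate(values):
            block_key = (row_idx // block_size, col_idx // block_size)
            local_pos = (row_idx % block_size, col_idx % block_size)
            blocks[block_key][local_pos] = value
    return dict(blocks)

# Four rows that already carry a global row-index (hypothetical data).
rows = [(0, [1, 1, 1, 1]), (1, [2, 2, 2, 2]), (2, [3, 3, 3, 3]), (3, [4, 4, 4, 4])]
blocks = block_partition(rows, block_size=2)
```

Every row contributes cells to several blocks, and every block needs cells from several rows. When the rows live on different workers, building the blocks therefore requires exchanging data, which is the shuffle step the slide refers to.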
  • 189. Standard workflow. [Figure: Products (PK) ⋈ Reviews (FK) yields a row-wise partitioned join result. Example data: Products = (P1, 1), (P2, 2), (P3, 3); Reviews = (P1, 1, 1, 1), (P2, 2, 2, 2), (P1, 3, 3, 3), (P1, 4, 4, 4); join result = (P1, 1, 1, 1, 1), (P2, 2, 2, 2, 2), (P1, 1, 3, 3, 3), (P1, 1, 4, 4, 4).]
  • 190. Standard workflow. [Figure: the row-wise join result of Products (PK) and Reviews (FK) receives a global row-index and is then split into a block-partitioned matrix.]
  • 191. Standard workflow: problems. [Figure: the same pipeline (join, global row-index, block-partitioned matrix), with the steps "Distributed Join" and "Re-Partitioning" highlighted.]
  • 192. Standard workflow: problems. Materializes the join result just to apply a sequential row-index: • Shuffles data for row-wise partitioning, which is split up immediately • Puts heavy load on a few machines in case of skewed keys • Forces early matrix block materialization. [Figure: the pipeline from join to block-partitioned matrix, with the steps "Distributed Join" and "Re-Partitioning" highlighted.]
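The baseline criticized above can be sketched in a few lines of single-process Python, using the Products and Reviews example data from the workflow figure. The helper name `baseline_join` and the list-based table encoding are assumptions for illustration; a real engine would run the join distributed.

```python
from collections import Counter

def baseline_join(products, reviews):
    """Materialize the full PK-FK join result, one row per matching review."""
    product_by_key = dict(products)
    return [
        (fk, product_by_key[fk] + review_feats)
        for fk, review_feats in reviews
        if fk in product_by_key
    ]

products = [("P1", [1]), ("P2", [2]), ("P3", [3])]
reviews = [("P1", [1, 1, 1]), ("P2", [2, 2, 2]), ("P1", [3, 3, 3]), ("P1", [4, 4, 4])]

joined = baseline_join(products, reviews)

# The global row-index can only be assigned once the whole join result exists.
indexed = [(row_idx, feats) for row_idx, (_, feats) in enumerate(joined)]

# Rows per join key: a hash-partitioned join sends all rows of one key
# to the same worker, so a frequent key concentrates the load there.
rows_per_key = Counter(key for key, _ in joined)
```

In this toy input, three of the four joined rows share the key P1, which hints at how skewed keys pile work onto a single machine.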
  • 193. How can we improve? • We propose specialized operators at the intersection of linear and relational algebra • Here, we focus on efficient creation of block-partitioned results from normalized data
  • 194. Our approach. [Figure: Products (PK) and Reviews (FK) pass through a local join kernel (local TID-join, prune, apply row-index) and a distributed fetch kernel, which yields the block-partitioned matrix.]
  • 195. Our approach: creates block-partitioned results from normalized data. JOIN KERNEL: local TID-join on the driver to create block-index meta-data: 1. Meta-data provides a mapping of TID to row-index for both relations 2. Row-index is applied independently: no materialization of the join result
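The join kernel idea can be sketched as follows: only (TID, key) pairs are joined, and the resulting meta-data tells each relation which row-indices its tuples map to. This is a simplified single-process illustration, not the authors' implementation. The function name `tid_join`, the use of list positions as TIDs, and the choice to number rows in FK order are all assumptions.

```python
from collections import defaultdict

def tid_join(pk_tids, fk_tids):
    """Join (tid, key) pairs only; return (row_index, pk_tid, fk_tid) triples."""
    pk_tid_by_key = {key: tid for tid, key in pk_tids}
    matches = [
        (pk_tid_by_key[key], fk_tid)
        for fk_tid, key in fk_tids
        if key in pk_tid_by_key
    ]
    return [(row_idx, pk_tid, fk_tid) for row_idx, (pk_tid, fk_tid) in enumerate(matches)]

# Only tuple ids and join keys are collected, never the feature columns.
pk_tids = [(0, "P1"), (1, "P2"), (2, "P3")]
fk_tids = [(0, "P1"), (1, "P2"), (2, "P1"), (3, "P1")]
index = tid_join(pk_tids, fk_tids)

# Per-relation meta-data: which row-indices each tuple contributes to.
pk_rows = defaultdict(list)
fk_rows = {}
for row_idx, pk_tid, fk_tid in index:
    pk_rows[pk_tid].append(row_idx)
    fk_rows[fk_tid] = row_idx
```

With this meta-data, each relation can attach row-indices to its own tuples without the joined rows ever being built. The unmatched product P3 (TID 2) does not appear in `pk_rows`, which corresponds to the pruning step.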
  • 196. Our approach: creates block-partitioned results from normalized data. JOIN KERNEL: local TID-join on the driver to create block-index meta-data. FETCH KERNEL: materialization strategy of matrix blocks based on matrix shape: • Late materialization: blocks are materialized on the receiver node, for |PK columns| >> |FK columns| • Early materialization: blocks are materialized on the sender node, for |PK columns| << |FK columns|
  • 197. Evaluation
  • 198. PK-FK join. PK table: 100k rows, scaling columns; FK table: 1M rows, 5k columns. [Figure panels: a. uniformly distributed FKs; b. power-law distributed FKs.] Results: up to 2.5x speedup; skew resistant, while the baseline fails
  • 199. PK-FK join. PK table: 100k rows, scaling columns; FK table: 1M rows, 5k columns. [Figure panels: a. uniformly distributed FKs; b. power-law distributed FKs.]
  • 200. Recap: BlockJoin is a logically fused operator pipeline • Separation of matrix index creation and matrix materialization > No materialization of join result > Skew resistant • Cost-based block materialization based on data shape > Late materialization > Early materialization
  • 201. Further Publications: • BlockJoin: Efficient Matrix Partitioning Through Joins, Andreas Kunft, Asterios Katsifodimos, Sebastian Schelter, Tilmann Rabl, and Volker Markl, PVLDB 10(13), 2017 • Bridging the gap: towards optimization across linear and relational algebra, BeyondMR 2016 • Implicit Parallelism through Deep Language Embedding, SIGMOD 2015 • ScootR: Scaling R Dataframes on Dataflow Systems, SoCC 2018
  • 202. Database Research at TU Berlin Today‘s Talks: Jonas Traub Sebastian Breß Martin Kiefer Andreas Kunft Optimized On-Demand Data Streaming from Sensor Nodes ACM Symposium on Cloud Computing (SoCC), 2017. Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models Proceedings of the VLDB Endowment (PVLDB), 2017. Generating Custom Code for Efficient Query Execution on Heterogeneous Processors The VLDB Journal, 27(6), 2018. BlockJoin: Efficient Matrix Partitioning Through Joins Proceedings of the VLDB Endowment (PVLDB), 2017. Database Systems and Information Management Group (DIMA) of Volker Markl