SlideShare ist ein Scribd-Unternehmen logo
1 von 81
1©2014 TransLattice, Inc. All Rights Reserved. 1
Supersized PostgreSQL:
Postgres-XL for Scale-out
OLTP and Big Data Analytics
June 17, 2014
Mason Sharp
2©2014 TransLattice, Inc. All Rights Reserved.
Agenda
•  Postgres-XL Background
•  Architecture Overview
•  Distributing Tables
•  Configuring a Local Cluster
•  Backup and Recovery
•  Differences to PostgreSQL and Postgres-XC
•  Community
3©2014 TransLattice, Inc. All Rights Reserved.
Announced May 2014
4©2014 TransLattice, Inc. All Rights Reserved.
Postgres-XL
n  Open Source
n  Scale-out RDBMS
–  Massive Parallel Processing
–  OLTP write and read scalability
–  Multi-tenant security
–  Cluster-wide ACID properties
–  Can also act as distributed key-value store
n  Mixed workloads, can eliminate need for separate
clusters in some cases
5©2014 TransLattice, Inc. All Rights Reserved.
XL Origins
Considered open sourcing very
early
6©2014 TransLattice, Inc. All Rights Reserved.
Postgres-XL Applications and
Use Case Examples
Postgres-XL Application Use Case Example
Business intelligence
requiring
MPP parallelism
Financial services big data
analysis of opportunities
across service lines
OLTP write-intensive
workloads
Online ad tracking
Mixed workload
environments
Real-time reporting on a
transactional system
8©2014 TransLattice, Inc. All Rights Reserved.
Postgres-XL Technology
•  Data is automatically
spread across cluster
•  Queries return back
data from all nodes in
parallel
•  The Global Transaction
Manager maintains a
consistent view of the
data across cluster
•  Connect to any
coordinator
9©2014 TransLattice, Inc. All Rights Reserved.
Postgres-XL Features
•  Multi Version Concurrency Control
•  Referential Integrity*
•  Full Text Search
•  Geo-spatial Functions and Data Types
•  JSON and XML Support
•  Distributed Key-Value Store
10©2014 TransLattice, Inc. All Rights Reserved.
Postgres-XL Missing Features
•  Built in High Availability
•  Use external like Corosync/Pacemaker
•  Area for future improvement
•  “Easy” to re-shard / add nodes
•  Some pieces there
•  Can however “pre-shard” multiple nodes on
the same server or VM
•  Some FK and UNIQUE constraints
11©2014 TransLattice, Inc. All Rights Reserved.
Postgres-XL Connectors
Looks like a PostgreSQL 9.2 database
•  C
•  C++
•  Perl
•  Python
•  PHP
•  Erlang
•  Haskell
– Java
– Javascript
– .NET
– Node.js
– Ruby
– Scala
13©2014 TransLattice, Inc. All Rights Reserved.
Performance
Transactional work loads
10 50 100 200 300 400
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
Dell DVD Benchmark
StormDB
Amazon RDS
14©2014 TransLattice, Inc. All Rights Reserved.
MPP Performance – DBT-1 (TPC-H)
Postgres-XL
PostgreSQL
Postgres-XL
15©2014 TransLattice, Inc. All Rights Reserved.
Key-value Store Comparison
Postgres-XL
MongoDB
17©2014 TransLattice, Inc. All Rights Reserved.
Postgres-XL Architecture
•  Based on the Postgres-XC project
•  Coordinators
•  Connection point for applications
•  Parsing and planning of queries
•  Data Nodes
•  Actual data storage
•  Local execution of queries
•  Global Transaction Manager (GTM)
•  Provides a consistent view to transactions
•  GTM Proxy to increase performance
18©2014 TransLattice, Inc. All Rights Reserved.
Coordinators
•  Handles network connectivity to the client
•  Parse and plan statements
•  Perform final query processing of intermediate
result sets
•  Manages two-phase commit
•  Stores global catalog information
19©2014 TransLattice, Inc. All Rights Reserved.
Data Nodes
•  Stores tables and indexes
•  Only coordinators connects to data nodes
•  Executes queries from the coordinators
•  Communicates with peer data nodes for
distributed joins
20©2014 TransLattice, Inc. All Rights Reserved.
Global Transaction Manager (GTM)
•  Handles necessary MVCC tasks
•  Transaction IDs
•  Snapshots
•  Manages cluster wide values
•  Timestamps
•  Sequences
•  GTM Standby can be configured
21©2014 TransLattice, Inc. All Rights Reserved.
GTM Proxy
•  Runs alongside Coordinator or Datanode
•  Groups requests together
•  Obtain range of transaction ids (XIDs)
•  Obtain snapshot
22©2014 TransLattice, Inc. All Rights Reserved.
Data Distribution
•  Replicated Tables
•  Each row is replicated
to all data nodes
•  Statement based
replication
•  Distributed Tables
•  Each row is stored on
one data node
•  Available strategies
•  Hash
•  Round Robin
•  Modulo
23©2014 TransLattice, Inc. All Rights Reserved.
Availability
•  No Single Point of
Failure
•  Global Transaction
Manager Standby
•  Coordinators are load
balanced
•  Streaming replication for
data nodes
•  But, currently manual…
24©2014 TransLattice, Inc. All Rights Reserved.
DDL
25©2014 TransLattice, Inc. All Rights Reserved.
Defining Tables
CREATE TABLE my_table (…)
DISTRIBUTE BY
HASH(col) | MODULO(col) | ROUNDROBIN | REPLICATION
[ TO NODE (nodename[,nodename…])]
26©2014 TransLattice, Inc. All Rights Reserved.
Defining Tables
CREATE TABLE state_code (
state_code CHAR(2),
state_name VARCHAR,
:
) DISTRIBUTE BY REPLICATION
An exact replica on each node
27©2014 TransLattice, Inc. All Rights Reserved.
Defining Tables
CREATE TABLE invoice (
invoice_id SERIAL,
cust_id INT,
:
) DISTRIBUTE BY HASH(invoice_id)
Distributed evenly amongst sharded nodes
28©2014 TransLattice, Inc. All Rights Reserved.
Defining Tables
CREATE TABLE invoice (
invoice_id SERIAL,
cust_id INT ….
) DISTRIBUTE BY HASH(invoice_id);
CREATE TABLE lineitem (
lineitem_id SERIAL,
invoice_id INT …
) DISTRIBUTE BY HASH(invoice_id);
Joins on invoice_id can be pushed down to the datanodes
29©2014 TransLattice, Inc. All Rights Reserved.
test2=# create table t1 (col1 int,
col2 int) distribute by
hash(col1);
test2=# create table t2 (col3 int,
col4 int) distribute by
hash(col3);
30©2014 TransLattice, Inc. All Rights Reserved.
explain select * from t1 inner join t2 on
col1 = col3;
Remote Subquery Scan on all
(datanode_1,datanode_2)
-> Hash Join
Hash Cond:
-> Seq Scan on t1
-> Hash
-> Seq Scan on t2
Join push-down. Looks
much like regular
PostgreSQL
31©2014 TransLattice, Inc. All Rights Reserved.
test2=# explain select * from t1 inner join t2
on col2 = col3;
Remote Subquery Scan on all
(datanode_1,datanode_2)
-> Hash Join
Hash Cond:
-> Remote Subquery Scan on all
(datanode_1,datanode_2)
Distribute results by H: col2
-> Seq Scan on t1
-> Hash
-> Seq Scan on t2
Will read t1.col2 once and
put in shared queue for
consumption for other
datanodes for joining.
Datanode-datanode
communication and
parallelism
34©2014 TransLattice, Inc. All Rights Reserved.
35©2014 TransLattice, Inc. All Rights Reserved.
Configuration
36©2014 TransLattice, Inc. All Rights Reserved.
37©2014 TransLattice, Inc. All Rights Reserved.
Use pgxc_ctl!
(demo in a few minutes)
38©2014 TransLattice, Inc. All Rights Reserved.
Otherwise, PostgreSQL-like
steps apply:
initgtm
initdb
39©2014 TransLattice, Inc. All Rights Reserved.
postgresql.conf
•  Very similar to regular PostgreSQL
•  But there are extra parameters
•  max_pool_size
•  min_pool_size
•  pool_maintenance _timeout
•  remote_query_cost
•  network_byte_cost
•  sequence_range
•  pooler_port
•  gtm_host, gtm_port
•  shared_queues
•  shared_queue_size
40©2014 TransLattice, Inc. All Rights Reserved.
Install
n  Download RPMs
–  http://sourceforge.net/projects/postgres-xl/
files/Releases/Version_9.2rc/
Or
n  configure; make; make install
41©2014 TransLattice, Inc. All Rights Reserved.
Setup Environment
•  .bashrc
•  ssh used, so ad installation bin dir to
PATH
42©2014 TransLattice, Inc. All Rights Reserved.
Avoid password with pgxc_ctl
•  ssh-keygen –t rsa (in ~/.ssh)
•  For local test, no need to copy key,
already there
•  cat ~/.ssh/id_rsa.pub >> ~/.ssh/
authorized_keys
43©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl
•  Initialize cluster
•  Start/stop
•  Deploy to node
•  Add coordinator
•  Add data node / slave
•  Add GTM slave
44©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl.conf
•  Let’s create a local test cluster!
•  One GTM
•  One Coordinator
•  Two Datanodes
Make sure all are using different ports
•  including pooler_port
45©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl.conf
pgxcOwner=pgxl
pgxcUser=$pgxcOwner
pgxcInstallDir=/usr/postgres-xl-9.2
46©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl.conf
gtmMasterDir=/data/gtm_master
gtmMasterServer=localhost
gtmSlave=n
gtmProxy=n
47©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl.conf
coordMasterDir=/data/coord_master
coordNames=(coord1)
coordPorts=(5432)
poolerPorts=(20010)
coordMasterServers=(localhost)
coordMasterDirs=($coordMasterDir/coord1)
48©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl.conf
coordMaxWALSenders=($coordMaxWALsernder)
coordSlave=n
coordSpecificExtraConfig=(none)
coordSpecificExtraPgHba=(none)
49©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl.conf
datanodeNames=(dn1 dn2)
datanodeMasterDir=/data/dn_master
datanodePorts=(5433 5434)
datanodePoolerPorts=(20011 20012)
datanodeMasterServers=(localhost localhost)
datanodeMasterDirs=($datanodeMasterDir/dn1
$datanodeMasterDir/dn2)
50©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl.conf
datanodeMaxWALSenders=($datanodeMaxWalSe
nder $datanodeMaxWalSender)
datanodeSlave=n
primaryDatanode=dn1
51©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl
pgxc_ctl init all
Now have a working cluster
52©2014 TransLattice, Inc. All Rights Reserved.
pgxc_ctl – Add a second Coordinator
Format:
pgxc_ctl add coordinator master name host port
pooler data_dir
pgxc_ctl add coordinator master coord2 localhost
15432 20020 /data/coord_master/coord2
53©2014 TransLattice, Inc. All Rights Reserved.
PostgreSQL Differences
54©2014 TransLattice, Inc. All Rights Reserved.
pg_catalog
•  pgxc_node
•  Coordinator and Datanode definition
•  Currently not GTM
•  pgxc_class
•  Table replication or distribution info
55©2014 TransLattice, Inc. All Rights Reserved.
Additional Commands
•  PAUSE CLUSTER / UNPAUSE CLUSTER
•  Wait for current transactions to complete
•  Prevent new commands until UNPAUSE
•  EXECUTE DIRECT
ON (nodename) ‘command’
•  CLEAN CONNECTION
•  Cleans connection pool
•  CREATE NODE / ALTER NODE
56©2014 TransLattice, Inc. All Rights Reserved.
Noteworthy
•  SELECT pgxc_pool_reload()
•  pgxc_clean for 2PC
•  Cleans prepared transactions
•  If committed on one node, must be
committed on all
•  If aborted on one node, must be aborted
on all
•  If both committed and aborted output
error
57©2014 TransLattice, Inc. All Rights Reserved.
Backup and Recovery
58©2014 TransLattice, Inc. All Rights Reserved.
Backup
•  Like PostgreSQL
•  pg_dump, pg_dumpall from any Coordinator
•  Will pull all data from all nodes in one central
location
•  No notion yet of parallel pg_dump of individual
nodes on node-by-node basis, nor parallel
pg_restore
59©2014 TransLattice, Inc. All Rights Reserved.
Cold Backup
•  Stop Cluster
•  Backup individual data directories on nodes’
file system
•  Gzip, compress, rsync off, etc
•  Cold restore do the opposite
60©2014 TransLattice, Inc. All Rights Reserved.
Hot Backup
•  Similar to PostgreSQL
•  Base backup individual nodes
•  use streaming replication or WAL archiving
•  Backup GTM control file
•  Contains XID
•  Contains Sequence values
•  Periodically written to
•  On restart, it will skip ahead some values
What about PITR?
61©2014 TransLattice, Inc. All Rights Reserved.
Hot Backup - Barriers
•  CREATE BARRIER ‘barrier_name’
•  Waits for currently committing transactions to
finish
•  Writes a barrier in each node’s WAL for a
consistent restoration point
•  Currently done manually!
•  Could be configured as background process in the
future to periodically write
In recovery.conf, use recovery_target_barrier
62©2014 TransLattice, Inc. All Rights Reserved.
Similarities and differences
compared to
Postgres-XC
63©2014 TransLattice, Inc. All Rights Reserved.
Code Merging
PostgreSQL
Postgres-XC
Postgres-XL
Postgres-XC
Enhancements
Postgres-XL
Enhancements
Planner and Executor
are different
64©2014 TransLattice, Inc. All Rights Reserved.
Similarities with
Postgres-XC
65©2014 TransLattice, Inc. All Rights Reserved.
Some of the original
Postgres-XC developers
developed Postgres-XL
66©2014 TransLattice, Inc. All Rights Reserved.
Global Transaction Manager
67©2014 TransLattice, Inc. All Rights Reserved.
Pooler
68©2014 TransLattice, Inc. All Rights Reserved.
pg_catalog
69©2014 TransLattice, Inc. All Rights Reserved.
pgxc_*
pgxc_ctl, pgxc_node
70©2014 TransLattice, Inc. All Rights Reserved.
“pgxc” =
Postgres-XL Cluster
J
71©2014 TransLattice, Inc. All Rights Reserved.
Differences compared to
Postgres-XC
72©2014 TransLattice, Inc. All Rights Reserved.
n  Performance
n  Multi-tenant security
n  Statistics
73©2014 TransLattice, Inc. All Rights Reserved.
Performance
Data node ßà Data node Communication
–  Avoid Coordinator-centric processing
–  Can be many times faster for some queries
Pooler needs to be configured for each Data node,
not just coordinator
–  More connections needed at each node
74©2014 TransLattice, Inc. All Rights Reserved.
Performance
Re-parse avoidance
In XC, SQL strings are sent to other nodes from
Coordinator, where they are reparsed and replanned.
In XL, plan once on coordinator, serialize and send
over
75©2014 TransLattice, Inc. All Rights Reserved.
Performance
Sequences
Detect frequently accessed sequences, fetch range
of values, avoid fewer round trips to GTM
COPY
INSERT SELECT
76©2014 TransLattice, Inc. All Rights Reserved.
Multi-tenant Security
User can only get own info from global pg_catalog
pg_database, pg_user
Harder to hack if names of other database objects
are unknown
77©2014 TransLattice, Inc. All Rights Reserved.
Statistics
storm_stats
Basic DML stats by time range
pg_stat_statements may be sufficient
78©2014 TransLattice, Inc. All Rights Reserved.
Differences
Planner and Executor are different
79©2014 TransLattice, Inc. All Rights Reserved.
Differences
GUC
n  remote_query_cost
n  network_byte_cost
n  pool_maintenance_timeout
n  sequence_range
n  shared_queues
n  shared_queue_size
80©2014 TransLattice, Inc. All Rights Reserved.
Looking at Postgres-XL Code
n  XC
#ifdef PGXC
n  XL
#ifdef XCP
#ifndef XCP
81©2014 TransLattice, Inc. All Rights Reserved.
Community
82©2014 TransLattice, Inc. All Rights Reserved.
Website and Code
n  postgres-xl.org
n  sourceforge.net/projects/postgres-xl
n  Will add mirror for github.com
n  Docs
–  files.postgres-xl.org/documentation/index.html
84©2014 TransLattice, Inc. All Rights Reserved.
Philosophy
n  Stability and bug fixes over features
n  Performance-focused
n  Open and inclusive
–  Welcome input for roadmap, priorities and design
–  Welcome others to become core members and
contributors
85©2014 TransLattice, Inc. All Rights Reserved.
Planned Community Structure
n  Like PostgreSQL
–  Core
–  Major Contributors
–  Minor Contributors
–  Advocacy
87©2014 TransLattice, Inc. All Rights Reserved.
Roadmap
n  Catch up to PostgreSQL Community -> 9.3, 9.4
n  Make XL a more robust analytical platform
–  Distributed Foreign Data Wrappers
n  Remove blockers to adoption
n  Native High Availability
88©2014 TransLattice, Inc. All Rights Reserved.
Thank You!
msharp@translattice.com
@mason_db

Weitere ähnliche Inhalte

Was ist angesagt?

Integrating Apache Kafka and Elastic Using the Connect Framework
Integrating Apache Kafka and Elastic Using the Connect FrameworkIntegrating Apache Kafka and Elastic Using the Connect Framework
Integrating Apache Kafka and Elastic Using the Connect Frameworkconfluent
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...DataStax
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow ArchitectureGerard Toonstra
 
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringHadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringErik Krogen
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScyllaDB
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniZalando Technology
 
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...DataStax
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introductionRico Chen
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariDataWorks Summit
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflowmutt_data
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to PrometheusJulien Pivotto
 
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...HostedbyConfluent
 
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Daniel Hochman
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil
 
Loki - like prometheus, but for logs
Loki - like prometheus, but for logsLoki - like prometheus, but for logs
Loki - like prometheus, but for logsJuraj Hantak
 
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...HostedbyConfluent
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
 

Was ist angesagt? (20)

Integrating Apache Kafka and Elastic Using the Connect Framework
Integrating Apache Kafka and Elastic Using the Connect FrameworkIntegrating Apache Kafka and Elastic Using the Connect Framework
Integrating Apache Kafka and Elastic Using the Connect Framework
 
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla, D...
 
Apache Airflow Architecture
Apache Airflow ArchitectureApache Airflow Architecture
Apache Airflow Architecture
 
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringHadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
Advanced Cassandra Operations via JMX (Nate McCall, The Last Pickle) | C* Sum...
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with Ambari
 
Introduction to Apache Airflow
Introduction to Apache AirflowIntroduction to Apache Airflow
Introduction to Apache Airflow
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...
Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan...
 
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
 
Loki - like prometheus, but for logs
Loki - like prometheus, but for logsLoki - like prometheus, but for logs
Loki - like prometheus, but for logs
 
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
Monitoring Kafka without instrumentation using eBPF with Antón Rodríguez | Ka...
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 

Andere mochten auch

Bn 1016 demo postgre sql-online-training
Bn 1016 demo  postgre sql-online-trainingBn 1016 demo  postgre sql-online-training
Bn 1016 demo postgre sql-online-trainingconline training
 
Making Postgres Central in Your Data Center
Making Postgres Central in Your Data CenterMaking Postgres Central in Your Data Center
Making Postgres Central in Your Data CenterEDB
 
Why PostgreSQL for Analytics Infrastructure (DW)?
Why PostgreSQL for Analytics Infrastructure (DW)?Why PostgreSQL for Analytics Infrastructure (DW)?
Why PostgreSQL for Analytics Infrastructure (DW)?Huy Nguyen
 
5 Postgres DBA Tips
5 Postgres DBA Tips5 Postgres DBA Tips
5 Postgres DBA TipsEDB
 
One Coin One Brick project
One Coin One Brick projectOne Coin One Brick project
One Coin One Brick projectHuy Nguyen
 
What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6EDB
 
PostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability ImprovementsPostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability ImprovementsPGConf APAC
 

Andere mochten auch (7)

Bn 1016 demo postgre sql-online-training
Bn 1016 demo  postgre sql-online-trainingBn 1016 demo  postgre sql-online-training
Bn 1016 demo postgre sql-online-training
 
Making Postgres Central in Your Data Center
Making Postgres Central in Your Data CenterMaking Postgres Central in Your Data Center
Making Postgres Central in Your Data Center
 
Why PostgreSQL for Analytics Infrastructure (DW)?
Why PostgreSQL for Analytics Infrastructure (DW)?Why PostgreSQL for Analytics Infrastructure (DW)?
Why PostgreSQL for Analytics Infrastructure (DW)?
 
5 Postgres DBA Tips
5 Postgres DBA Tips5 Postgres DBA Tips
5 Postgres DBA Tips
 
One Coin One Brick project
One Coin One Brick projectOne Coin One Brick project
One Coin One Brick project
 
What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6What's New in PostgreSQL 9.6
What's New in PostgreSQL 9.6
 
PostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability ImprovementsPostgreSQL 9.6 Performance-Scalability Improvements
PostgreSQL 9.6 Performance-Scalability Improvements
 

Ähnlich wie Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics

Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120Hyoungjun Kim
 
Tez Data Processing over Yarn
Tez Data Processing over YarnTez Data Processing over Yarn
Tez Data Processing over YarnInMobi Technology
 
Apache Tez -- A modern processing engine
Apache Tez -- A modern processing engineApache Tez -- A modern processing engine
Apache Tez -- A modern processing enginebigdatagurus_meetup
 
MapR-DB Elasticsearch Integration
MapR-DB Elasticsearch IntegrationMapR-DB Elasticsearch Integration
MapR-DB Elasticsearch IntegrationMapR Technologies
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AITyrone Systems
 
Apache Thrift, a brief introduction
Apache Thrift, a brief introductionApache Thrift, a brief introduction
Apache Thrift, a brief introductionRandy Abernethy
 
Tez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthelTez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthelt3rmin4t0r
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Drill into Drill – How Providing Flexibility and Performance is Possible
Drill into Drill – How Providing Flexibility and Performance is PossibleDrill into Drill – How Providing Flexibility and Performance is Possible
Drill into Drill – How Providing Flexibility and Performance is PossibleMapR Technologies
 
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)Gruter
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...HostedbyConfluent
 
Tez big datacamp-la-bikas_saha
Tez big datacamp-la-bikas_sahaTez big datacamp-la-bikas_saha
Tez big datacamp-la-bikas_sahaData Con LA
 
OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoNathaniel Braun
 
Large Scale Data Analytics with Spark and Cassandra on the DSE Platform
Large Scale Data Analytics with Spark and Cassandra on the DSE PlatformLarge Scale Data Analytics with Spark and Cassandra on the DSE Platform
Large Scale Data Analytics with Spark and Cassandra on the DSE PlatformDataStax Academy
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impalamarkgrover
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Michael Renner
 
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...Dipti Borkar
 
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014NoSQLmatters
 

Ähnlich wie Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics (20)

Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120
 
Tez Data Processing over Yarn
Tez Data Processing over YarnTez Data Processing over Yarn
Tez Data Processing over Yarn
 
Apache Tez -- A modern processing engine
Apache Tez -- A modern processing engineApache Tez -- A modern processing engine
Apache Tez -- A modern processing engine
 
MapR-DB Elasticsearch Integration
MapR-DB Elasticsearch IntegrationMapR-DB Elasticsearch Integration
MapR-DB Elasticsearch Integration
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
 
Apache Thrift, a brief introduction
Apache Thrift, a brief introductionApache Thrift, a brief introduction
Apache Thrift, a brief introduction
 
Tez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthelTez: Accelerating Data Pipelines - fifthel
Tez: Accelerating Data Pipelines - fifthel
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Drill into Drill – How Providing Flexibility and Performance is Possible
Drill into Drill – How Providing Flexibility and Performance is PossibleDrill into Drill – How Providing Flexibility and Performance is Possible
Drill into Drill – How Providing Flexibility and Performance is Possible
 
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean)
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
 
Tez big datacamp-la-bikas_saha
Tez big datacamp-la-bikas_sahaTez big datacamp-la-bikas_saha
Tez big datacamp-la-bikas_saha
 
OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ Criteo
 
Spark etl
Spark etlSpark etl
Spark etl
 
Large Scale Data Analytics with Spark and Cassandra on the DSE Platform
Large Scale Data Analytics with Spark and Cassandra on the DSE PlatformLarge Scale Data Analytics with Spark and Cassandra on the DSE Platform
Large Scale Data Analytics with Spark and Cassandra on the DSE Platform
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014
 
RISC V in Spacer
RISC V in SpacerRISC V in Spacer
RISC V in Spacer
 
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
Presto – Today and Beyond – The Open Source SQL Engine for Querying all Data...
 
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
 

Kürzlich hochgeladen

Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 

Kürzlich hochgeladen (20)

Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 

Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics

  • 1. 1©2014 TransLattice, Inc. All Rights Reserved. 1 Supersized PostgreSQL: Postgres-XL for Scale-out OLTP and Big Data Analytics June 17, 2014 Mason Sharp
  • 2. 2©2014 TransLattice, Inc. All Rights Reserved. Agenda •  Postgres-XL Background •  Architecture Overview •  Distributing Tables •  Configuring a Local Cluster •  Backup and Recovery •  Differences to PostgreSQL and Postgres-XC •  Community
  • 3. 3©2014 TransLattice, Inc. All Rights Reserved. Announced May 2014
  • 4. 4©2014 TransLattice, Inc. All Rights Reserved. Postgres-XL n  Open Source n  Scale-out RDBMS –  Massive Parallel Processing –  OLTP write and read scalability –  Multi-tenant security –  Cluster-wide ACID properties –  Can also act as distributed key-value store n  Mixed workloads, can eliminate need for separate clusters in some cases
  • 5. 5©2014 TransLattice, Inc. All Rights Reserved. XL Origins Considered open sourcing very early
  • 6. 6©2014 TransLattice, Inc. All Rights Reserved. Postgres-XL Applications and Use Case Examples Postgres-XL Application Use Case Example Business intelligence requiring MPP parallelism Financial services big data analysis of opportunities across service lines OLTP write-intensive workloads Online ad tracking Mixed workload environments Real-time reporting on a transactional system
  • 7. 8©2014 TransLattice, Inc. All Rights Reserved. Postgres-XL Technology •  Data is automatically spread across cluster •  Queries return back data from all nodes in parallel •  The Global Transaction Manager maintains a consistent view of the data across cluster •  Connect to any coordinator
  • 8. 9©2014 TransLattice, Inc. All Rights Reserved. Postgres-XL Features •  Multi Version Concurrency Control •  Referential Integrity* •  Full Text Search •  Geo-spatial Functions and Data Types •  JSON and XML Support •  Distributed Key-Value Store
  • 9. 10©2014 TransLattice, Inc. All Rights Reserved. Postgres-XL Missing Features •  Built in High Availability •  Use external like Corosync/Pacemaker •  Area for future improvement •  “Easy” to re-shard / add nodes •  Some pieces there •  Can however “pre-shard” multiple nodes on the same server or VM •  Some FK and UNIQUE constraints
  • 10. 11©2014 TransLattice, Inc. All Rights Reserved. Postgres-XL Connectors Looks like a PostgreSQL 9.2 database •  C •  C++ •  Perl •  Python •  PHP •  Erlang •  Haskell – Java – Javascript – .NET – Node.js – Ruby – Scala
  • 11. 13©2014 TransLattice, Inc. All Rights Reserved. Performance Transactional work loads 10 50 100 200 300 400 0 2000 4000 6000 8000 10000 12000 14000 16000 18000 Dell DVD Benchmark StormDB Amazon RDS
  • 12. 14©2014 TransLattice, Inc. All Rights Reserved. MPP Performance – DBT-1 (TPC-H) Postgres-XL PostgreSQL Postgres-XL
  • 13. 15©2014 TransLattice, Inc. All Rights Reserved. Key-value Store Comparison Postgres-XL MongoDB
  • 14. 17©2014 TransLattice, Inc. All Rights Reserved. Postgres-XL Architecture •  Based on the Postgres-XC project •  Coordinators •  Connection point for applications •  Parsing and planning of queries •  Data Nodes •  Actual data storage •  Local execution of queries •  Global Transaction Manager (GTM) •  Provides a consistent view to transactions •  GTM Proxy to increase performance
  • 15. 18©2014 TransLattice, Inc. All Rights Reserved. Coordinators •  Handles network connectivity to the client •  Parse and plan statements •  Perform final query processing of intermediate result sets •  Manages two-phase commit •  Stores global catalog information
  • 16. 19©2014 TransLattice, Inc. All Rights Reserved. Data Nodes •  Stores tables and indexes •  Only coordinators connects to data nodes •  Executes queries from the coordinators •  Communicates with peer data nodes for distributed joins
  • 17. 20©2014 TransLattice, Inc. All Rights Reserved. Global Transaction Manager (GTM) •  Handles necessary MVCC tasks •  Transaction IDs •  Snapshots •  Manages cluster wide values •  Timestamps •  Sequences •  GTM Standby can be configured
  • 18. 21©2014 TransLattice, Inc. All Rights Reserved. GTM Proxy •  Runs alongside Coordinator or Datanode •  Groups requests together •  Obtain range of transaction ids (XIDs) •  Obtain snapshot
  • 19. 22©2014 TransLattice, Inc. All Rights Reserved. Data Distribution •  Replicated Tables •  Each row is replicated to all data nodes •  Statement based replication •  Distributed Tables •  Each row is stored on one data node •  Available strategies •  Hash •  Round Robin •  Modulo
  • 20. 23©2014 TransLattice, Inc. All Rights Reserved. Availability •  No Single Point of Failure •  Global Transaction Manager Standby •  Coordinators are load balanced •  Streaming replication for data nodes •  But, currently manual…
  • 21. 24©2014 TransLattice, Inc. All Rights Reserved. DDL
  • 22. 25©2014 TransLattice, Inc. All Rights Reserved. Defining Tables CREATE TABLE my_table (…) DISTRIBUTE BY HASH(col) | MODULO(col) | ROUNDROBIN | REPLICATION [ TO NODE (nodename[,nodename…])]
  • 23. 26©2014 TransLattice, Inc. All Rights Reserved. Defining Tables CREATE TABLE state_code ( state_code CHAR(2), state_name VARCHAR, : ) DISTRIBUTE BY REPLICATION An exact replica on each node
  • 24. 27©2014 TransLattice, Inc. All Rights Reserved. Defining Tables CREATE TABLE invoice ( invoice_id SERIAL, cust_id INT, : ) DISTRIBUTE BY HASH(invoice_id) Distributed evenly amongst sharded nodes
  • 25. 28©2014 TransLattice, Inc. All Rights Reserved. Defining Tables CREATE TABLE invoice ( invoice_id SERIAL, cust_id INT …. ) DISTRIBUTE BY HASH(invoice_id); CREATE TABLE lineitem ( lineitem_id SERIAL, invoice_id INT … ) DISTRIBUTE BY HASH(invoice_id); Joins on invoice_id can be pushed down to the datanodes
  • 26. 29©2014 TransLattice, Inc. All Rights Reserved. test2=# create table t1 (col1 int, col2 int) distribute by hash(col1); test2=# create table t2 (col3 int, col4 int) distribute by hash(col3);
  • 27. 30©2014 TransLattice, Inc. All Rights Reserved. explain select * from t1 inner join t2 on col1 = col3; Remote Subquery Scan on all (datanode_1,datanode_2) -> Hash Join Hash Cond: -> Seq Scan on t1 -> Hash -> Seq Scan on t2 Join push-down. Looks much like regular PostgreSQL
  • 28. 31©2014 TransLattice, Inc. All Rights Reserved. test2=# explain select * from t1 inner join t2 on col2 = col3; Remote Subquery Scan on all (datanode_1,datanode_2) -> Hash Join Hash Cond: -> Remote Subquery Scan on all (datanode_1,datanode_2) Distribute results by H: col2 -> Seq Scan on t1 -> Hash -> Seq Scan on t2 Will read t1.col2 once and put in shared queue for consumption for other datanodes for joining. Datanode-datanode communication and parallelism
  • 29. 34©2014 TransLattice, Inc. All Rights Reserved.
  • 30. 35©2014 TransLattice, Inc. All Rights Reserved. Configuration
  • 31. 36©2014 TransLattice, Inc. All Rights Reserved.
  • 32. 37©2014 TransLattice, Inc. All Rights Reserved. Use pgxc_ctl! (demo in a few minutes)
  • 33. 38©2014 TransLattice, Inc. All Rights Reserved. Otherwise, PostgreSQL-like steps apply: initgtm initdb
  • 34. 39©2014 TransLattice, Inc. All Rights Reserved. postgresql.conf •  Very similar to regular PostgreSQL •  But there are extra parameters •  max_pool_size •  min_pool_size •  pool_maintenance _timeout •  remote_query_cost •  network_byte_cost •  sequence_range •  pooler_port •  gtm_host, gtm_port •  shared_queues •  shared_queue_size
  • 35. 40©2014 TransLattice, Inc. All Rights Reserved. Install n  Download RPMs –  http://sourceforge.net/projects/postgres-xl/ files/Releases/Version_9.2rc/ Or n  configure; make; make install
  • 36. 41©2014 TransLattice, Inc. All Rights Reserved. Setup Environment •  .bashrc •  ssh used, so ad installation bin dir to PATH
  • 37. 42©2014 TransLattice, Inc. All Rights Reserved. Avoid password with pgxc_ctl •  ssh-keygen –t rsa (in ~/.ssh) •  For local test, no need to copy key, already there •  cat ~/.ssh/id_rsa.pub >> ~/.ssh/ authorized_keys
  • 38. 43©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl •  Initialize cluster •  Start/stop •  Deploy to node •  Add coordinator •  Add data node / slave •  Add GTM slave
  • 39. 44©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl.conf •  Let’s create a local test cluster! •  One GTM •  One Coordinator •  Two Datanodes Make sure all are using different ports •  including pooler_port
  • 40. 45©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl.conf pgxcOwner=pgxl pgxcUser=$pgxcOwner pgxcInstallDir=/usr/postgres-xl-9.2
  • 41. 46©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl.conf gtmMasterDir=/data/gtm_master gtmMasterServer=localhost gtmSlave=n gtmProxy=n
  • 42. 47©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl.conf coordMasterDir=/data/coord_master coordNames=(coord1) coordPorts=(5432) poolerPorts=(20010) coordMasterServers=(localhost) coordMasterDirs=($coordMasterDir/coord1)
  • 43. 48©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl.conf coordMaxWALSenders=($coordMaxWALsernder) coordSlave=n coordSpecificExtraConfig=(none) coordSpecificExtraPgHba=(none)
  • 44. 49©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl.conf datanodeNames=(dn1 dn2) datanodeMasterDir=/data/dn_master datanodePorts=(5433 5434) datanodePoolerPorts=(20011 20012) datanodeMasterServers=(localhost localhost) datanodeMasterDirs=($datanodeMasterDir/dn1 $datanodeMasterDir/dn2)
  • 45. 50©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl.conf datanodeMaxWALSenders=($datanodeMaxWalSe nder $datanodeMaxWalSender) datanodeSlave=n primaryDatanode=dn1
  • 46. 51©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl pgxc_ctl init all Now have a working cluster
  • 47. 52©2014 TransLattice, Inc. All Rights Reserved. pgxc_ctl – Add a second Coordinator Format: pgxc_ctl add coordinator master name host port pooler data_dir pgxc_ctl add coordinator master coord2 localhost 15432 20020 /data/coord_master/coord2
  • 48. 53©2014 TransLattice, Inc. All Rights Reserved. PostgreSQL Differences
  • 49. 54©2014 TransLattice, Inc. All Rights Reserved. pg_catalog •  pgxc_node •  Coordinator and Datanode definition •  Currently not GTM •  pgxc_class •  Table replication or distribution info
  • 50. 55©2014 TransLattice, Inc. All Rights Reserved. Additional Commands •  PAUSE CLUSTER / UNPAUSE CLUSTER •  Wait for current transactions to complete •  Prevent new commands until UNPAUSE •  EXECUTE DIRECT ON (nodename) ‘command’ •  CLEAN CONNECTION •  Cleans connection pool •  CREATE NODE / ALTER NODE
  • 51. 56©2014 TransLattice, Inc. All Rights Reserved. Noteworthy •  SELECT pgxc_pool_reload() •  pgxc_clean for 2PC •  Cleans prepared transactions •  If committed on one node, must be committed on all •  If aborted on one node, must be aborted on all •  If both committed and aborted output error
  • 52. 57©2014 TransLattice, Inc. All Rights Reserved. Backup and Recovery
  • 53. 58©2014 TransLattice, Inc. All Rights Reserved. Backup •  Like PostgreSQL •  pg_dump, pg_dumpall from any Coordinator •  Will pull all data from all nodes in one central location •  No notion yet of parallel pg_dump of individual nodes on node-by-node basis, nor parallel pg_restore
  • 54. 59©2014 TransLattice, Inc. All Rights Reserved. Cold Backup •  Stop Cluster •  Backup individual data directories on nodes’ file system •  Gzip, compress, rsync off, etc •  Cold restore do the opposite
  • 55. 60©2014 TransLattice, Inc. All Rights Reserved. Hot Backup •  Similar to PostgreSQL •  Base backup individual nodes •  use streaming replication or WAL archiving •  Backup GTM control file •  Contains XID •  Contains Sequence values •  Periodically written to •  On restart, it will skip ahead some values What about PITR?
  • 56. 61©2014 TransLattice, Inc. All Rights Reserved. Hot Backup - Barriers •  CREATE BARRIER ‘barrier_name’ •  Waits for currently committing transactions to finish •  Writes a barrier in each node’s WAL for a consistent restoration point •  Currently done manually! •  Could be configured as background process in the future to periodically write In recovery.conf, use recovery_target_barrier
  • 57. 62©2014 TransLattice, Inc. All Rights Reserved. Similarities and differences compared to Postgres-XC
  • 58. 63©2014 TransLattice, Inc. All Rights Reserved. Code Merging PostgreSQL Postgres-XC Postgres-XL Postgres-XC Enhancements Postgres-XL Enhancements Planner and Executor are different
  • 59. 64©2014 TransLattice, Inc. All Rights Reserved. Similarities with Postgres-XC
  • 60. 65©2014 TransLattice, Inc. All Rights Reserved. Some of the original Postgres-XC developers developed Postgres-XL
  • 61. 66©2014 TransLattice, Inc. All Rights Reserved. Global Transaction Manager
  • 62. 67©2014 TransLattice, Inc. All Rights Reserved. Pooler
  • 63. 68©2014 TransLattice, Inc. All Rights Reserved. pg_catalog
  • 64. 69©2014 TransLattice, Inc. All Rights Reserved. pgxc_* pgxc_ctl, pgxc_node
  • 65. 70©2014 TransLattice, Inc. All Rights Reserved. “pgxc” = Postgres-XL Cluster J
  • 66. 71©2014 TransLattice, Inc. All Rights Reserved. Differences compared to Postgres-XC
  • 67. 72©2014 TransLattice, Inc. All Rights Reserved. n  Performance n  Multi-tenant security n  Statistics
  • 68. 73©2014 TransLattice, Inc. All Rights Reserved. Performance Data node ßà Data node Communication –  Avoid Coordinator-centric processing –  Can be many times faster for some queries Pooler needs to be configured for each Data node, not just coordinator –  More connections needed at each node
  • 69. 74©2014 TransLattice, Inc. All Rights Reserved. Performance Re-parse avoidance In XC, SQL strings are sent to other nodes from Coordinator, where they are reparsed and replanned. In XL, plan once on coordinator, serialize and send over
  • 70. 75©2014 TransLattice, Inc. All Rights Reserved. Performance Sequences Detect frequently accessed sequences, fetch range of values, avoid fewer round trips to GTM COPY INSERT SELECT
  • 71. 76©2014 TransLattice, Inc. All Rights Reserved. Multi-tenant Security User can only get own info from global pg_catalog pg_database, pg_user Harder to hack if names of other database objects are unknown
  • 72. 77©2014 TransLattice, Inc. All Rights Reserved. Statistics storm_stats Basic DML stats by time range pg_stat_statements may be sufficient
  • 73. 78©2014 TransLattice, Inc. All Rights Reserved. Differences Planner and Executor are different
  • 74. 79©2014 TransLattice, Inc. All Rights Reserved. Differences GUC n  remote_query_cost n  network_byte_cost n  pool_maintenance_timeout n  sequence_range n  shared_queues n  shared_queue_size
  • 75. 80©2014 TransLattice, Inc. All Rights Reserved. Looking at Postgres-XL Code n  XC #ifdef PGXC n  XL #ifdef XCP #ifndef XCP
  • 76. 81©2014 TransLattice, Inc. All Rights Reserved. Community
  • 77. 82©2014 TransLattice, Inc. All Rights Reserved. Website and Code n  postgres-xl.org n  sourceforge.net/projects/postgres-xl n  Will add mirror for github.com n  Docs –  files.postgres-xl.org/documentation/index.html
  • 78. 84©2014 TransLattice, Inc. All Rights Reserved. Philosophy n  Stability and bug fixes over features n  Performance-focused n  Open and inclusive –  Welcome input for roadmap, priorities and design –  Welcome others to become core members and contributors
  • 79. 85©2014 TransLattice, Inc. All Rights Reserved. Planned Community Structure n  Like PostgreSQL –  Core –  Major Contributors –  Minor Contributors –  Advocacy
  • 80. 87©2014 TransLattice, Inc. All Rights Reserved. Roadmap n  Catch up to PostgreSQL Community -> 9.3, 9.4 n  Make XL a more robust analytical platform –  Distributed Foreign Data Wrappers n  Remove blockers to adoption n  Native High Availability
  • 81. 88©2014 TransLattice, Inc. All Rights Reserved. Thank You! msharp@translattice.com @mason_db