SlideShare ist ein Scribd-Unternehmen logo
1 von 57
Downloaden Sie, um offline zu lesen
Feb 2015
NewSQL Overview
Ivan Glushkov
@gliush
ivan.glushkov@gmail.com
❖ MIPT
❖ MCST, Elbrus compiler project
❖ Echo, real-time social platform (PaaS)
❖ DevZen podcast (http://devzen.ru)
About myself
❖ Relational Model in 1970
❖ disk-oriented
❖ rows
❖ sql
❖ “One size fits all” doesn’t work:
❖ Column-oriented data warehouses for OLAP.
❖ Key-Value storages, Document storages
Complexity WorkLoad focus
Data WareHouses
Social Networks
OLTP
Writes Reads
SimpleComplex
History of SQL
Startups lifecycle
Users Errors
❖ Start: no money, no users, open source
Startups lifecycle
Users Errors
❖ Start: no money, no users, open source
❖ Middle: more users, storage optimization
Startups lifecycle
❖ Start: no money, no users, open source
❖ Middle: more users, storage optimization
❖ Final: plenty of users, storage failure
Users Errors
New requirements
❖ Large scale systems, with huge and growing data sets
❖ 9M messages per hour in Facebook
❖ 50M messages per day in Twitter
❖ Information is frequently generated by devices
❖ High concurrency requirements
❖ Usually, data model with some relations
❖ Often, transactional integrity
Trends: architecture change
Client Side
Server Side
Cloud Storage
Client Side
Server Side
Database
Consistency, transactions: Database
Storage optimization: Database
Scalability: Client Side
Consistency, transactions: Cloud
Storage optimization: Cloud
Scalability: All levels
Trends: architecture change
❖ CAP: consistency, availability, partitioning
❖ ACID: atomicity, consistency, isolation, durability
❖ BASE: basically available, soft state, eventual
consistency
Trends: architecture change
❖ ‘P’ in CAP is not discrete
❖ Managing partitions: detection, limitations in
operations, recovery
NoSQL
❖ CAP: first ‘A’, then ‘C’: finer control over availability
❖ Horizontal scaling
❖ Not a “relational model”, custom API
❖ Schemaless
❖ Types: Key-Value, Document, Graph, …
Application-level sharding
❖ Additional application-level logic
❖ Difficulties with cross-sharding transactions
❖ More servers to maintain
❖ More components — higher prob for breakdown
NewSQL: definition
“A DBMS that delivers the scalability
and flexibility promised by NoSQL
while retaining the support for SQL
queries and/or ACID, or to improve
performance for appropriate workloads.”
451 Group
NewSQL: definition
❖ SQL as the primary interface
❖ ACID support for transactions
❖ Non-locking concurrency control
❖ High per-node performance
❖ Scalable, shared nothing architecture
Michael Stonebraker
Shared nothing architecture
❖ No single point of failure
❖ Each node is independent and self-sufficient
❖ No shared memory or disk
❖ Scale infinitely
❖ Data partitioning
❖ Slow multi-shards requests
Column-oriented DBMS
❖ Store content by column rather than by row
❖ Efficient in hard disk access
❖ Good for sparse and repeated data
❖ Higher data compression
❖ More reads/writes for large records with a lot of fields
❖ Better for relatively infrequent writes, lots of data throughput on reads
(OLAP, analytic requests).
John Smith 20
Joe Smith 30
Alice Adams 50
John:001; Joe:002; Alice:003.
Smith:001,002; Adams:003.
20:001; 30:002; 50:003.
Traditional DBMS overheads
12%
10%
11%
18% 20%
29%Buffer Management
Logging
Locking
Index management
Latching
Useful work
“Removing those overheads and running the database in
main memory would yield orders of magnitude improvements
in database performance”
by Stonebraker & research group
In-memory storage
❖ High throughput
❖ Low latency
❖ No Buffer Management
❖ If serialized, no Locking or Latching
In-memory storage: price
on-demand 3Y-reserved plan
per hour 11.2 $ 3.9 $
per month 8.1K $ 2.8K $
per year 97K $ 33,7K $
Amazon price reduction
Current price for 1TB (~4 instances of ‘r3.8xlarge’ type)
NewSQL: categories
❖ New approaches: VoltDB, Clustrix, NuoDB
❖ New storage engines: TokuDB, ScaleDB
❖ Transparent clustering: ScaleBase, dbShards
NuoDB
❖ Multi-tier architecture:
❖ Administrative: managing, stats, cli, web-ui
❖ Transactional: ACID except ‘D’, cache
❖ Storage: key-value store (‘D’ from ACID)
NuoDB
❖ Everything is an ‘Atom’
❖ Peer-to-peer communication, encrypted sessions
❖ MVCC + Append-only storage
NuoDB: CAP & ACID
❖ `CP` system. Need majority of nodes to work
❖ If split to two equal parts -> stop
❖ Several consistency modes including ‘consistent_read’
YCSB
❖ Yahoo Cloud Serving Benchmark
❖ Key-value: insert/read/update/scan
❖ Measures:
❖ Performance: latency/throughput
❖ Scaling: elastic speedup
NuoDB: YCSB
Throughput, tps/nodes
0
275 000
550 000
825 000
1 100 000
1 2 4 8 16 24
Update latency, "s
0
25
50
1 2 4 8 16 24
Read latency, "s
0
1.5
3
1 2 4 8 16 24
Hosts: 32GB, Xeon 8 cores, 1TB HDD, 1Gb LAN
5% updates, 95% reads
VoltDB
❖ In-memory storage
❖ Stored procedure interface, async/sync proc execution
❖ Serializing all data access
❖ Horizontal partitioning
❖ Multi-master replication (“K-safety”)
❖ Snapshots + Command Logging
VoltDB
❖ Open-source, community edition is under GPLv3.
❖ Java + C++
❖ Partitioning and Replication control
VoltDB: CAP & ACID
❖ Without K-safety, any node fail break the whole DB
❖ Snapshot and shutdown minor segments during
network paritions
❖ Single-partition transactions are very fast
❖ Multi-partition transactions are slower (manager), try to
avoid (1000s tps in ’13, no updates since)
VoltDB: key-value bench
90%reads, 10%writes
3 nodes: 64GB, dual 2.93GHz intel 6 core processors
VoltDB: “voter” bench
26 SQL statements per transaction
❖ Multi-master
❖ Shared data
❖ Cluster manager to solve

conflicts (locks)
❖ ACID?
❖ Network Partition Handling?
❖ Scaling?
ScaleDB
MySQL MySQL MySQ
…
Mirrored
Storage
…
Application
Cluster
Manager
Mirrored
Storage
Mirrored
Storage
ClustrixDB
❖ “Query fragment” - basic primitive of the system:
❖ read/write/ execute function
❖ modify control flow
❖ perform synchronisation
❖ send rows to query fragments on another nodes
❖ Data partitions: “slices” split and moved transparently
❖ Replication: master slice for reads + slave for redundancy
ClustrixDB
❖ “Move query to the data”
❖ Dynamic and transparent
data layout
❖ Linear scale
ClustrixDB: CAP & ACID
❖ `CP` system. Need majority of nodes to work
❖ Only ‘Repeatable Read’ isolation level

(so, ‘fantom reads’ are possible)
❖ Distributed Lock Manager for writer-writer locks (on
each node)
TPC-C
❖ Online Transaction Processing 

(OLTP) benchmark
❖ 9 types of tables
❖ 5 concurrent transactions of different complexity
❖ Productivity measured in “new-order transaction”
ClustrixDB: TPC-C
❖ 5000W ~ 400GB of data
❖ Compared with Percona
Mysql, Intel Xeon, 8 cores
❖ ClustrixDB nodes: “Dual 4
core Westmere processors”
ClustrixDB: example
❖ 30M users, 10M logins per day
❖ 4.4B transactions per day
❖ 1.08/4.69 Petabytes per month writes/reads
❖ 42 nodes, 336 cores, 2TB memory, 46TB SSD
FoundationDB
❖ KV store, ordered keys
❖ Paxos for cluster coordination
❖ Global ACID transactions, range operations
❖ Lock-free, optimistic concurrency, MVCC
❖ Good testing (deterministic simulation)
❖ Fault-tolerance (replication)
❖ SQL Layer (similar to Google F1 on top of Spanner)
FoundationDB
❖ SSD/Memory storage engine
❖ Layers concept
❖ ‘CP’ system with Paxos-ed

coordination centres
❖ Written in the Flow language (translated to C++11)

with actor model support
❖ Watches, atomic operations (e.g. ‘add’)
FoundationDB: CAP and ACID
❖ Serializable isolation with optimistic concurrency
❖ > 100 wps to the same key? Use another DB!
❖ ‘CP system’ (Paxos)

Need majority of coordination center to work
FoundationDB: KV Performance
Scaling:

up to 24 EC2 c3.8xlarge, 16 cores
Throughput (per core)
FoundationDB:SQL Layer
❖ SQL - layer on top of KV ->

transactional, scalable, HA
❖ SQL Layer is stateless -> 

scalable, fault tolerant
❖ Hierarchical schema
❖ SQL and JSON interfaces
❖ Powerful indexing (multi-table, geospatial, …)
FoundationDB: SQL Performance
Sysbench: read/write, ~80GB, 300M rows
One node test

4 core, 16GB RAM, 200GB SATA SSD
Multi nodes test

KV: 8 nodes with 1-process; 3-replication

SQL: up to 32 nodes with 

8-thread sysbench process
MemSQL
❖ In-Memory Storage for OLTP
❖ Column-oriented Storage for OLAP
❖ Compiled Query Execution Plans (+cache)
❖ Local ACID transactions (no global txs for distributed)
❖ Lock-free, MVCC
❖ Fault tolerance, automatic replication, 

redundancy (=2 by default)
❖ [Almost] no penalty for replica creation
MemSQL
❖ Two-tiered shared-nothing architecture
• Aggregators for query routing
• Leaves for storage and processing
❖ Integration:
• SQL
• MySQL protocol
• JSON API
MemSQL: CAP & ACID
❖ `CP` system. Need majority of nodes (or half with
master) to work
❖ Only ‘Read Committed’ isolation level

(‘fantom reads’, ‘non-repeatable reads’ are possible)
❖ Manual Master Aggregator management
MemSQL: Performance
❖ Adapted TPC-H
❖ OLAP Reads & OLTP writes simultaneously
❖ AWS EC2 VPC
Overview
Max

Isolation
Scalable
Open
Source
Free to try Language
PostgreSQL S Postgres-XL? Yes Yes C
NuoDB CR Yes No <5 domains C++
VoltDB S Yes Yes Yes 

(wo HA)
Java/C++
ScaleDB RC? Yes? No ? ?
ClustrixDB RR Yes No Trial

(via email req)
C ?
FoundationDB S Yes Partly <6 processes Flow(C++)
MemSQL RC Yes No ? C++
S: Serializable, RR: Read Committed, RC: Read Committed, CR: Consistent Read
Conclusions
❖ NewSQL is an established trend with a number of
options
❖ Hard to pick one because they're not on a common scale
❖ No silver bullet
❖ Growing data volume requires ever more efficient ways
to store and process it
Questions?
Links: General concepts
❖ CAP explanation from Brewer, 12 years later
❖ Scalable performance, simple explanation
❖ What is NewSQL
❖ Overview about NoSQL databases
❖ Performance loss in OLTP systems
❖ Memory price trends
❖ (wiki) Shared Nothing Architecture
❖ (wiki) Column oriented DBMS
❖ How NewSQL handles big data
❖ What is YCSB benchmark
❖ What is TPC benchmark
❖ Transactional isolation levels
Links: NuoDB
❖ http://www.infoq.com/articles/nuodb-architecture-1/
❖ http://www.infoq.com/articles/nuodb-architecture-2/
❖ http://stackoverflow.com/questions/14552091/nuodb-and-hdfs-as-
storage
❖ http://go.nuodb.com/rs/nuodb/images/NuoDB_Benchmark_Report.pdf
❖ NuoDB white paper (google has you :)
❖ https://aphyr.com/posts/292-call-me-maybe-nuodb
❖ http://dev.nuodb.com/techblog/failure-detection-and-network-partition-
management-nuodb
Links: VoltDB
❖ White paper, Technical overview (google has you)
❖ https://github.com/VoltDB/voltdb-client-erlang/blob/master/
doc/BENCHMARK1.md
❖ http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-
really-as-scalable-as-they-claim/
❖ https://voltdb.com/blog/voltdb-3-x-performance-
characteristics/
❖ http://docs.voltdb.com/UsingVoltDB/KsafeNetPart.php
❖ https://news.ycombinator.com/item?id=6639127
Links: ScaleDB
❖ http://scaledb.com/pdfs/TechnicalOverview.pdf
❖ http://www.scaledb.com/pdfs/
scaledb_multitenant.pdf
❖ http://www.percona.com/live/mysql-
conference-2013/sites/default/files/slides/
DB_Vistualization_for_PublicPrivate_Clouds.pdf
Links: Clustrix
❖ http://www.clustrix.com/wp-content/uploads/2013/10/Clustrix_A-New-
Approach_WhitePaper.pdf
❖ http://www.clustrix.com/wp-content/uploads/2013/10/Clustrix_Driving-the-
New-Wave_WP.pdf
❖ http://www.clustrix.com/wp-content/uploads/2013/10/Clustrix_AWS_WP.pdf
❖ http://www.clustrix.com/wp-content/uploads/2013/10/
Clustrix_TPCC_Percona.pdf
❖ http://sergei.clustrix.com/2011/01/mongodb-vs-clustrix-comparison-
part-1.html
❖ http://docs.clustrix.com/display/CLXDOC/Consistency%2C+Fault+Tolerance
%2C+and+Availability
Links: FoundationDB
❖ https://foundationdb.com/key-value-store/white-papers
❖ http://blog.foundationdb.com/call-me-maybe-foundationdb-vs-jepsen
❖ https://foundationdb.com/acid-claims
❖ https://foundationdb.com/key-value-store/performance
❖ https://foundationdb.com/layers/sql/documentation/Concepts
❖ https://foundationdb.com/layers/sql/documentation/SQL/indexes.html
❖ https://foundationdb.com/layers/sql/performance
❖ https://foundationdb.com/key-value-store/features
❖ https://foundationdb.com/key-value-store/documentation/configuration.html
❖ https://foundationdb.com/key-value-store/documentation/beta1/developer-
guide.html
❖ https://foundationdb.com/layers/sql/documentation/Concepts/
known.limitations.html
Links: MemSQL
❖ MemSQL Whitepaper "The Modern Database
Landscape"
❖ MemSQL Whitepaper "ESG Lab Benchmark of
MemSQL's Performance”
❖ MemSQL Whitepaper “Technical overview”
❖ http://developers.memsql.com/docs/latest/concepts/
dev_concepts.html
❖ http://developers.memsql.com/docs/2.6/admin/
high_availability.html

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Uwe Printz
 
トランザクション入門
トランザクション入門 トランザクション入門
トランザクション入門 Kumazaki Hiroki
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache CassandraPatrick McFadin
 
Wide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data ModelingWide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data ModelingScyllaDB
 
データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)
データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)
データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)Takeshi Mikami
 
ファントム異常を排除する高速なトランザクション処理向けインデックス
ファントム異常を排除する高速なトランザクション処理向けインデックスファントム異常を排除する高速なトランザクション処理向けインデックス
ファントム異常を排除する高速なトランザクション処理向けインデックスSho Nakazono
 
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and BeyondScylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and BeyondScyllaDB
 
Tutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming ArchitecturesTutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming ArchitecturesKarthik Ramasamy
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsTyler Treat
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
データ分析を支える技術 DWH再入門
データ分析を支える技術 DWH再入門データ分析を支える技術 DWH再入門
データ分析を支える技術 DWH再入門Satoru Ishikawa
 

Was ist angesagt? (20)

Cassandra
CassandraCassandra
Cassandra
 
Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
 
HDFS Design Principles
HDFS Design PrinciplesHDFS Design Principles
HDFS Design Principles
 
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
 
Apache Spark の紹介(前半:Sparkのキホン)
Apache Spark の紹介(前半:Sparkのキホン)Apache Spark の紹介(前半:Sparkのキホン)
Apache Spark の紹介(前半:Sparkのキホン)
 
トランザクション入門
トランザクション入門 トランザクション入門
トランザクション入門
 
Google Cloud Spanner Preview
Google Cloud Spanner PreviewGoogle Cloud Spanner Preview
Google Cloud Spanner Preview
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache Cassandra
 
Wide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data ModelingWide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data Modeling
 
Apache Flume
Apache FlumeApache Flume
Apache Flume
 
データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)
データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)
データウェアハウス入門 (マーケティングデータ分析基盤技術勉強会)
 
ファントム異常を排除する高速なトランザクション処理向けインデックス
ファントム異常を排除する高速なトランザクション処理向けインデックスファントム異常を排除する高速なトランザクション処理向けインデックス
ファントム異常を排除する高速なトランザクション処理向けインデックス
 
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and BeyondScylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
 
Tutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming ArchitecturesTutorial - Modern Real Time Streaming Architectures
Tutorial - Modern Real Time Streaming Architectures
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed Systems
 
Apache Sparkのご紹介 (後半:技術トピック)
Apache Sparkのご紹介 (後半:技術トピック)Apache Sparkのご紹介 (後半:技術トピック)
Apache Sparkのご紹介 (後半:技術トピック)
 
Cassandra in e-commerce
Cassandra in e-commerceCassandra in e-commerce
Cassandra in e-commerce
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
データ分析を支える技術 DWH再入門
データ分析を支える技術 DWH再入門データ分析を支える技術 DWH再入門
データ分析を支える技術 DWH再入門
 
18 Data Streams
18 Data Streams18 Data Streams
18 Data Streams
 

Ähnlich wie NewSQL overview, Feb 2015

Ndb cluster 80_requirements
Ndb cluster 80_requirementsNdb cluster 80_requirements
Ndb cluster 80_requirementsmikaelronstrom
 
Иван Глушков (Echo)
Иван Глушков (Echo)Иван Глушков (Echo)
Иван Глушков (Echo)Ontico
 
When is MyRocks good?
When is MyRocks good? When is MyRocks good?
When is MyRocks good? Alkin Tezuysal
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDBI Goo Lee
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community
 
NAVER Ceph Storage on ssd for Container
NAVER Ceph Storage on ssd for ContainerNAVER Ceph Storage on ssd for Container
NAVER Ceph Storage on ssd for ContainerJangseon Ryu
 
Large-scale projects development (scaling LAMP)
Large-scale projects development (scaling LAMP)Large-scale projects development (scaling LAMP)
Large-scale projects development (scaling LAMP)Alexey Rybak
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
 
Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics PlatformSantanu Dey
 
LuSql: (Quickly and easily) Getting your data from your DBMS into Lucene
LuSql: (Quickly and easily) Getting your data from your DBMS into LuceneLuSql: (Quickly and easily) Getting your data from your DBMS into Lucene
LuSql: (Quickly and easily) Getting your data from your DBMS into Luceneeby
 
Openstack HA
Openstack HAOpenstack HA
Openstack HAYong Luo
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the Worldjhugg
 
Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerkuchinskaya
 
Database as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance PlatformDatabase as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance PlatformMaris Elsins
 
MySQL Options in OpenStack
MySQL Options in OpenStackMySQL Options in OpenStack
MySQL Options in OpenStackTesora
 
001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introductionScott Miao
 
Edge performance with in memory nosql
Edge performance with in memory nosqlEdge performance with in memory nosql
Edge performance with in memory nosqlLiviu Costea
 
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Ontico
 
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowOpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowEd Balduf
 
OpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStackOpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStackMatt Lord
 

Ähnlich wie NewSQL overview, Feb 2015 (20)

Ndb cluster 80_requirements
Ndb cluster 80_requirementsNdb cluster 80_requirements
Ndb cluster 80_requirements
 
Иван Глушков (Echo)
Иван Глушков (Echo)Иван Глушков (Echo)
Иван Глушков (Echo)
 
When is MyRocks good?
When is MyRocks good? When is MyRocks good?
When is MyRocks good?
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDB
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
NAVER Ceph Storage on ssd for Container
NAVER Ceph Storage on ssd for ContainerNAVER Ceph Storage on ssd for Container
NAVER Ceph Storage on ssd for Container
 
Large-scale projects development (scaling LAMP)
Large-scale projects development (scaling LAMP)Large-scale projects development (scaling LAMP)
Large-scale projects development (scaling LAMP)
 
Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket Cache
 
Building a High Performance Analytics Platform
Building a High Performance Analytics PlatformBuilding a High Performance Analytics Platform
Building a High Performance Analytics Platform
 
LuSql: (Quickly and easily) Getting your data from your DBMS into Lucene
LuSql: (Quickly and easily) Getting your data from your DBMS into LuceneLuSql: (Quickly and easily) Getting your data from your DBMS into Lucene
LuSql: (Quickly and easily) Getting your data from your DBMS into Lucene
 
Openstack HA
Openstack HAOpenstack HA
Openstack HA
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the World
 
Buytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemakerBuytaert kris my_sql-pacemaker
Buytaert kris my_sql-pacemaker
 
Database as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance PlatformDatabase as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance Platform
 
MySQL Options in OpenStack
MySQL Options in OpenStackMySQL Options in OpenStack
MySQL Options in OpenStack
 
001 hbase introduction
001 hbase introduction001 hbase introduction
001 hbase introduction
 
Edge performance with in memory nosql
Edge performance with in memory nosqlEdge performance with in memory nosql
Edge performance with in memory nosql
 
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
 
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowOpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
 
OpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStackOpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStack
 

Mehr von Ivan Glushkov

Distributed tracing with erlang/elixir
Distributed tracing with erlang/elixirDistributed tracing with erlang/elixir
Distributed tracing with erlang/elixirIvan Glushkov
 
Kubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusKubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusIvan Glushkov
 
Mystery Machine Overview
Mystery Machine OverviewMystery Machine Overview
Mystery Machine OverviewIvan Glushkov
 
Google Dataflow Intro
Google Dataflow IntroGoogle Dataflow Intro
Google Dataflow IntroIvan Glushkov
 
Comparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulComparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulIvan Glushkov
 

Mehr von Ivan Glushkov (8)

Distributed tracing with erlang/elixir
Distributed tracing with erlang/elixirDistributed tracing with erlang/elixir
Distributed tracing with erlang/elixir
 
Kubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusKubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rus
 
Mystery Machine Overview
Mystery Machine OverviewMystery Machine Overview
Mystery Machine Overview
 
Raft in details
Raft in detailsRaft in details
Raft in details
 
Hashicorp Nomad
Hashicorp NomadHashicorp Nomad
Hashicorp Nomad
 
Google Dataflow Intro
Google Dataflow IntroGoogle Dataflow Intro
Google Dataflow Intro
 
Comparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulComparing ZooKeeper and Consul
Comparing ZooKeeper and Consul
 
fp intro
fp introfp intro
fp intro
 

Kürzlich hochgeladen

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 

Kürzlich hochgeladen (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 

NewSQL overview, Feb 2015

  • 1. Feb 2015 NewSQL Overview Ivan Glushkov @gliush ivan.glushkov@gmail.com
  • 2. ❖ MIPT ❖ MCST, Elbrus compiler project ❖ Echo, real-time social platform (PaaS) ❖ DevZen podcast (http://devzen.ru) About myself
  • 3. ❖ Relational Model in 1970 ❖ disk-oriented ❖ rows ❖ sql ❖ “One size fits all” doesn’t work: ❖ Column-oriented data warehouses for OLAP. ❖ Key-Value storages, Document storages Complexity WorkLoad focus Data WareHouses Social Networks OLTP Writes Reads SimpleComplex History of SQL
  • 4. Startups lifecycle Users Errors ❖ Start: no money, no users, open source
  • 5. Startups lifecycle Users Errors ❖ Start: no money, no users, open source ❖ Middle: more users, storage optimization
  • 6. Startups lifecycle ❖ Start: no money, no users, open source ❖ Middle: more users, storage optimization ❖ Final: plenty of users, storage failure Users Errors
  • 7. New requirements ❖ Large scale systems, with huge and growing data sets ❖ 9M messages per hour in Facebook ❖ 50M messages per day in Twitter ❖ Information is frequently generated by devices ❖ High concurrency requirements ❖ Usually, data model with some relations ❖ Often, transactional integrity
  • 8. Trends: architecture change Client Side Server Side Cloud Storage Client Side Server Side Database Consistency, transactions: Database Storage optimization: Database Scalability: Client Side Consistency, transactions: Cloud Storage optimization: Cloud Scalability: All levels
  • 9. Trends: architecture change ❖ CAP: consistency, availability, partitioning ❖ ACID: atomicity, consistency, isolation, durability ❖ BASE: basically available, soft state, eventual consistency
  • 10. Trends: architecture change ❖ ‘P’ in CAP is not discrete ❖ Managing partitions: detection, limitations in operations, recovery
  • 11. NoSQL ❖ CAP: first ‘A’, then ‘C’: finer control over availability ❖ Horizontal scaling ❖ Not a “relational model”, custom API ❖ Schemaless ❖ Types: Key-Value, Document, Graph, …
  • 12. Application-level sharding ❖ Additional application-level logic ❖ Difficulties with cross-sharding transactions ❖ More servers to maintain ❖ More components — higher prob for breakdown
  • 13. NewSQL: definition “A DBMS that delivers the scalability and flexibility promised by NoSQL while retaining the support for SQL queries and/or ACID, or to improve performance for appropriate workloads.” 451 Group
  • 14. NewSQL: definition ❖ SQL as the primary interface ❖ ACID support for transactions ❖ Non-locking concurrency control ❖ High per-node performance ❖ Scalable, shared nothing architecture Michael Stonebraker
  • 15. Shared nothing architecture ❖ No single point of failure ❖ Each node is independent and self-sufficient ❖ No shared memory or disk ❖ Scale infinitely ❖ Data partitioning ❖ Slow multi-shards requests
  • 16. Column-oriented DBMS ❖ Store content by column rather than by row ❖ Efficient in hard disk access ❖ Good for sparse and repeated data ❖ Higher data compression ❖ More reads/writes for large records with a lot of fields ❖ Better for relatively infrequent writes, lots of data throughput on reads (OLAP, analytic requests). John Smith 20 Joe Smith 30 Alice Adams 50 John:001; Joe:002; Alice:003. Smith:001,002; Adams:003. 20:001; 30:002; 50:003.
  • 17. Traditional DBMS overheads 12% 10% 11% 18% 20% 29%Buffer Management Logging Locking Index management Latching Useful work “Removing those overheads and running the database in main memory would yield orders of magnitude improvements in database performance” by Stonebraker & research group
  • 18. In-memory storage ❖ High throughput ❖ Low latency ❖ No Buffer Management ❖ If serialized, no Locking or Latching
  • 19. In-memory storage: price on-demand 3Y-reserved plan per hour 11.2 $ 3.9 $ per month 8.1K $ 2.8K $ per year 97K $ 33,7K $ Amazon price reduction Current price for 1TB (~4 instances of ‘r3.8xlarge’ type)
  • 20. NewSQL: categories ❖ New approaches: VoltDB, Clustrix, NuoDB ❖ New storage engines: TokuDB, ScaleDB ❖ Transparent clustering: ScaleBase, dbShards
  • 21. NuoDB ❖ Multi-tier architecture: ❖ Administrative: managing, stats, cli, web-ui ❖ Transactional: ACID except ‘D’, cache ❖ Storage: key-value store (‘D’ from ACID)
  • 22. NuoDB ❖ Everything is an ‘Atom’ ❖ Peer-to-peer communication, encrypted sessions ❖ MVCC + Append-only storage
  • 23. NuoDB: CAP & ACID ❖ `CP` system. Need majority of nodes to work ❖ If split to two equal parts -> stop ❖ Several consistency modes including ‘consistent_read’
  • 24. YCSB ❖ Yahoo Cloud Serving Benchmark ❖ Key-value: insert/read/update/scan ❖ Measures: ❖ Performance: latency/throughput ❖ Scaling: elastic speedup
  • 25. NuoDB: YCSB Throughput, tps/nodes 0 275 000 550 000 825 000 1 100 000 1 2 4 8 16 24 Update latency, "s 0 25 50 1 2 4 8 16 24 Read latency, "s 0 1.5 3 1 2 4 8 16 24 Hosts: 32GB, Xeon 8 cores, 1TB HDD, 1Gb LAN 5% updates, 95% reads
  • 26. VoltDB ❖ In-memory storage ❖ Stored procedure interface, async/sync proc execution ❖ Serializing all data access ❖ Horizontal partitioning ❖ Multi-master replication (“K-safety”) ❖ Snapshots + Command Logging
  • 27. VoltDB ❖ Open-source, community edition is under GPLv3. ❖ Java + C++ ❖ Partitioning and Replication control
  • 28. VoltDB: CAP & ACID ❖ Without K-safety, any node fail break the whole DB ❖ Snapshot and shutdown minor segments during network paritions ❖ Single-partition transactions are very fast ❖ Multi-partition transactions are slower (manager), try to avoid (1000s tps in ’13, no updates since)
  • 29. VoltDB: key-value bench 90%reads, 10%writes 3 nodes: 64GB, dual 2.93GHz intel 6 core processors
  • 30. VoltDB: “voter” bench 26 SQL statements per transaction
  • 31. ❖ Multi-master ❖ Shared data ❖ Cluster manager to solve
 conflicts (locks) ❖ ACID? ❖ Network Partition Handling? ❖ Scaling? ScaleDB MySQL MySQL MySQ … Mirrored Storage … Application Cluster Manager Mirrored Storage Mirrored Storage
  • 32. ClustrixDB ❖ “Query fragment” - basic primitive of the system: ❖ read/write/ execute function ❖ modify control flow ❖ perform synchronisation ❖ send rows to query fragments on another nodes ❖ Data partitions: “slices” split and moved transparently ❖ Replication: master slice for reads + slave for redundancy
  • 33. ClustrixDB ❖ “Move query to the data” ❖ Dynamic and transparent data layout ❖ Linear scale
  • 34. ClustrixDB: CAP & ACID ❖ `CP` system. Need majority of nodes to work ❖ Only ‘Repeatable Read’ isolation level
 (so, ‘fantom reads’ are possible) ❖ Distributed Lock Manager for writer-writer locks (on each node)
  • 35. TPC-C ❖ Online Transaction Processing 
 (OLTP) benchmark ❖ 9 types of tables ❖ 5 concurrent transactions of different complexity ❖ Productivity measured in “new-order transaction”
  • 36. ClustrixDB: TPC-C ❖ 5000W ~ 400GB of data ❖ Compared with Percona Mysql, Intel Xeon, 8 cores ❖ ClustrixDB nodes: “Dual 4 core Westmere processors”
  • 37. ClustrixDB: example ❖ 30M users, 10M logins per day ❖ 4.4B transactions per day ❖ 1.08/4.69 Petabytes per month writes/reads ❖ 42 nodes, 336 cores, 2TB memory, 46TB SSD
  • 38. FoundationDB ❖ KV store, ordered keys ❖ Paxos for cluster coordination ❖ Global ACID transactions, range operations ❖ Lock-free, optimistic concurrency, MVCC ❖ Good testing (deterministic simulation) ❖ Fault-tolerance (replication) ❖ SQL Layer (similar to Google F1 on top of Spanner)
  • 39. FoundationDB ❖ SSD/Memory storage engine ❖ Layers concept ❖ ‘CP’ system with Paxos-ed
 coordination centres ❖ Written in the Flow language (translated to C++11)
 with actor model support ❖ Watches, atomic operations (e.g. ‘add’)
  • 40. FoundationDB: CAP and ACID ❖ Serializable isolation with optimistic concurrency ❖ > 100 wps to the same key? Use another DB! ❖ ‘CP system’ (Paxos)
 Need majority of coordination center to work
  • 41. FoundationDB: KV Performance Scaling:
 up to 24 EC2 c3.8xlarge, 16 cores Throughput (per core)
  • 42. FoundationDB:SQL Layer ❖ SQL - layer on top of KV ->
 transactional, scalable, HA ❖ SQL Layer is stateless -> 
 scalable, fault tolerant ❖ Hierarchical schema ❖ SQL and JSON interfaces ❖ Powerful indexing (multi-table, geospatial, …)
  • 43. FoundationDB: SQL Performance Sysbench: read/write, ~80GB, 300M rows One node test
 4 core, 16GB RAM, 200GB SATA SSD Multi nodes test
 KV: 8 nodes with 1-process; 3-replication
 SQL: up to 32 nodes with 
 8-thread sysbench process
  • 44. MemSQL ❖ In-Memory Storage for OLTP ❖ Column-oriented Storage for OLAP ❖ Compiled Query Execution Plans (+cache) ❖ Local ACID transactions (no global txs for distributed) ❖ Lock-free, MVCC ❖ Fault tolerance, automatic replication, 
 redundancy (=2 by default) ❖ [Almost] no penalty for replica creation
  • 45. MemSQL ❖ Two-tiered shared-nothing architecture • Aggregators for query routing • Leaves for storage and processing ❖ Integration: • SQL • MySQL protocol • JSON API
  • 46. MemSQL: CAP & ACID ❖ `CP` system. Need majority of nodes (or half with master) to work ❖ Only ‘Read Committed’ isolation level
 (‘fantom reads’, ‘non-repeatable reads’ are possible) ❖ Manual Master Aggregator management
  • 47. MemSQL: Performance ❖ Adapted TPC-H ❖ OLAP Reads & OLTP writes simultaneously ❖ AWS EC2 VPC
  • 48. Overview Max
 Isolation Scalable Open Source Free to try Language PostgreSQL S Postgres-XL? Yes Yes C NuoDB CR Yes No <5 domains C++ VoltDB S Yes Yes Yes 
 (wo HA) Java/C++ ScaleDB RC? Yes? No ? ? ClustrixDB RR Yes No Trial
 (via email req) C ? FoundationDB S Yes Partly <6 processes Flow(C++) MemSQL RC Yes No ? C++ S: Serializable, RR: Read Committed, RC: Read Committed, CR: Consistent Read
  • 49. Conclusions ❖ NewSQL is an established trend with a number of options ❖ Hard to pick one because they're not on a common scale ❖ No silver bullet ❖ Growing data volume requires ever more efficient ways to store and process it
  • 51. Links: General concepts ❖ CAP explanation from Brewer, 12 years later ❖ Scalable performance, simple explanation ❖ What is NewSQL ❖ Overview about NoSQL databases ❖ Performance loss in OLTP systems ❖ Memory price trends ❖ (wiki) Shared Nothing Architecture ❖ (wiki) Column oriented DBMS ❖ How NewSQL handles big data ❖ What is YCSB benchmark ❖ What is TPC benchmark ❖ Transactional isolation levels
  • 52. Links: NuoDB ❖ http://www.infoq.com/articles/nuodb-architecture-1/ ❖ http://www.infoq.com/articles/nuodb-architecture-2/ ❖ http://stackoverflow.com/questions/14552091/nuodb-and-hdfs-as- storage ❖ http://go.nuodb.com/rs/nuodb/images/NuoDB_Benchmark_Report.pdf ❖ NuoDB white paper (google has you :) ❖ https://aphyr.com/posts/292-call-me-maybe-nuodb ❖ http://dev.nuodb.com/techblog/failure-detection-and-network-partition- management-nuodb
  • 53. Links: VoltDB ❖ White paper, Technical overview (google has you) ❖ https://github.com/VoltDB/voltdb-client-erlang/blob/master/ doc/BENCHMARK1.md ❖ http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb- really-as-scalable-as-they-claim/ ❖ https://voltdb.com/blog/voltdb-3-x-performance- characteristics/ ❖ http://docs.voltdb.com/UsingVoltDB/KsafeNetPart.php ❖ https://news.ycombinator.com/item?id=6639127
  • 54. Links: ScaleDB ❖ http://scaledb.com/pdfs/TechnicalOverview.pdf ❖ http://www.scaledb.com/pdfs/ scaledb_multitenant.pdf ❖ http://www.percona.com/live/mysql- conference-2013/sites/default/files/slides/ DB_Vistualization_for_PublicPrivate_Clouds.pdf
  • 55. Links: Clustrix ❖ http://www.clustrix.com/wp-content/uploads/2013/10/Clustrix_A-New- Approach_WhitePaper.pdf ❖ http://www.clustrix.com/wp-content/uploads/2013/10/Clustrix_Driving-the- New-Wave_WP.pdf ❖ http://www.clustrix.com/wp-content/uploads/2013/10/Clustrix_AWS_WP.pdf ❖ http://www.clustrix.com/wp-content/uploads/2013/10/ Clustrix_TPCC_Percona.pdf ❖ http://sergei.clustrix.com/2011/01/mongodb-vs-clustrix-comparison- part-1.html ❖ http://docs.clustrix.com/display/CLXDOC/Consistency%2C+Fault+Tolerance %2C+and+Availability
  • 56. Links: FoundationDB ❖ https://foundationdb.com/key-value-store/white-papers ❖ http://blog.foundationdb.com/call-me-maybe-foundationdb-vs-jepsen ❖ https://foundationdb.com/acid-claims ❖ https://foundationdb.com/key-value-store/performance ❖ https://foundationdb.com/layers/sql/documentation/Concepts ❖ https://foundationdb.com/layers/sql/documentation/SQL/indexes.html ❖ https://foundationdb.com/layers/sql/performance ❖ https://foundationdb.com/key-value-store/features ❖ https://foundationdb.com/key-value-store/documentation/configuration.html ❖ https://foundationdb.com/key-value-store/documentation/beta1/developer- guide.html ❖ https://foundationdb.com/layers/sql/documentation/Concepts/ known.limitations.html
  • 57. Links: MemSQL ❖ MemSQL Whitepaper "The Modern Database Landscape" ❖ MemSQL Whitepaper "ESG Lab Benchmark of MemSQL's Performance” ❖ MemSQL Whitepaper “Technical overview” ❖ http://developers.memsql.com/docs/latest/concepts/ dev_concepts.html ❖ http://developers.memsql.com/docs/2.6/admin/ high_availability.html