An introduction to Google Spanner
1. Google Spanner
● What is it?
● How does it work?
● Future Scale
● Architecture
● Terms
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
2. Google Spanner – What is it?
● The world's largest distributed database
● Internally used by Google
● Has a TrueTime API that avoids the latency of global time synchronisation
● Supports Google's Advertising business
● It is fault tolerant to large scale outages
● Offers very high availability and low latency
– Aiming for 99% and 50 ms
3. Google Spanner – How does it work?
● Has a TrueTime API
– Atomic clocks
– GPS Clocks
– Locally determine accurate time
– No need for global time sync
● One single global name space
● Data stored globally via directory namespace
● Uses a Paxos algorithm
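The TrueTime bullets above can be illustrated with a small sketch. This is not Google's API: the class name `TrueTimeSketch`, the fixed error bound `epsilon` and the use of the local system clock are assumptions made for illustration. The idea shown is that a time query returns an interval that contains the true time, so servers can order events locally without a global time sync.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # lower bound on the true time, in seconds
    latest: float    # upper bound on the true time, in seconds


class TrueTimeSketch:
    """Toy stand-in for a TrueTime-style clock (hypothetical, not Google's API)."""

    def __init__(self, epsilon, clock=time.time):
        self.epsilon = epsilon  # assumed worst-case clock error, in seconds
        self.clock = clock      # injectable so the sketch can be tested

    def now(self):
        """Return an interval assumed to contain the true time."""
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        """True only if time t has definitely passed."""
        return self.now().earliest > t

    def before(self, t):
        """True only if time t has definitely not arrived yet."""
        return self.now().latest < t


def definitely_ordered(a, b):
    """Event a certainly happened before event b if the intervals do not overlap."""
    return a.latest < b.earliest
```

With a small `epsilon`, two events only a few milliseconds apart can still be ordered with certainty; with a large `epsilon`, their intervals overlap and the order is unknown.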
4. Google Spanner – Scale
● How big is it, and what are they aiming for?
– Aiming for 10^7 machines
– 10^13 directories
– 10^18 bytes of storage
– 1000's of storage locations
– 10^9 clients
● Current data centres up to 100 ms apart
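To get a feel for these targets, the figures can be divided through. The sketch below reads the slide's numbers as powers of ten (10^7 machines, 10^13 directories, 10^18 bytes, 10^9 clients) and assumes a perfectly even spread, which a real deployment would not have; the variable names are illustrative only.

```python
# Design targets from the slide, read as powers of ten.
machines = 10**7
directories = 10**13
storage_bytes = 10**18
clients = 10**9

# Averages under the (unrealistic) assumption of a perfectly even spread.
bytes_per_machine = storage_bytes // machines            # 10**11 bytes, i.e. 100 GB
directories_per_machine = directories // machines        # 10**6 directories
avg_bytes_per_directory = storage_bytes // directories   # 10**5 bytes, i.e. 100 kB
clients_per_machine = clients // machines                # 100 clients

print("GB per machine:", bytes_per_machine / 10**9)
print("directories per machine:", directories_per_machine)
print("average kB per directory:", avg_bytes_per_directory / 10**3)
print("clients per machine:", clients_per_machine)
```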
5. Google Spanner – Architecture
● A zonemaster has 100's of spanservers
● Zonemasters assign data to spanservers which serve clients
● Location proxies help clients locate spanservers
● Universe master displays zone status information
● Placement driver automates the movement of data between zones
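The roles listed above can be sketched as a small data structure. Everything here is hypothetical: the `Zone` class, its method names and the least-loaded placement rule are assumptions chosen for illustration, since the slides do not say how a zonemaster picks a spanserver.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Zone:
    """One zone: a zonemaster, its spanservers and a location proxy (illustrative)."""
    name: str
    spanservers: List[str]
    assignment: Dict[str, str] = field(default_factory=dict)  # data key -> spanserver

    def assign(self, key: str) -> str:
        """Zonemaster role: place a data key on a spanserver.

        Assumption: the least-loaded server is chosen, ties going to the
        first server in the list.
        """
        if key in self.assignment:
            return self.assignment[key]
        load = {server: 0 for server in self.spanservers}
        for server in self.assignment.values():
            load[server] += 1
        target = min(self.spanservers, key=lambda server: load[server])
        self.assignment[key] = target
        return target

    def locate(self, key: str) -> Optional[str]:
        """Location proxy role: tell a client which spanserver holds the key."""
        return self.assignment.get(key)


zone = Zone("zone-a", ["spanserver-1", "spanserver-2"])
for key in ("users", "orders", "adverts"):
    zone.assign(key)
print(zone.locate("orders"))
```

A client would ask the location proxy first and then talk to the returned spanserver directly.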
6. Google Spanner – Architecture
7. Google Spanner – Architecture
● Each spanserver manages 100's of tablets
● Each spanserver has a Paxos state machine
● The Paxos state machine supports replication
● Lead replica has lock table
● Lead replica has transaction manager
● For transactions spanning multiple Paxos groups
– Two-phase commit is used to control the transaction
– A coordinator leader and coordinator slaves are used
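The two-phase commit mentioned above can be sketched in a few lines. This is a minimal in-memory illustration, not Spanner's implementation: the `Participant` class stands in for the leader of one Paxos group, and replication, locking, timeouts and failure recovery are all left out.

```python
class Participant:
    """Stands in for the leader of one Paxos group in a transaction (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # whether this group is able to commit its part
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local part of the transaction can commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(coordinator, others):
    """Run both phases; the coordinator group also takes part in the vote."""
    participants = [coordinator] + list(others)
    # Phase 1: collect a vote from every group.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


groups = [Participant("group-1"), Participant("group-2"), Participant("group-3")]
print(two_phase_commit(groups[0], groups[1:]))
```

A single "no" vote in phase 1 is enough to abort the transaction in every group.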
8. Google Spanner – Terms
● NewSQL
– A modern RDBMS that scales like NoSQL but offers
OLTP ACID guarantees
● BigTable
– Google's storage system built on GFS
● Google F1
– Google's RDBMS for the AdWords system
● Paxos
– An algorithm for reaching consensus in a network of
unreliable processors
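Paxos copes with unreliable processors by requiring a majority of replicas to accept a value. The sketch below shows only that majority rule, not the full protocol, and the function names are illustrative.

```python
def quorum_size(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can fail while a majority is still reachable."""
    return (replicas - 1) // 2


def is_chosen(accepts, replicas):
    """A value counts as chosen once a majority of replicas has accepted it."""
    return accepts >= quorum_size(replicas)


for n in (3, 5, 7):
    print(n, "replicas: quorum", quorum_size(n), "tolerates", tolerated_failures(n), "failures")
```

Any two majorities share at least one replica, which is why two different values cannot both be chosen.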
9. Google Spanner – Terms
● RDBMS
– Relational Database Management System
● NoSQL
– A database optimized for large storage volumes that offers a
less constrained consistency model than a traditional
RDBMS
● ACID
– ACID (Atomicity, Consistency, Isolation, Durability) is a set of
properties that guarantee that database transactions are
processed reliably
● OLTP
– Online transaction processing
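The atomicity and consistency parts of ACID can be shown with a small example. It uses Python's built-in `sqlite3` module rather than Spanner, and the `accounts` table and `transfer` function are made up for illustration: a transfer that would break a constraint is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no overdrafts
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically; return True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

In the second call the credit to `bob` runs first and succeeds, but it is undone when the debit from `alice` violates the check, so neither half of the transfer survives.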
10. Google Spanner – Terms
● Time Synchronisation
– The coordination of events to operate a system in unison
● Global Consistency
– Ensuring global users have a consistent view of data
● Atomic clock
– Clocks based upon atomic physics principles
● GPS clock
– Clocks that use GPS to determine time from multiple
satellite atomic clocks
11. Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve them