Introduction to NoSQL Databases
San Diego NoSQL Meetup – Aug 2010
By Derek Stainer
http://nosqldatabases.com
Agenda
  • Introduction
  • Objective
  • Explore NoSQL Databases
  • Conclusion
Introduction
  • UCSD graduate in Computer Science
  • Java developer for 10 years
  • Creator of http://nosqldatabases.com
  • Curator of NoSQL information
Objective
  • Deeper dive into each type of NoSQL database
  • Discuss 1-2 NoSQL databases in each family of databases
NoSQL Taxonomy
  • Key/Value
  • Document
  • Column
  • Graph
  • Others
      • Geospatial
      • File System
      • Object
Key/Value Databases
  • Global collection of key/value pairs
  • Inspired by Amazon's Dynamo and distributed hash tables
  • Designed to handle massive load
  • Multiple types
      • In memory, e.g. Memcached
      • On disk, e.g. Redis (data held in memory, persisted to disk), SimpleDB
      • Eventually consistent, e.g. Dynamo, Voldemort
Key/Value: Voldemort
  • Created by LinkedIn, now open source
  • Inspired by Amazon's Dynamo
  • Written in Java
  • Pluggable storage: BerkeleyDB, in memory, MySQL
  • Pluggable serialization: JSON, Thrift, Protocol Buffers, etc.
  • Cluster rebalancing
Key/Value: Voldemort [cont'd]
  • Versioning based on vector clocks
  • Reconciliation occurs on reads
  • Partitioning and replication based on Dynamo
      • Consistent hashing
      • Virtual nodes
      • Gossip
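
The vector-clock versioning mentioned above can be sketched in a few lines. This is a minimal illustration of the general technique, not Voldemort's actual classes: the names `increment`, `descends` and `compare` and the node ids are assumptions made up for the example. Two versions are "concurrent" when neither clock has seen everything the other has, which is the case a store must hand back to the reader for reconciliation.

```python
from typing import Dict

# A vector clock maps a node id to the number of updates that node has made.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` has seen every update that `b` has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify `a` relative to `b`: 'equal', 'after', 'before' or 'concurrent'."""
    a_covers_b = descends(a, b)
    b_covers_a = descends(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "after"
    if b_covers_a:
        return "before"
    return "concurrent"


# Two replicas start from the same version and then diverge.
base = increment({}, "node_a")      # written once via node_a
left = increment(base, "node_a")    # updated again via node_a
right = increment(base, "node_b")   # updated independently via node_b

print(compare(left, base))   # left is a newer version of base
print(compare(left, right))  # neither saw the other's update
```

When `compare` reports "concurrent", both versions are kept and the client resolves them on the next read, which matches the "reconciliation occurs on reads" bullet.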
Other Key/Value Stores
  • Amazon's Dynamo
  • Riak
  • Redis
  • Memcached
  • SimpleDB
Document Databases
  • Similar to a key/value database, with one major difference: the value is a document
  • Inspired by Lotus Notes
  • Flexible schema
      • Any number of fields can be added
  • Documents stored in JSON or BSON format
  • Examples: CouchDB, MongoDB
Sample Document
    {
      "day": [2010, 1, 23],
      "products": {
        "apple": { "price": 10, "quantity": 6 },
        "kiwi": { "price": 20, "quantity": 2 }
      },
      "checkout": 100
    }
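
To make the flexible-schema point concrete, here is a small sketch that loads the sample document with Python's standard `json` module. The document is written as strict JSON (commas between fields, the month as `1` rather than `01`), and the extra `pear` product is a made-up field added only to show that no schema change is needed.

```python
import json

# The sample document from the slide, written as strict JSON.
raw = """
{
  "day": [2010, 1, 23],
  "products": {
    "apple": {"price": 10, "quantity": 6},
    "kiwi": {"price": 20, "quantity": 2}
  },
  "checkout": 100
}
"""

order = json.loads(raw)

# Recompute the order total from the embedded product entries.
total = sum(item["price"] * item["quantity"] for item in order["products"].values())
print(total == order["checkout"])

# Flexible schema: a new field is just another key (hypothetical product).
order["products"]["pear"] = {"price": 5, "quantity": 1}
```

The recomputed total (10 × 6 + 20 × 2) agrees with the `checkout` value stored in the document.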
Document: CouchDB
  • Development began around 2005 by Damien Katz, a former Lotus Notes developer
  • Couch: Cluster Of Unreliable Commodity Hardware
  • Top-level Apache project
  • Commercially supported by CouchIO
  • Licensed under the Apache License
  • Written in Erlang
  • Documents are stored in JSON
Document: CouchDB [cont'd]
  • B-tree storage engine
  • MVCC model, no locking
  • No joins, primary keys or foreign keys (UUIDs are auto-assigned)
  • Built-in bidirectional replication
      • Can even run offline, then come back and sync changes
  • Custom persistent views using MapReduce
  • REST API
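
The idea behind a MapReduce view can be sketched as follows. CouchDB views are normally written as JavaScript functions stored in design documents, so this Python version is only an emulation of the concept: `map_product_revenue`, `build_view` and the sample documents are hypothetical names and data, and the sketch recomputes everything in memory rather than maintaining a persistent index.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Document = Dict[str, Any]
Emit = Tuple[Any, Any]


def map_product_revenue(doc: Document) -> Iterator[Emit]:
    """Map step: emit (product name, revenue) for each product in a document."""
    for name, item in doc.get("products", {}).items():
        yield name, item["price"] * item["quantity"]


def build_view(
    docs: Iterable[Document],
    map_fn: Callable[[Document], Iterator[Emit]],
    reduce_fn: Callable[[List[Any]], Any],
) -> Dict[Any, Any]:
    """Run the map step over every document, group by key, then reduce each group."""
    grouped: Dict[Any, List[Any]] = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Views are ordered by key, so the result is built in sorted key order.
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


docs = [
    {"_id": "order-1", "products": {"apple": {"price": 10, "quantity": 6},
                                    "kiwi": {"price": 20, "quantity": 2}}},
    {"_id": "order-2", "products": {"apple": {"price": 10, "quantity": 1}}},
    {"_id": "note-1", "text": "a document without a products field"},
]

revenue_by_product = build_view(docs, map_product_revenue, sum)
print(revenue_by_product)
```

Documents that lack the `products` field simply emit nothing, which is how a view copes with documents of different shapes in the same database.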
Document: MongoDB
  • Development started in 2007
  • Commercially supported and developed by 10gen
  • Stores documents using BSON
  • Supports ad hoc queries
      • Can query against embedded objects and arrays
  • Supports multiple types of indexes
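
Querying against embedded objects and arrays can be illustrated without a database. The sketch below imitates, in plain Python, the behaviour of a dotted field path and of an equality match against an array field; it is not MongoDB driver code, and `get_path`, `find` and the sample records are assumptions made up for the example.

```python
from typing import Any, Dict, List

_MISSING = object()  # sentinel for "this path does not exist in the document"


def get_path(doc: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'address.city' into nested dictionaries."""
    current: Any = doc
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return _MISSING
    return current


def find(docs: List[Dict[str, Any]], path: str, expected: Any) -> List[Dict[str, Any]]:
    """Return documents whose value at `path` equals `expected`.

    If the value is a list, the document also matches when the list
    contains `expected`.
    """
    matches = []
    for doc in docs:
        value = get_path(doc, path)
        if value is _MISSING:
            continue
        if value == expected or (isinstance(value, list) and expected in value):
            matches.append(doc)
    return matches


people = [
    {"name": "Ada", "address": {"city": "San Diego"}, "tags": ["java", "nosql"]},
    {"name": "Bob", "address": {"city": "Austin"}, "tags": ["sql"]},
    {"name": "Cy", "tags": ["nosql"]},
]

in_san_diego = find(people, "address.city", "San Diego")  # embedded object
nosql_fans = find(people, "tags", "nosql")                # array membership
print([p["name"] for p in in_san_diego], [p["name"] for p in nosql_fans])
```

A real deployment would send the equivalent query to the server, where an index on the dotted path can serve it.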
Document: MongoDB [cont'd]
  • Officially supported drivers available for multiple languages
      • C, C++, Java, JavaScript, Perl, PHP, Python and Ruby
  • Community-supported drivers include:
      • Scala, Node.js, Haskell, Erlang, Smalltalk
  • Replication uses a master/slave model
  • Scales horizontally via sharding
  • Written in C++
Column Family Databases
  • Each key is associated with multiple attributes (i.e. columns)
  • Hybrid row/column stores
  • Inspired by Google BigTable
  • Examples: HBase, Cassandra
Column: HBase
  • Based on Google's BigTable
  • Apache top-level project (TLP)
  • Cloudera (certifications, EC2 AMIs, etc.)
  • Layered over HDFS (Hadoop Distributed File System)
  • Input/output for MapReduce jobs
  • APIs: Thrift, REST
Column: HBase [cont'd]
  • Automatic partitioning
  • Automatic re-balancing/re-partitioning
  • Fault tolerant
      • HDFS
      • Multiple replicas
  • Highly distributed
Column: HBase [cont'd]
  [Diagram; credit: Lars George]
Column: Cassandra
  • Created at Facebook for Inbox search
  • Facebook -> Google Code -> ASF
  • Commercial support available from Riptano
  • Features taken from both Dynamo and BigTable
      • Dynamo: consistent hashing, partitioning, replication
      • BigTable: column families, MemTables, SSTables
Column: Cassandra [cont'd]
  • Symmetric nodes
  • No single point of failure
  • Linearly scalable
  • Ease of administration
  • Flexible/automated provisioning
  • Flexible replica replacement
  • High availability
  • Eventual consistency
      • However, consistency is tunable
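
One common way to reason about tunable consistency in Dynamo-style stores is the quorum rule: with N replicas, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > N. The slide does not spell this rule out, so the sketch below is an illustration of the standard argument; the function names are assumptions.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read set of size r must intersect every write set of size w."""
    return r + w > n


N = 3  # replication factor used for the examples below

# Quorum reads and quorum writes overlap in at least one replica.
print(quorum(N), reads_see_latest_write(N, quorum(N), quorum(N)))

# Reading and writing a single replica is fast but only eventually consistent.
print(reads_see_latest_write(N, 1, 1))

# Writing to all replicas lets a single-replica read see the latest value.
print(reads_see_latest_write(N, 1, N))
```

Lower values of R and W trade consistency for latency and availability, which is the dial the "tunable" bullet refers to.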
Column: Cassandra [cont'd]
  • Partitioning
      • Random
          • Good distribution of data between nodes
          • Range scans not possible
      • Order preserving
          • Can lead to unbalanced nodes
          • Range scans, natural order
      • Custom
  • Extremely fast reads/writes (low latency)
  • Thrift API
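
The trade-off between the two partitioners can be sketched with a toy token ring. This is a simplified model, not Cassandra code: the `Ring` class, the node names and the token assignments are assumptions for illustration. The random partitioner is modelled by hashing each key with MD5, and the order-preserving one by using the key itself as the token.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 128  # MD5 digests are 128 bits wide


def md5_token(key: str) -> int:
    """Random partitioning: the token is a hash of the key."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Sorted (token, node) pairs; a node owns tokens up to and including its own."""

    def __init__(self, entries):
        self._entries = sorted(entries)
        self._tokens = [token for token, _ in self._entries]

    def owner(self, token):
        # First node whose token is >= the given token, wrapping around the ring.
        index = bisect.bisect_left(self._tokens, token) % len(self._entries)
        return self._entries[index][1]


# Hypothetical three-node clusters.
random_ring = Ring([
    (0, "node_a"),
    (TOKEN_SPACE // 3, "node_b"),
    (2 * TOKEN_SPACE // 3, "node_c"),
])
ordered_ring = Ring([("h", "node_a"), ("p", "node_b"), ("z", "node_c")])

keys = [f"user:{i:03d}" for i in range(100)]

# Hashing scatters keys that share a prefix across the ring.
random_owners = [random_ring.owner(md5_token(key)) for key in keys]

# Order-preserving placement keeps neighbours together, so one node takes them all.
ordered_owners = [ordered_ring.owner(key) for key in keys]

print(sorted(set(random_owners)), sorted(set(ordered_owners)))
```

Because the ordered ring keeps adjacent keys on the same node, a scan over the `user:` range touches one node, at the price of that node carrying the whole load. With hashed tokens the load spreads out, but adjacent keys no longer sit next to each other, so a range scan has no contiguous region to read.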
Column: Cassandra [cont'd]
  • Column
      • Basic unit of storage
  • Column family
      • Collection of like records
      • Record-level atomicity
      • Indexed
  • Keyspace
      • Top-level namespace
      • Usually one per application
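
The keyspace, column family and column hierarchy can be pictured as nested maps. The sketch below is only a mental model using Python dictionaries; the helper names, the `Users` column family and the sample rows are made up for the example, and nothing here reflects how Cassandra lays data out on disk.

```python
from typing import Dict

Row = Dict[str, str]             # column name -> column value
ColumnFamily = Dict[str, Row]    # row key -> row
Keyspace = Dict[str, ColumnFamily]  # column family name -> column family


def insert(keyspace: Keyspace, column_family: str, row_key: str, columns: Row) -> None:
    """Add or overwrite columns on one row, creating the row if needed."""
    rows = keyspace.setdefault(column_family, {})
    rows.setdefault(row_key, {}).update(columns)


def get_row(keyspace: Keyspace, column_family: str, row_key: str) -> Row:
    """Return the row's columns ordered by column name."""
    row = keyspace.get(column_family, {}).get(row_key, {})
    return dict(sorted(row.items()))


app_keyspace: Keyspace = {}  # usually one keyspace per application
insert(app_keyspace, "Users", "alice", {"name": "Alice", "email": "alice@example.com"})
insert(app_keyspace, "Users", "bob", {"name": "Bob"})
insert(app_keyspace, "Users", "alice", {"city": "San Diego"})  # rows need not share columns

print(get_row(app_keyspace, "Users", "alice"))
```

Rows in the same column family can carry different sets of columns, which is what distinguishes this model from a fixed relational table.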
Column: Cassandra [cont'd]
  [Diagram; credit: Eric Evans]
Column: Cassandra [cont'd]
  • Column details
      • Name
          • byte[]
          • Queried against
          • Determines sort order
      • Value
          • byte[]
          • Opaque to Cassandra
      • Timestamp
          • long
          • Conflict resolution (last write wins)
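
The three-part column and its last-write-wins rule can be sketched as follows. This is an illustration under assumptions, not Cassandra's implementation: the `Column` type and `reconcile` function are hypothetical, the timestamps are arbitrary integers, and ties are settled in favour of the first argument purely to keep the example short.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: bytes      # queried against; determines sort order
    value: bytes     # opaque payload
    timestamp: int   # supplied by the writer; used to resolve conflicts


def reconcile(a: Column, b: Column) -> Column:
    """Last write wins: keep whichever version carries the higher timestamp."""
    if a.name != b.name:
        raise ValueError("can only reconcile two versions of the same column")
    return a if a.timestamp >= b.timestamp else b


older = Column(b"email", b"old@example.com", timestamp=1000)
newer = Column(b"email", b"new@example.com", timestamp=2000)

winner = reconcile(older, newer)
print(winner.value)

# Columns within a row are kept sorted by name.
row = sorted([Column(b"name", b"Alice", 1000), winner, Column(b"city", b"San Diego", 1500)])
print([column.name for column in row])
```

Because the winner is chosen by timestamp alone, writers with skewed clocks can silently overwrite each other, which is the usual caveat attached to last-write-wins.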
Graph Databases
  • Inspired by Euler's graph theory, G = (V, E)
  • Focused on modeling the structure of the data
  • Property graph data model
  • Examples: Neo4j, InfiniteGraph
Sample Property Graph
  [Diagram; credit: Todd Hoff]
Graph: Neo4j
  • Data model: property graph
      • Nodes: Person, Place, Thing, etc.
      • Relationships: Lives, Likes, Owns, etc.
      • Properties on both
  • Primary operation is graph traversal between nodes
  • Written in Java
  • Embedded database
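
A property graph and a traversal over it can be sketched in plain Python. This does not use the Neo4j API: the `Relationship` type, the `traverse` function and the sample people and places are assumptions for illustration. Nodes and relationships both carry properties, and the traversal is a breadth-first walk along outgoing relationships.

```python
from collections import deque
from typing import Any, Dict, List, NamedTuple, Optional


class Relationship(NamedTuple):
    start: int
    type: str
    end: int
    properties: Dict[str, Any]


# Nodes keyed by id, each with its own properties (hypothetical data).
nodes: Dict[int, Dict[str, Any]] = {
    1: {"kind": "Person", "name": "Alice"},
    2: {"kind": "Person", "name": "Bob"},
    3: {"kind": "Place", "name": "San Diego"},
    4: {"kind": "Thing", "name": "Bicycle"},
}

relationships: List[Relationship] = [
    Relationship(1, "LIKES", 2, {"since": 2008}),
    Relationship(2, "LIVES_IN", 3, {}),
    Relationship(2, "OWNS", 4, {"color": "red"}),
]


def traverse(start: int, rel_type: Optional[str] = None, max_depth: int = 2) -> List[int]:
    """Breadth-first walk over outgoing relationships, up to `max_depth` hops."""
    visited = {start}
    order: List[int] = []
    queue = deque([(start, 0)])
    while queue:
        node_id, depth = queue.popleft()
        if depth == max_depth:
            continue
        for rel in relationships:
            if rel.start != node_id:
                continue
            if rel_type is not None and rel.type != rel_type:
                continue
            if rel.end not in visited:
                visited.add(rel.end)
                order.append(rel.end)
                queue.append((rel.end, depth + 1))
    return order


reachable = traverse(1)                  # everything within two hops of Alice
liked = traverse(1, rel_type="LIKES")    # follow only LIKES relationships
print([nodes[n]["name"] for n in reachable], [nodes[n]["name"] for n in liked])
```

A graph database stores the adjacency directly, so each hop is a local lookup rather than a scan of all relationships as in this toy version.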
Graph: Neo4j [cont'd]
  • Disk-based
      • Graph stored in a custom binary format
  • Transactional
      • JTA/JTS, XA, 2PC, MVCC
  • Scales
      • Billions of nodes/relationships/properties per JVM
  • Robust
      • 6+ years in 24/7 production
Graph: Neo4j [cont'd]
  • Multiple language bindings
      • Jython, CPython
      • JRuby (including RESTful API)
      • Clojure
      • Scala (including RESTful API)
  • Uses
      • Social graph, e.g. Facebook
      • Recommendation engines
      • Financial audit
Graph: Neo4j [cont'd]
  • Licensed under AGPLv3
  • Dual commercial license available
      • First server is free
      • Second server is inexpensive
  • Commercial support provided by Neo Technology
Other Graph Databases
  • InfiniteGraph
  • HyperGraphDB
  • sones
Conclusion
Thank You!
References
  • NoSQL Databases - Part 1 - Landscape, Vineet Gupta. http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html
  • NoSQL for Dummies, Tobias Ivarsson. http://www.slideshare.net/thobe/nosql-for-dummies
  • NoSQL Databases, Marin Dimitrov. http://www.slideshare.net/marin_dimitrov/nosql-databases-3584443
  • CouchDB vs. MongoDB, Gabriele Lana. http://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288
  • HBase, Ryan Rawson. http://www.slideshare.net/adorepump/hbase-nosql
  • Introduction to Cassandra, Gary Dusbabek. http://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010
  • Cassandra Explained, Eric Evans. http://www.slideshare.net/jericevans/cassandra-explained
  • Towards Robust Distributed Systems, Eric Brewer. http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  • Cassandra - A Decentralized Structured Storage System, Lakshman, LADIS 2009. http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
  • Bigtable: A Distributed Storage System for Structured Data, Google Inc. http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf
  • Dynamo: Amazon's Highly Available Key-value Store, Amazon Inc. http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
  • HBase Architecture 101 - Storage, Lars George. http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
  • BASE: An ACID Alternative, Dan Pritchett

Introduction to NoSQL Databases

  • 1. Introduction to NoSQL Databases San Diego NoSQL Meetup – Aug 2010 By Derek Stainer http://nosqldatabases.com
  • 2. Agenda Introduction Objective Explore NoSQL Databases Conclusion
  • 3. Introduction UCSD Graduate in Computer Science Java Developer for 10 years Creator of http://nosqldatabases.com Curator of NoSQL information
  • 4. Objective Deeper dive into each type of NoSQL database Discuss 1-2 NoSQL databases in each family of databases
  • 5. NoSQL Taxonomy Key/Value Document Column Graph Others Geospatial File System Object
  • 6. Key/Value Databases Global collection of Key/Value pairs Inspired by Amazon’s Dynamo and Distributed Hashtables Designed to handle massive load Multiple Types In memory i.e. Memcache On Disk i.e. Redis, SimpleDB Eventually Consistent i.e. Dynamo, Voldemort
  • 7. Key/Value: Voldemort Created by LinkedIn, now open source Inspired by Amazon’s Dynamo Written in Java Pluggable Storage BerkeleyDB, In Memory, MySQL Pluggable Serialization JSON, Thrift, Protocol Buffers, etc. Cluster Rebalancing
  • 8. Key/Value: Voldemort Versioning, based on Vector Clocks Reconciliation occurs on reads. Partitioning and Replication based on Dynamo Consistent Hashing Virtual Nodes Gossip
  • 9. Other Key/Value Stores Other Key/Value Stores Amazon’s Dynamo Riak Redis Memcache SimpleDB
  • 10. Document Databases Similar to a Key/Value database but with a major difference, value is a document Inspired by Lotus Notes Flexible Schema Any number of fields can be added Documents stored in JSON or BSON formats Examples: CouchDB, MongoDB
  • 11. Sample Document { "day": [ 2010, 01, 23 ], "products": { "apple": { "price": 10 "quantity": 6 }, "kiwi": { "price": 20 "quantity": 2 } }, "checkout": 100 }
  • 12. Document: CouchDB Development began ~ 2005 by Damien Katz former Lotus Notes Developer Couch – Cluster Of Unreliable Commodity Hardware Top level Apache Project Commercially supported by CouchIO Licensed under Apache License Written in Erlang Documents are stored in JSON
  • 13. Document: CouchDB [cont’d] B-Tree Storage Engine MVCC model, no locking No joins, primary key or foreign key (UUIDs are auto assigned) Built bi-directional replication Can even run offline, come back and sync back changes Custom persistent views using MapReduce REST API
  • 14. Document: MongoDB Development started in 2007 Commercially supported and developed by 10gen Stores documents using BSON Supports ad hoc queries Can query against embedded objects and arrays Supports multiple types of indexing
  • 15. Document: MongoDB [cont’d] Officially supported drivers available for multiple languages C, C++, Java, JavaScript, Perl, PHP, Python and Ruby Community supported drivers include: Scala, Node.js, Haskell, Erlang, Smalltalk Replication uses a master/slave model Scales horizontally via sharding Written in C++
  • 16. Column Family Databases Each key is associated with multiple attributes (i.e. Columns) Hybrid row/column stores Inspired by Google BigTable Examples: HBase, Cassandra
  • 17. Column: HBase Based on Google’s BigTable Apache Project TLP Cloudera (certifications, EC2 AMI’s, etc.) Layered over HDFS (Hadoop Distributed File System) Input/Output for MapReduce Jobs APIs Thrift, REST
  • 18. Column: HBase [cont’d] Automatic partitioning Automatic re-balancing/re-partitioning Fault tolerant HDFS Multiple Replicas Highly distributed
  • 20. Column: Cassandra Created at Facebook for Inbox search Facebook -> Google Code -> ASF Commercial Support available from Riptano Features taken from both Dynamo and BigTable Dynamo – Consistent hashing, Partitioning, Replication BigTable – Column Families, MemTables, SSTables
  • 21. Column: Cassandra [cont’d] Symmetric nodes No single point of failure Linearly scalable Ease of administration Flexible/Automated Provisioning Flexible Replica Replacement High Availability Eventual Consistency However, consistency is tuneable
  • 22. Column: Cassandra [cont’d] Partitioning Random Good distribution of data between nodes Range scans not possible Order Preserving Can lead to unbalanced nodes Range scans, Natural Order Custom Extremely fast reads/writes (low latency) Thrift API
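The trade-off between the random and order-preserving partitioners can be illustrated with a small sketch. It is not Cassandra code: SHA-256 is an arbitrary hash chosen for this example, and `random_token` and `range_scan` are hypothetical helpers.

```python
import hashlib
from bisect import bisect_left, bisect_right
from typing import List


def random_token(key: str) -> int:
    """Random partitioning: place a key by the hash of its bytes (SHA-256 is an assumption)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


def range_scan(ordered_keys: List[str], start: str, end: str) -> List[str]:
    """Return the keys in [start, end]; only valid when keys are stored in key order."""
    lo = bisect_left(ordered_keys, start)
    hi = bisect_right(ordered_keys, end)
    return ordered_keys[lo:hi]


keys = [f"user{n:03d}" for n in range(8)]

# Order-preserving layout: neighbouring keys sit next to each other, so a
# range scan is one contiguous slice (but hot key ranges can overload a node).
by_key = sorted(keys)
print(range_scan(by_key, "user002", "user004"))

# Random layout: positions follow the hash, which spreads keys evenly but
# leaves no relationship between a key's position and its neighbours.
by_token = sorted(keys, key=random_token)
print(by_token)
```

Under the hashed layout a range query would have to visit every position, which is the sense in which range scans are not possible with random partitioning.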
  • 23. Column: Cassandra [cont’d] Column Basic unit of storage Column Family Collection of like records Record level atomicity Indexed Keyspace Top level namespace Usually one per application
  • 25. Column: Cassandra [cont’d] Column Details Name byte[] Queried against Determines sort order Value byte[] Opaque to Cassandra Timestamp long Conflict resolution (last write wins)
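A column as described here (name, value, timestamp) and the last-write-wins rule can be sketched as below. The `Column` class and `resolve` function are hypothetical illustrations, not Cassandra's API, and the handling of equal timestamps is an arbitrary choice made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: bytes      # queried against; determines sort order
    value: bytes     # opaque payload
    timestamp: int   # supplied by the writer; used for conflict resolution


def resolve(a: Column, b: Column) -> Column:
    """Last write wins: keep whichever version carries the higher timestamp."""
    if a.name != b.name:
        raise ValueError("can only resolve two versions of the same column")
    # On a tie this sketch keeps the first argument (an assumption).
    return a if a.timestamp >= b.timestamp else b


old = Column(b"email", b"derek@example.org", timestamp=1000)
new = Column(b"email", b"derek@example.com", timestamp=2000)
print(resolve(old, new).value)

# Columns sort by name, which is what makes slices over names possible.
row = [Column(b"name", b"Derek", 1), Column(b"city", b"San Diego", 1), new]
print([c.name for c in sorted(row, key=lambda c: c.name)])
```

Since the timestamp alone decides the winner, writers with skewed clocks can silently overwrite newer data.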
  • 26. Graph Databases Inspired by Euler Graph Theory, G=(E,V) Focused on modeling the structure of the data Property Graph Data Model Examples: Neo4j, InfiniteGraph
  • 28. Graph: Neo4j Data Model: Property Graph Nodes – Person, Place, Thing, etc. Relationships – Lives, Likes, Owns, etc. Properties on Both Primary operation is graph traversal between nodes Written in Java Embedded database
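The property graph model and traversal described on this slide can be sketched in plain Python. This is not the Neo4j API: the classes `Node`, `Relationship` and `PropertyGraph`, the relationship type strings, and the sample data are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    node_id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: int
    rel_type: str
    end: int
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Nodes and typed, directed relationships, with properties on both."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.outgoing: Dict[int, List[Relationship]] = {}

    def add_node(self, node_id: int, **properties: Any) -> Node:
        node = Node(node_id, dict(properties))
        self.nodes[node_id] = node
        self.outgoing.setdefault(node_id, [])
        return node

    def relate(self, start: int, rel_type: str, end: int, **properties: Any) -> Relationship:
        rel = Relationship(start, rel_type, end, dict(properties))
        self.outgoing.setdefault(start, []).append(rel)
        return rel

    def traverse(self, start: int, rel_type: str) -> List[int]:
        """Breadth-first traversal following only relationships of one type."""
        seen = {start}
        order: List[int] = []
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for rel in self.outgoing.get(current, []):
                if rel.rel_type == rel_type and rel.end not in seen:
                    seen.add(rel.end)
                    order.append(rel.end)
                    queue.append(rel.end)
        return order


graph = PropertyGraph()
graph.add_node(1, kind="Person", name="Alice")
graph.add_node(2, kind="Person", name="Bob")
graph.add_node(3, kind="Person", name="Carol")
graph.add_node(4, kind="Place", name="San Diego")
graph.relate(1, "LIKES", 2, since=2009)
graph.relate(2, "LIKES", 3)
graph.relate(1, "LIVES", 4)

liked = [graph.nodes[n].properties["name"] for n in graph.traverse(1, "LIKES")]
print(liked)
```

The traversal walks relationships outward from a start node rather than scanning a table, which is why this operation is the primary one in a graph database.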
  • 29. Graph: Neo4j [cont’d] Disk-based Graph stored in custom binary format Transactional JTA/JTS, XA, 2PC, MVCC Scales Billions of nodes/relationships/properties per JVM Robust 6+ years in 24/7 production
  • 30. Graph: Neo4j [cont’d] Multiple language bindings Jython, CPython, JRuby (including RESTful API), Clojure, Scala (including RESTful API) Uses Social Graph e.g. Facebook Recommendation Engines Financial Audit
  • 31. Graph: Neo4j [cont’d] Licensed under AGPLv3 Dual Commercial License Available First server is free Second server Inexpensive Commercial support provided by Neo Technology
  • 32. Other Graph Databases InfiniteGraph HyperGraphDB sones
  • 35. References NoSQL Databases - Part 1 – Landscape, Vineet Gupta http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html NoSQL for Dummies, Tobias Ivarsson http://www.slideshare.net/thobe/nosql-for-dummies NoSQL Databases, Marin Dimitrov http://www.slideshare.net/marin_dimitrov/nosql-databases-3584443 CouchDB vs. MongoDB, Gabriele Lana http://www.slideshare.net/gabriele.lana/couchdb-vs-mongodb-2982288 HBase, Ryan Rawson http://www.slideshare.net/adorepump/hbase-nosql Introduction to Cassandra, Gary Dusbabek http://www.slideshare.net/gdusbabek/introduction-to-cassandra-june-2010 Cassandra Explained, Eric Evans http://www.slideshare.net/jericevans/cassandra-explained Towards Robust Distributed Systems, Eric Brewer http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf Cassandra - A Decentralized Structured Storage System, Lakshman, Ladis http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
  • 36. References [cont’d] Bigtable: A Distributed Storage System for Structured Data, Google Inc. http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf Dynamo: Amazon’s Highly Available Key-value Store, Amazon Inc. http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf HBase Architecture 101 – Storage, Lars George http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html BASE: An ACID Alternative, Dan Pritchett

Editor's Notes

  1. Surveying the NoSQL Landscape, By Derek Stainer
  2. Indexing types include single-key, compound, unique, non-unique, and geospatial