6. When are RDBMS less than ideal?
Your data is stored and retrieved mainly by primary key,
without complex joins.
You have a non-trivial amount of data, and the
thought of managing lots of RDBMS shards and
replication failure scenarios gives you the fear.
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
8. Some of the players
Project Voldemort
Ringo
Scalaris
Kai
Dynomite
MemcacheDB
ThruDB
CouchDB
Cassandra
HBase
Hypertable
Tokyo Cabinet/Tyrant
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
10. Go kick the tires
• Lightning fast
• Works best for flat objects
• Tokyo Tyrant for network access
http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/
13. CouchDB
Apache CouchDB is a distributed, fault-tolerant and
schema-free document-oriented database accessible
via a RESTful HTTP/JSON API.
http://couchdb.apache.org/
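Because the API is plain HTTP and JSON, any HTTP client can talk to CouchDB. A minimal sketch using only Ruby's standard library, assuming a server on CouchDB's default port 5984 on localhost. The helper name `build_put_request`, the database name `blog` and the document id `first-post` are made up for illustration. The request is only built here, not sent.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical helper: build (but do not send) the request that stores
# a document. PUT /<database>/<doc id> with a JSON body saves the
# document under that id.
def build_put_request(base_url, database, doc_id, document)
  uri = URI("#{base_url}/#{database}/#{doc_id}")
  request = Net::HTTP::Put.new(uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(document)
  request
end

request = build_put_request('http://localhost:5984', 'blog', 'first-post',
                            'title' => 'Hello', 'tags' => %w[couchdb nosql])

# To actually send it (needs a running CouchDB with a "blog" database):
# response = Net::HTTP.start('localhost', 5984) { |http| http.request(request) }
# puts response.body
```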
20. SQL vs. CouchDB
SQL                                              | CouchDB
Predefined, explicit schema                      | Dynamic, implicit schema
Uniform tables of data                           | Collection of named documents with varying structure
Normalized. Objects spread across tables.        | Denormalized. Docs usually self contained.
Duplication reduced.                             | Data often duplicated.
Must know schema to read/write a complete object | Must know only document name
Dynamic queries of static schemas                | Static queries of dynamic schemas
http://damienkatz.net/files/What is CouchDB.pdf
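The normalized versus denormalized row can be illustrated with plain Ruby data structures. This is only a sketch with made-up data and helper names (`load_user_normalized`, `load_user_document`); no database is involved.

```ruby
# Normalized, SQL style: one object is spread across two tables and has
# to be reassembled by following the foreign key (you must know the schema).
USERS     = [{ id: 1, name: 'Ada' }].freeze
ADDRESSES = [{ user_id: 1, city: 'London' }].freeze

def load_user_normalized(id)
  user    = USERS.find { |row| row[:id] == id }
  address = ADDRESSES.find { |row| row[:user_id] == id }
  user.merge(address: { city: address[:city] })
end

# Denormalized, document style: the whole object lives under one
# document name, so the name is all you need to know.
DOCUMENTS = {
  'user-1' => { name: 'Ada', address: { city: 'London' } }
}.freeze

def load_user_document(name)
  DOCUMENTS.fetch(name)
end
```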
30. COLLECTION
• Think table, but with no schema
• For grouping documents into smaller query sets (speed)
• Each top-level entity in your app would have its own collection (users, articles, etc.)
• Indexable by one or more keys
31. DOCUMENT
• Stored in a collection, think record or row
• Can have an _id key that works like a primary key in MySQL
• Two options for relationships: subdocument or db reference
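The two relationship options can be sketched with plain hashes. This is an in-memory illustration with made-up data, not driver code; the reference uses the `$ref`/`$id` shape of a db reference, and `resolve` is a hypothetical helper.

```ruby
# In-memory stand-in for a database: collection name => documents.
DB = {
  'users' => [{ '_id' => 1, 'name' => 'John' }],
  'articles' => [
    {
      '_id' => 10,
      'title' => 'Hello Mongo',
      # Option 1: subdocuments, stored inside the parent document.
      'comments' => [{ 'body' => 'Nice post' }],
      # Option 2: a reference naming the collection and _id to look up.
      'author' => { '$ref' => 'users', '$id' => 1 }
    }
  ]
}.freeze

# Following a reference costs a second lookup.
def resolve(ref)
  DB.fetch(ref['$ref']).find { |doc| doc['_id'] == ref['$id'] }
end
```

Subdocuments come back with the parent in one read; references avoid duplication at the price of the extra lookup.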
35. MORE QUERYING
• $in, $nin, $all, $ne, $gt, $gte, $lt, $lte, $size, $where
• :fields (like :select in active record)
• :limit, :offset for pagination
• :sort ascending or descending [['foo', 1], ['bar', -1]]
• count and group (uses map/reduce)
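To make the operators concrete, here is a small in-memory matcher that mimics their meaning on arrays of hashes. It is a simplified sketch, not how Mongo or the Ruby driver evaluates queries: `$where` is left out, `$in` only compares scalar values, and `matches?` and `find_docs` are hypothetical names.

```ruby
# Each operator receives the stored value and the operand from the query.
OPERATORS = {
  '$gt'   => ->(value, arg) { !value.nil? && value > arg },
  '$gte'  => ->(value, arg) { !value.nil? && value >= arg },
  '$lt'   => ->(value, arg) { !value.nil? && value < arg },
  '$lte'  => ->(value, arg) { !value.nil? && value <= arg },
  '$ne'   => ->(value, arg) { value != arg },
  '$in'   => ->(value, arg) { arg.include?(value) },
  '$nin'  => ->(value, arg) { !arg.include?(value) },
  '$all'  => ->(value, arg) { value.is_a?(Array) && (arg - value).empty? },
  '$size' => ->(value, arg) { value.is_a?(Array) && value.size == arg }
}.freeze

# A document matches when every field condition in the query holds.
def matches?(doc, query)
  query.all? do |field, condition|
    value = doc[field]
    if condition.is_a?(Hash)
      condition.all? { |op, arg| OPERATORS.fetch(op).call(value, arg) }
    else
      value == condition
    end
  end
end

# sort is a list of [field, direction] pairs: 1 ascending, -1 descending.
def find_docs(docs, query, sort: [], limit: nil, offset: 0)
  results = docs.select { |doc| matches?(doc, query) }
  unless sort.empty?
    results = results.sort do |a, b|
      sort.map { |field, direction| (a[field] <=> b[field]) * direction }
          .find { |cmp| cmp != 0 } || 0
    end
  end
  results = results.drop(offset)
  limit ? results.first(limit) : results
end

people = [
  { 'name' => 'a', 'age' => 20, 'tags' => %w[x y] },
  { 'name' => 'b', 'age' => 30, 'tags' => %w[x] },
  { 'name' => 'c', 'age' => 40, 'tags' => [] }
]
find_docs(people, { 'age' => { '$gt' => 20 } }, sort: [['age', -1]], limit: 1)
```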
36. HOW DO YOU USE IT WITH RUBY
• mongo-ruby-driver http://github.com/mongodb/mongo-ruby-driver
• activerecord adapter http://github.com/mongodb/activerecord-mongo-adapter
• mongorecord http://github.com/mongodb/mongo-activerecord-ruby
37. NUNEMAPPER
(MONGOMAPPER)
• Mongo is not MySQL
• DSL for modeling domain should also teach you Mongo
• It sounded fun
• Almost finished and sitting in private GitHub repo
38. FEATURES
• Typecasting
• Callbacks (ActiveSupport Callbacks)
• Validations (using my fork of validatable)
• Connection and database can differ per document
• Create and Update with single or multiple
• Delete and Destroy and _all counterparts
• Find: id, ids, :all, :first, :last
• Associations (incomplete)
39. EXAMPLE
class User
  include MongoMapper::Document
  key :name, String, :required => true, :length => 5..100
  key :email, String, :required => true, :index => true
  key :age, Integer, :numeric => true
  key :active, Boolean, :default => true
  one :address
  many :articles
end
class Address
  include MongoMapper::Document
  key :street, String
  key :city, String
  key :state, String, :length => 2
  key :zip, Integer, :numeric => true, :length => 5
end
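The typed keys above matter because web form submissions always arrive as strings. A minimal pure-Ruby sketch of what typecasting means in that situation follows. It is not MongoMapper's implementation; `typecast` is a hypothetical helper and `:boolean` is a stand-in for the Boolean key type.

```ruby
# Hypothetical helper, not part of MongoMapper: convert a raw form value
# to the type declared for a key.
def typecast(value, type)
  return nil if value.nil?

  if type == Integer
    Integer(value.to_s, 10) # raises ArgumentError on non-numeric input
  elsif type == String
    value.to_s
  elsif type == :boolean
    %w[true 1].include?(value.to_s.downcase)
  else
    value
  end
end

# Form params are strings; the declared key types say what to store.
params = { 'age' => '21', 'active' => 'true' }
age    = typecast(params['age'], Integer)
active = typecast(params['active'], :boolean)
```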
40. RANDOM AWESOMENESS
• Capped collections (think memcache, actually used for
replication)
• Upserts db.collection.update({'_id':1}, {'$inc':{'views':1}}, true)
• Multikeys (think tagging and full text search)
• GridFS and auto-sharding
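Two of these features are easy to picture in plain Ruby. The sketch below mimics the behavior in memory and is not Mongo's implementation: `upsert_inc` and `CappedList` are hypothetical names, and real capped collections are limited by size in bytes rather than by a document count.

```ruby
# Upsert with $inc: update the matching document, or insert one seeded
# from the selector when nothing matches, then apply the increments.
def upsert_inc(collection, selector, increments)
  doc = collection.find { |d| selector.all? { |key, value| d[key] == value } }
  if doc.nil?
    doc = selector.dup
    collection << doc
  end
  increments.each { |field, amount| doc[field] = doc.fetch(field, 0) + amount }
  doc
end

# Capped collection: keeps insertion order and silently drops the oldest
# entries once the cap is reached (simplified here to a document count).
class CappedList
  def initialize(max_docs)
    @max_docs = max_docs
    @docs = []
  end

  def insert(doc)
    @docs << doc
    @docs.shift while @docs.size > @max_docs
    self
  end

  def to_a
    @docs.dup
  end
end

pages = []
upsert_inc(pages, { '_id' => 1 }, { 'views' => 1 }) # inserts the page
upsert_inc(pages, { '_id' => 1 }, { 'views' => 1 }) # increments it
```

The upsert removes the usual "check if it exists, then insert or update" round trip, which is why it suits counters such as page views.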
Suited to high-volume, lower-value data. No transactions. Drivers exist for several languages; they all talk to the server over a native network protocol (instead of REST) for speed.
Uses BSON both for storing objects and for building queries. BSON is a binary-encoded serialization similar to JSON that allows some representations JSON does not (dates and binary data, for example).
BSON seems “blob-like” but Mongo speaks BSON and can “reach in” to documents.
Powerful query language, with a query optimizer. You can also drop down to JavaScript.
Supports master/slave replication.
Talk about Harmony and how we are abusing MySQL for key/value metadata on pages.
Talk about capped collections and how they work like memcache. Also, how they are even used for replication.
_ids can be any type; they just have to be unique within the collection.
find always returns a cursor that you can keep iterating through.
You don’t have to typecast with Mongo, as it “knows” types, but form submissions are always strings, so typecasting has its benefits.
* note required
* note length
* note numeric
* note index
multikey: automatically indexes arrays of values
ensure index on tags or _keywords
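The multikey idea can be sketched as an inverted index: every element of an array field gets its own index entry pointing back at the documents that contain it. This is an in-memory illustration only; `build_multikey_index` is a hypothetical helper, not a driver call.

```ruby
# Build an inverted index over an array-valued field: each element maps
# to the _ids of the documents whose array contains it.
def build_multikey_index(docs, field)
  index = Hash.new { |hash, key| hash[key] = [] }
  docs.each do |doc|
    Array(doc[field]).uniq.each { |value| index[value] << doc['_id'] }
  end
  index
end

articles = [
  { '_id' => 1, 'tags' => %w[ruby mongo] },
  { '_id' => 2, 'tags' => %w[mongo] }
]
tag_index = build_multikey_index(articles, 'tags')
tag_index['mongo'] # ids of every article tagged "mongo"
```

A lookup by tag then touches one index entry instead of scanning every document, which is what makes tagging and simple keyword search cheap.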