While MySQL and PostgreSQL are the usual options for database-driven web applications, developers can now consider non-relational databases as serious alternatives.
This session will present a case study of why and how we migrated from a backend built on a mix of MySQL and SQLite to MongoDB. The session will cover the following points:
- Key differences between an SQL RDBMS and Mongo,
- What made it a better fit in our case,
- Hands-on technical examples of using MongoDB from PHP5.
4. Issues with our RDBMS setup
Architecture was highly distributed, number of databases was becoming an issue
Storing similar objects with different structure
Options for scalability
Storing files
5. Many DBs
In a MySQL server (with MyISAM)...
1 database = 1 directory
1 table = more than 1 file in DB directory
The filesystem limits the number of inodes per directory, and that limit is not very high
Had a mix of MySQL and SQLite databases spread across the directory hierarchy
6. Many DBs
In a Mongo server ...
No 1:1 relation between databases and files
Stores data in a set of files, pre-allocated with increasing size
Number of files grows as needed
Using many collections within a single database allowed us to move everything into the DB server
7. A “collection”?
RDBMS model:
Database has tables which hold records
Records in a table have identical structure
Document-oriented storage:
Database has collections which hold documents
8. Obj. with differing structure
For example, events where attributes vary based on type of event
Event A: from, att1
Event B: from, att1, att2
Event C: from, att3, att4
What’s your schema for this?
9. tbl_events_A
id  from  Att1
1   Jim   1237
2   Dave  362
3   Bob   9283
tbl_events_B
id  from  Att1   Att2
1   Bill  2938   23
2   Jim   632    9
3   Hugh  12832  14
tbl_events_C
id  from  Att3     Att4
1   Bob   hello    7249
2   Bill  goodbye  23091
3   Jim   testing  2334
10. tbl_events
id type from Att1 Att2 Att3 Att4
1 A Jim 1237 NULL NULL NULL
2 A Dave 362 NULL NULL NULL
3 B Bill 2938 23 NULL NULL
4 C Bob NULL NULL hello 7249
5 A Bob 9283 NULL NULL NULL
6 C Bill NULL NULL goodbye 23091
7 B Jim 632 9 NULL NULL
8 B Hugh 12832 14 NULL NULL
9 C Jim NULL NULL testing 2334
11. tbl_events
id type from Attributes
1 A Jim "{'att1':1237}"
2 A Dave "{'att1':362}"
3 B Bill "{'att1':2938, 'att2':23}"
4 C Bob "{'att3':'hello', 'att4':7249}"
5 A Bob "{'att1':9283}"
6 C Bill "{'att3':'goodbye', 'att4':2391}"
7 B Jim "{'att1':632, 'att2':9}"
8 B Hugh "{'att1':12832, 'att2':14}"
9 C Jim "{'att3':'testing', 'att4':2334}"
12. tbl_events
id  type  from
1   A     Jim
2   A     Dave
3   B     Bill
4   C     Bob
5   A     Bob
6   C     Bill
7   B     Jim
8   B     Hugh
9   C     Jim
tbl_events_attributes
id  eventId  name  value
1   1        att1  1237
2   2        att1  362
3   3        att1  2938
4   3        att2  23
5   4        att3  hello
6   4        att4  7249
7   5        att1  9283
8   6        att3  goodbye
9   6        att4  2391
10  7        att1  632
11  7        att2  9
...
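Reading an event back from the two-table layout above means regrouping the attribute rows by eventId. A minimal sketch in plain JavaScript, using the first rows from the slide; the helper name `assembleEvents` is hypothetical, and everything runs in memory rather than against a database:

```javascript
// Rows copied from the slide (first four events only).
const events = [
  { id: 1, type: 'A', from: 'Jim' },
  { id: 2, type: 'A', from: 'Dave' },
  { id: 3, type: 'B', from: 'Bill' },
  { id: 4, type: 'C', from: 'Bob' },
];
const eventAttributes = [
  { id: 1, eventId: 1, name: 'att1', value: 1237 },
  { id: 2, eventId: 2, name: 'att1', value: 362 },
  { id: 3, eventId: 3, name: 'att1', value: 2938 },
  { id: 4, eventId: 3, name: 'att2', value: 23 },
  { id: 5, eventId: 4, name: 'att3', value: 'hello' },
  { id: 6, eventId: 4, name: 'att4', value: 7249 },
];

// Hypothetical helper: merge each event with its attribute rows.
function assembleEvents(eventRows, attributeRows) {
  const byId = new Map(eventRows.map((e) => [e.id, { ...e }]));
  for (const row of attributeRows) {
    const event = byId.get(row.eventId);
    if (event) event[row.name] = row.value;
  }
  return [...byId.values()];
}

const assembled = assembleEvents(events, eventAttributes);
console.log(assembled[2]);
```

Every read pays for this regrouping step, whether it happens in SQL or in application code.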
13. Obj. with differing structure
Document-oriented storage like Mongo is schema-less
1 collection for all events
Each document has the structure applicable for its type
Can index common attributes for queries
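As a rough illustration of the points above, the sketch below keeps events of all three types in one array standing in for a single collection and queries on the shared `from` attribute. The collection name `events` and the helper `findByExample` are assumptions made for this example; it runs in plain JavaScript without a Mongo server.

```javascript
// Stand-in for a single "events" collection (name assumed): each document
// carries only the attributes that apply to its type.
const eventsCollection = [
  { type: 'A', from: 'Jim', att1: 1237 },
  { type: 'B', from: 'Bill', att1: 2938, att2: 23 },
  { type: 'C', from: 'Bob', att3: 'hello', att4: 7249 },
  { type: 'B', from: 'Jim', att1: 632, att2: 9 },
];

// In-memory equivalent of an equality query on a common attribute.
// In the mongo shell this would be roughly: db.events.find({ from: 'Jim' })
function findByExample(collection, example) {
  return collection.filter((doc) =>
    Object.entries(example).every(([key, value]) => doc[key] === value)
  );
}

const fromJim = findByExample(eventsCollection, { from: 'Jim' });
console.log(fromJim.map((doc) => doc.type));
```

No NULL padding or attribute table is needed: a type A document simply has no att2, att3 or att4 keys.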
15. Options for scalability
MySQL - Master-slave replication
Mongo - Supports master-slave, replica pairs, master-master and ... auto-sharding
16. Storing files
In MySQL, you can use a table with a BLOB field and other fields for file metadata
Mongo has GridFS
Built for storage of large objects
Split into chunks, also stores metadata
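To make "split into chunks, also stores metadata" concrete, here is a minimal in-memory sketch in plain JavaScript (Node). It is not the GridFS driver API: the helpers `splitIntoChunks` and `joinChunks` are hypothetical, the field names are modeled on GridFS's files and chunks documents and should be treated as illustrative, and the 4-byte chunk size is chosen only to keep the example small.

```javascript
// Hypothetical helper: split a Buffer into fixed-size chunks plus one
// metadata record, loosely mirroring the files/chunks split GridFS uses.
function splitIntoChunks(fileId, filename, data, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n += 1) {
    chunks.push({ files_id: fileId, n, data: data.subarray(offset, offset + chunkSize) });
  }
  const file = { _id: fileId, filename, length: data.length, chunkSize };
  return { file, chunks };
}

// Reassemble by ordering chunks on n and concatenating their payloads.
function joinChunks(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((c) => c.data));
}

const payload = Buffer.from('0123456789');
const stored = splitIntoChunks('file-1', 'digits.txt', payload, 4); // tiny chunk size for illustration
console.log(stored.chunks.length, joinChunks(stored.chunks).toString());
```

Because each chunk is an ordinary document, a large file never has to fit in a single record, and the metadata can be queried without reading the file contents.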
19. Basic concepts
A database has collections which hold documents
Documents in a collection can have any structure
Documents are JSON objects, stored as BSON
Data types:
All basic JSON types: string, integer, boolean, double, null, array, object
Special types: date, object id, binary, regexp, code
20. Important differences
Collections instead of tables
ObjectID instead of primary keys
References instead of foreign keys
JavaScript code execution instead of stored procedures
[NULL] instead of joins
21. Inserting data
> doc = { author: 'joe',
    created : new Date('03-28-2009'),
    title : 'Yet another blog post',
    text : 'Here is the text...',
    tags : [ 'example', 'joe' ],
    comments : [
      { author: 'jim', comment: 'I disagree' },
      { author: 'nancy', comment: 'Good post' }
    ]
  }
> db.posts.insert(doc);
27. Removing data
> db.things.remove({}); // removes all
> db.things.remove({n:1}); // removes all where n == 1
> db.things.remove({_id: myobject._id});
28. References
> p = db.postings.findOne();
{
    "_id" : ObjectId("4b866f08234ae01d21d89604"),
    "author" : "jim",
    "title" : "Brewing Methods"
}
> // get more info on author
> db.users.findOne( { _id : p.author } )
{ "_id" : "jim", "email" : "jim@gmail.com" }
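The shell session above resolves the reference with a second query, and application code does the same thing in place of a SQL join. A minimal sketch in plain JavaScript, with arrays standing in for the two collections and a hypothetical helper `withAuthor`:

```javascript
// In-memory stand-ins for the users and postings collections from the slide.
const users = [
  { _id: 'jim', email: 'jim@gmail.com' },
];
const postings = [
  { _id: '4b866f08234ae01d21d89604', author: 'jim', title: 'Brewing Methods' },
];

// Hypothetical helper: resolve the author reference with a second lookup,
// which is what the application does in place of a SQL join.
function withAuthor(posting, userList) {
  const author = userList.find((u) => u._id === posting.author) || null;
  return { ...posting, author };
}

const resolved = withAuthor(postings[0], users);
console.log(resolved.author.email);
```

The helper returns `author: null` when the referenced user is missing, since nothing enforces the reference the way a foreign key constraint would.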
30. Limitations to keep in mind
Namespace limit (24 000 collections and indexes)
Database size is capped at 2GB on 32-bit systems ... use a 64-bit production system!
31. Licensing
MongoDB is GNU AGPL 3.0, supported drivers are Apache License v2.0
From www.mongodb.org/display/DOCS/Licensing :
If you are using a vanilla MongoDB server from either source or binary packages you have NO obligations. You can ignore the rest of this page.
33. SQL schema
tags
  pictureId int
  tag varchar
pictures
  pictureId int
  title varchar
  creationTimestamp int
  content blob
users
  userId int
  name varchar
comments
  pictureId int
  userId int
  txt varchar
  creationTimestamp int