SlideShare ist ein Scribd-Unternehmen logo
1 von 34
Downloaden Sie, um offline zu lesen
Migrating to MongoDB
Why we moved from MySQL to Mongo
Getting to know Mongo
Demo app using Mongo with PHP
Reasons we looked for
alternative to RDBM setup
Issues with our RDBM setup

Architecture was highly distributed, number of
databases was becoming an issue
Storing similar objects with different structure
Options for scalability
Storing files
Many DBs
In a MySQL server (with MyISAM)...
  1 database = 1 directory
  1 table = more than 1 file in DB directory
Filesystem limits number of inodes per directory and it’s
not that big
Had a mix of MySQL with SQLite databases spreaded
across directory hierarchy
Many DBs
In a Mongo server ...
  No 1:1 relation between databases and files
  Stores data set of files pre-allocated with increasing
  size
  Number of files grows as needed
Using many collections within a single database
allowed to move everything in DB server
A “collection”?

 RDBM model:
   Database has tables which hold records
   Records in a table are identical
 Document-oriented storage
   Database has collections which hold documents
Obj. with differing structure

 For example, events where attributes vary based on
 type of event
   Event A: from, att1
   Event B: from, att1, att2
   Event C: from, att3, att4
 What’s your schema for this?
tbl_events_A
      id     from          Att1

      1      Jim           1237

      2      Dave          362                  tbl_events_C
      3      Bob           9283         id   from    Att3      Att4

                                        1    Bob     hello     7249

       tbl_events_B                     2    Bill   goodbye   23091

id   from           Att1         Att2   3    Jim    testing    2334

1    Bill       2938              23

2    Jim            632           9

3    Hugh      12832              14
tbl_events
id   type   from   Att1     Att2    Att3     Att4
1     A     Jim    1237    NULL     NULL     NULL
2     A     Dave   362     NULL     NULL     NULL
3     B     Bill   2938     23      NULL     NULL
4     C     Bob    NULL    NULL     hello    7249
5     A     Bob    9283    NULL     NULL     NULL
6     C     Bill   NULL    NULL    goodbye   23091
7     B     Jim    632       9      NULL     NULL
8     B     Hugh   12832    14      NULL     NULL
9     C     Jim    NULL    NULL    testing   2334
tbl_events
id   type   from                    Attributes
1     A     Jim                  “{‘att1’:1237}”
2     A     Dave                  “{‘att1’:362}”
3     B     Bill            “{‘att1’:2938, ‘att2’:23}”
4     C     Bob           “{‘att3’:‘hello’, ‘att4’:7249}”
5     A     Bob                  “{‘att1’:9283}”
6     C     Bill        “{‘att3’:‘goodbye’, ‘att4’:2391}”
7     B     Jim              “{‘att1’:632, ‘att2’:9}”
8     B     Hugh           “{‘att1’:12832, ‘att2’:14}”
9     C     Jim          “{‘att3’:‘testing’, ‘att4’:2334}”
tbl_events               tbl_events_attributes
id     type       from   id      eventId     name        value
1       A         Jim    1         1             att1    1237
2       A         Dave   2         2             att1    362
3       B         Bill   3         3             att1    2938
4       C         Bob    4         3             att2     23
5       A         Bob    5         4             att3    hello
6       C         Bill
                         6         4             att4    7249
7       B         Jim
                         7         5             att1    9283
8       B         Hugh
                         8         6             att3   goodbye
9       C         Jim
                         9         6             att4    2391
                         10        7             att1    632
                         11        7             att2     9
                                           ...
Obj. with differing structure

 Document-oriented storage link Mongo is schema-less
   1 collection for all events
   Each document has the structure applicable for its
   type
   Can index common attributes for queries
events collection :

{id:1,   type:’A’,   from:‘Jim’, att1:1237}
{id:2,   type:’A’,   from:‘Dave’, att1:362}
{id:5,   type:’A’,   from:‘Bob’, att1:9238}
{id:3,   type:’B’,   from:‘Bill’, att1:2938, att2:23}
{id:7,   type:’B’,   from:‘Jim’, att1:632, att2:9}
{id:8,   type:’B’,   from:‘Hugh’, att1:12832, att2:14}
{id:4,   type:’C’,   from:‘Bill’, att3:‘hello’, att4:7249}
{id:6,   type:’C’,   from:‘Jim’, att3:‘goodbye’, att4:23091}
{id:9,   type:’C’,   from:‘Hugh’, att3:‘testing’, att4:2334}
Options for scalability


 MySQL - Master-slave replication
 Mongo - Support master slave, replica pairs, master
 master and ... auto-sharding
Storing files

 In MySQL, you can use a table with BLOB field and
 other field for file meta data
 Mongo has GridFS
   Built for storage of large objects
   Split into chunks, also stores metadata
> db.fs.files.findOne();
{
! "_id" : ObjectId("4b9525096b00bd59b95f791f"),
! "filename" : "user.png",
! "length" : 43717,
! "chunkSize" : 262144,
! "uploadDate" : "Mon Mar 08 2010 11:25:45 GMT-0500 (EST)",
! "md5" : "3f6fcd4c0a51655d392fe95a99c29140",
! "mimeType" : "image/png"
}
> db.fs.chunks.findOne();
{
! "_id" : ObjectId("4b952509c568bb9fc8e3cddb"),
! "files_id" : ObjectId("4b9525096b00bd59b95f791f"),
! "n" : 0,
! "data" : BinData type: 2 len: 43721
}
Getting to know MongoDB
Basic concepts
A database has collections which holds documents
Documents in a collection can have any structure
Documents are JSON objects, stored as BSON
Data types:
  all basic JSON types: string, integer, boolean,
  double, null, array, object
  Special types: date, object id, binary, regexp, code
Important differences

 Collections instead of tables
 ObjectID instead of primary keys
 References instead of foreign keys
 JavaScript code execution instead of stored
 procedures
 [NULL] instead of joins
Inserting data
> doc = { author: 'joe',
  created : new Date('03-28-2009'),
  title : 'Yet another blog post',
  text : 'Here is the text...',
  tags : [ 'example', 'joe' ],
  comments : [
    { author: 'jim', comment: 'I disagree' },
    { author: 'nancy', comment: 'Good post' }
  ]
}
> db.posts.insert(doc);
Querying data
>   db.posts.find();
>   db.posts.find({‘author’:‘joe’});
>   db.posts.find({‘comments.author’:‘nancy’});
>   db.posts.find({‘comments.comment’: /disagree/i });

> db.posts.findOne({‘comment.author’:‘nancy’});
> db.posts.find({‘comment.author’:‘nancy’}).limit(5);

> db.posts.find({},{‘author’:true, ‘tags’:true});

> db.posts.find({‘author’:‘nancy’}).sort({‘created’:1});
Querying - advanced
features
  Support of OR conditions
  $ modifiers to introduce conditions
> db.posts.find({timestamp: {$gte:1268149684}});

  $where modifiers
> db.pictures.find({$where: function() { return
(this.creationTimestamp >= 1268149684) }})

  MapReduce
  Server-side code execution
> function getUniques() {
...   var uniques = [];
...   db.pictures.find({},{tags:true}).forEach(function(pic) {
...     pic.tags.forEach(function(tag) {
...       if (uniques.indexOf(tag) == -1) uniques.push(tag);
...     });
...   });
...   return uniques;
... }
> db.eval(getUniques);
[
! "firstTag",
! "thirdTag",
! "toto",
! "test",
! "comic",
! "secondTag"
]
Updating data
update( criteria, objNew, upsert, multi )
> db.myColl.update( { name: "Joe" }, { name: "Joe", age:
20 }, true, false );


save(object) - insert or update if _id exists
Update modifier operators

  $inc, $set, $unset, $push, $pushAll, $addToSet, $pop,
  $pull, $pullAll
> db.myColl.update({name:"Joe"}, { $set:{age:20}});

> db.posts.update({author:”Joe”},{$push:{tags:‘hockey’}});

> db.posts.update({},{$addToSet:{tags:‘hockey’}});
Removing data
> db.things.remove({});    // removes all
> db.things.remove({n:1}); // removes all where n == 1
> db.things.remove({_id: myobject._id});
References
>   p = db.postings.findOne();
{
!    "_id" : ObjectId("4b866f08234ae01d21d89604"),
!    "author" : "jim",
!    "title" : "Brewing Methods"
}
>   // get more info on author
>   db.users.findOne( { _id : p.author } )
{   "_id" : "jim", "email" : "jim@gmail.com" }
>   x = { name : 'Biology' }
{   "name" : "Biology" }
>   db.courses.save(x)
>   x
{   "name" : "Biology", "_id" : ObjectId("4b0552b0f0da7d1eb6f126a1") }

> stu = { name : 'Joe', classes : [ new DBRef('courses', x._id) ] }
> db.students.save(stu)
> stu
{
        "name" : "Joe",
        "classes" : [
                 {
                        "$ref" : "courses",
                        "$id" : ObjectId("4b0552b0f0da7d1eb6f126a1")
                 }
        ],
        "_id" : ObjectId("4b0552e4f0da7d1eb6f126a2")
}
> stu.classes[0]
{ "$ref" : "courses", "$id" : ObjectId("4b0552b0f0da7d1eb6f126a1") }

> stu.classes[0].fetch()
{ "_id" : ObjectId("4b0552b0f0da7d1eb6f126a1"), "name" : "Biology" }
Limitations to keep in mind


 Namespace limit (24 000 collections and indexes)
 Database size maxed to 2GB on 32-bit systems ... use
 a 64-bit production system!
Licensing

   MongoDB is GNU AGPL 3.0, supported drivers re
   Apache License v2.0
   From www.mongodb.org/display/DOCS/Licensing :
If you are using a vanilla MongoDB server from either source or binary packages you
have NO obligations. You can ignore the rest of this page.
Hands-on example
SQL schema
                                                               tags
            pictures
                                                   pictureId          int
pictureId           int
                                                   tag                varchar
title               varchar

creationTimestamp   int
content             blob




             users
userId              int                   comments
name                varchar   pictureId           int

                              userId              int
                              txt                 varchar

                              creationTimestamp   int
let’s see some code ...

Weitere ähnliche Inhalte

Andere mochten auch

Continuous Deployment
Continuous DeploymentContinuous Deployment
Continuous DeploymentBrian Moon
 
Memcached vs redis
Memcached vs redisMemcached vs redis
Memcached vs redisqianshi
 
Why Memcached?
Why Memcached?Why Memcached?
Why Memcached?Gear6
 
MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for BeginnersEnoch Joshua
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDBDATAVERSITY
 
Microservices Platforms - Which is Best?
Microservices Platforms - Which is Best?Microservices Platforms - Which is Best?
Microservices Platforms - Which is Best?Payara
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDBAlex Sharp
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRavi Teja
 

Andere mochten auch (13)

Continuous Deployment
Continuous DeploymentContinuous Deployment
Continuous Deployment
 
Memcached vs redis
Memcached vs redisMemcached vs redis
Memcached vs redis
 
Why Memcached?
Why Memcached?Why Memcached?
Why Memcached?
 
Mongo db basics
Mongo db basicsMongo db basics
Mongo db basics
 
Mongo DB
Mongo DB Mongo DB
Mongo DB
 
MongoDB for Beginners
MongoDB for BeginnersMongoDB for Beginners
MongoDB for Beginners
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDB
 
Mongo db
Mongo dbMongo db
Mongo db
 
Microservices Platforms - Which is Best?
Microservices Platforms - Which is Best?Microservices Platforms - Which is Best?
Microservices Platforms - Which is Best?
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 

Ähnlich wie ConFoo - Migrating To Mongo Db

Windows Azure Storage
Windows Azure StorageWindows Azure Storage
Windows Azure Storagegoodfriday
 
San Francisco Java User Group
San Francisco Java User GroupSan Francisco Java User Group
San Francisco Java User Groupkchodorow
 
MongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingMongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingBoxed Ice
 
MongoDB - Monitoring & queueing
MongoDB - Monitoring & queueingMongoDB - Monitoring & queueing
MongoDB - Monitoring & queueingBoxed Ice
 
Understanding Git - GOTO London 2015
Understanding Git - GOTO London 2015Understanding Git - GOTO London 2015
Understanding Git - GOTO London 2015Steve Smith
 

Ähnlich wie ConFoo - Migrating To Mongo Db (7)

Windows Azure Storage
Windows Azure StorageWindows Azure Storage
Windows Azure Storage
 
San Francisco Java User Group
San Francisco Java User GroupSan Francisco Java User Group
San Francisco Java User Group
 
Tricks
TricksTricks
Tricks
 
MongoDB - Monitoring and queueing
MongoDB - Monitoring and queueingMongoDB - Monitoring and queueing
MongoDB - Monitoring and queueing
 
MongoDB - Monitoring & queueing
MongoDB - Monitoring & queueingMongoDB - Monitoring & queueing
MongoDB - Monitoring & queueing
 
Understanding Git - GOTO London 2015
Understanding Git - GOTO London 2015Understanding Git - GOTO London 2015
Understanding Git - GOTO London 2015
 
Git as NoSQL
Git as NoSQLGit as NoSQL
Git as NoSQL
 

Kürzlich hochgeladen

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Kürzlich hochgeladen (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

ConFoo - Migrating To Mongo Db

  • 1. Migrating to MongoDB Why we moved from MySQL to Mongo Getting to know Mongo Demo app using Mongo with PHP
  • 2.
  • 3. Reasons we looked for alternative to RDBM setup
  • 4. Issues with our RDBM setup Architecture was highly distributed, number of databases was becoming an issue Storing similar objects with different structure Options for scalability Storing files
  • 5. Many DBs In a MySQL server (with MyISAM)... 1 database = 1 directory 1 table = more than 1 file in DB directory Filesystem limits number of inodes per directory and it’s not that big Had a mix of MySQL with SQLite databases spreaded across directory hierarchy
  • 6. Many DBs In a Mongo server ... No 1:1 relation between databases and files Stores data set of files pre-allocated with increasing size Number of files grows as needed Using many collections within a single database allowed to move everything in DB server
  • 7. A “collection”? RDBM model: Database has tables which hold records Records in a table are identical Document-oriented storage Database has collections which hold documents
  • 8. Obj. with differing structure For example, events where attributes vary based on type of event Event A: from, att1 Event B: from, att1, att2 Event C: from, att3, att4 What’s your schema for this?
  • 9. tbl_events_A id from Att1 1 Jim 1237 2 Dave 362 tbl_events_C 3 Bob 9283 id from Att3 Att4 1 Bob hello 7249 tbl_events_B 2 Bill goodbye 23091 id from Att1 Att2 3 Jim testing 2334 1 Bill 2938 23 2 Jim 632 9 3 Hugh 12832 14
  • 10. tbl_events id type from Att1 Att2 Att3 Att4 1 A Jim 1237 NULL NULL NULL 2 A Dave 362 NULL NULL NULL 3 B Bill 2938 23 NULL NULL 4 C Bob NULL NULL hello 7249 5 A Bob 9283 NULL NULL NULL 6 C Bill NULL NULL goodbye 23091 7 B Jim 632 9 NULL NULL 8 B Hugh 12832 14 NULL NULL 9 C Jim NULL NULL testing 2334
  • 11. tbl_events id type from Attributes 1 A Jim “{‘att1’:1237}” 2 A Dave “{‘att1’:362}” 3 B Bill “{‘att1’:2938, ‘att2’:23}” 4 C Bob “{‘att3’:‘hello’, ‘att4’:7249}” 5 A Bob “{‘att1’:9283}” 6 C Bill “{‘att3’:‘goodbye’, ‘att4’:2391}” 7 B Jim “{‘att1’:632, ‘att2’:9}” 8 B Hugh “{‘att1’:12832, ‘att2’:14}” 9 C Jim “{‘att3’:‘testing’, ‘att4’:2334}”
  • 12. tbl_events tbl_events_attributes id type from id eventId name value 1 A Jim 1 1 att1 1237 2 A Dave 2 2 att1 362 3 B Bill 3 3 att1 2938 4 C Bob 4 3 att2 23 5 A Bob 5 4 att3 hello 6 C Bill 6 4 att4 7249 7 B Jim 7 5 att1 9283 8 B Hugh 8 6 att3 goodbye 9 C Jim 9 6 att4 2391 10 7 att1 632 11 7 att2 9 ...
  • 13. Obj. with differing structure Document-oriented storage link Mongo is schema-less 1 collection for all events Each document has the structure applicable for its type Can index common attributes for queries
  • 14. events collection : {id:1, type:’A’, from:‘Jim’, att1:1237} {id:2, type:’A’, from:‘Dave’, att1:362} {id:5, type:’A’, from:‘Bob’, att1:9238} {id:3, type:’B’, from:‘Bill’, att1:2938, att2:23} {id:7, type:’B’, from:‘Jim’, att1:632, att2:9} {id:8, type:’B’, from:‘Hugh’, att1:12832, att2:14} {id:4, type:’C’, from:‘Bill’, att3:‘hello’, att4:7249} {id:6, type:’C’, from:‘Jim’, att3:‘goodbye’, att4:23091} {id:9, type:’C’, from:‘Hugh’, att3:‘testing’, att4:2334}
  • 15. Options for scalability MySQL - Master-slave replication Mongo - Support master slave, replica pairs, master master and ... auto-sharding
  • 16. Storing files In MySQL, you can use a table with BLOB field and other field for file meta data Mongo has GridFS Built for storage of large objects Split into chunks, also stores metadata
  • 17. > db.fs.files.findOne(); { ! "_id" : ObjectId("4b9525096b00bd59b95f791f"), ! "filename" : "user.png", ! "length" : 43717, ! "chunkSize" : 262144, ! "uploadDate" : "Mon Mar 08 2010 11:25:45 GMT-0500 (EST)", ! "md5" : "3f6fcd4c0a51655d392fe95a99c29140", ! "mimeType" : "image/png" } > db.fs.chunks.findOne(); { ! "_id" : ObjectId("4b952509c568bb9fc8e3cddb"), ! "files_id" : ObjectId("4b9525096b00bd59b95f791f"), ! "n" : 0, ! "data" : BinData type: 2 len: 43721 }
  • 18. Getting to know MongoDB
  • 19. Basic concepts A database has collections which holds documents Documents in a collection can have any structure Documents are JSON objects, stored as BSON Data types: all basic JSON types: string, integer, boolean, double, null, array, object Special types: date, object id, binary, regexp, code
  • 20. Important differences Collections instead of tables ObjectID instead of primary keys References instead of foreign keys JavaScript code execution instead of stored procedures [NULL] instead of joins
  • 21. Inserting data > doc = { author: 'joe', created : new Date('03-28-2009'), title : 'Yet another blog post', text : 'Here is the text...', tags : [ 'example', 'joe' ], comments : [ { author: 'jim', comment: 'I disagree' }, { author: 'nancy', comment: 'Good post' } ] } > db.posts.insert(doc);
  • 22. Querying data > db.posts.find(); > db.posts.find({‘author’:‘joe’}); > db.posts.find({‘comments.author’:‘nancy’}); > db.posts.find({‘comments.comment’: /disagree/i }); > db.posts.findOne({‘comment.author’:‘nancy’}); > db.posts.find({‘comment.author’:‘nancy’}).limit(5); > db.posts.find({},{‘author’:true, ‘tags’:true}); > db.posts.find({‘author’:‘nancy’}).sort({‘created’:1});
  • 23. Querying - advanced features Support of OR conditions $ modifiers to introduce conditions > db.posts.find({timestamp: {$gte:1268149684}}); $where modifiers > db.pictures.find({$where: function() { return (this.creationTimestamp >= 1268149684) }}) MapReduce Server-side code execution
  • 24. > function getUniques() { ... var uniques = []; ... db.pictures.find({},{tags:true}).forEach(function(pic) { ... pic.tags.forEach(function(tag) { ... if (uniques.indexOf(tag) == -1) uniques.push(tag); ... }); ... }); ... return uniques; ... } > db.eval(getUniques); [ ! "firstTag", ! "thirdTag", ! "toto", ! "test", ! "comic", ! "secondTag" ]
  • 25. Updating data update( criteria, objNew, upsert, multi ) > db.myColl.update( { name: "Joe" }, { name: "Joe", age: 20 }, true, false ); save(object) - insert or update if _id exists
  • 26. Update modifier operators $inc, $set, $unset, $push, $pushAll, $addToSet, $pop, $pull, $pullAll > db.myColl.update({name:"Joe"}, { $set:{age:20}}); > db.posts.update({author:”Joe”},{$push:{tags:‘hockey’}}); > db.posts.update({},{$addToSet:{tags:‘hockey’}});
  • 27. Removing data > db.things.remove({}); // removes all > db.things.remove({n:1}); // removes all where n == 1 > db.things.remove({_id: myobject._id});
  • 28. References > p = db.postings.findOne(); { ! "_id" : ObjectId("4b866f08234ae01d21d89604"), ! "author" : "jim", ! "title" : "Brewing Methods" } > // get more info on author > db.users.findOne( { _id : p.author } ) { "_id" : "jim", "email" : "jim@gmail.com" }
  • 29. > x = { name : 'Biology' } { "name" : "Biology" } > db.courses.save(x) > x { "name" : "Biology", "_id" : ObjectId("4b0552b0f0da7d1eb6f126a1") } > stu = { name : 'Joe', classes : [ new DBRef('courses', x._id) ] } > db.students.save(stu) > stu { "name" : "Joe", "classes" : [ { "$ref" : "courses", "$id" : ObjectId("4b0552b0f0da7d1eb6f126a1") } ], "_id" : ObjectId("4b0552e4f0da7d1eb6f126a2") } > stu.classes[0] { "$ref" : "courses", "$id" : ObjectId("4b0552b0f0da7d1eb6f126a1") } > stu.classes[0].fetch() { "_id" : ObjectId("4b0552b0f0da7d1eb6f126a1"), "name" : "Biology" }
  • 30. Limitations to keep in mind Namespace limit (24 000 collections and indexes) Database size maxed to 2GB on 32-bit systems ... use a 64-bit production system!
  • 31. Licensing MongoDB is GNU AGPL 3.0, supported drivers re Apache License v2.0 From www.mongodb.org/display/DOCS/Licensing : If you are using a vanilla MongoDB server from either source or binary packages you have NO obligations. You can ignore the rest of this page.
  • 33. SQL schema tags pictures pictureId int pictureId int tag varchar title varchar creationTimestamp int content blob users userId int comments name varchar pictureId int userId int txt varchar creationTimestamp int
  • 34. let’s see some code ...