Amazon Elastic MapReduce
Takahiro Kamatani
gumi, Inc.

2010/09/26


Sunday, September 26, 2010
• Amazon Elastic MapReduce


• Twitter: @buhii
• gumi @ http://www.kansei.tsukuba.ac.jp/~uchiyamalab/beacon
• beacon
• gumi @ynil
gumi
• mixi, , GREE
• Python, Django
• Amazon Web Services (EC2 + RDS)
• DB



• PV, UU
• DAU: Daily Active Users
•
     • ÷ DAU
• ARPU: Average Revenue Per User
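The metrics on this slide can be illustrated with a small sketch. It assumes that activity is available as (user ID, date) pairs and that ARPU is total revenue divided by the number of users in the same period; the sample events, the revenue figure, and the helper names (`daily_active_users`, `per_dau`, `arpu`) are all made up for illustration and do not come from the slides. `per_dau` stands in for the "÷ DAU" ratio, whose numerator is not recoverable from the transcript.

```python
from collections import defaultdict


def daily_active_users(events):
    """Return {date: number of distinct user IDs seen on that date}."""
    users_by_day = defaultdict(set)
    for user_id, day in events:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}


def per_dau(total, dau):
    """Divide any daily total by DAU (hypothetical helper for a '÷ DAU' metric)."""
    return total / dau if dau else 0.0


def arpu(total_revenue, user_count):
    """Average Revenue Per User; assumes both figures cover the same period."""
    return total_revenue / user_count if user_count else 0.0


# Made-up sample data: user "u1" is active twice on the same day.
events = [
    ("u1", "2010-08-20"),
    ("u2", "2010-08-20"),
    ("u1", "2010-08-20"),
    ("u1", "2010-08-21"),
]
dau = daily_active_users(events)
```

Counting distinct IDs per day with a set keeps repeat visits from inflating DAU, which is the difference between a UU-style count and a PV-style count.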


Amazon Web Services (AWS)




Amazon Elastic MapReduce




MapReduce
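• Mapper: emits (key, value) pairs
• Sort / Shuffle: groups the Mapper output by key and passes it to the Reducer
• Reducer: processes the (key, value) data for each key
• Mapper, Reducer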
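The Mapper, Sort / Shuffle, Reducer flow on this slide can be imitated in a single process. The following is a minimal sketch, not how Hadoop itself is implemented: `run_map_reduce`, `word_mapper`, and `sum_reducer` are hypothetical names, and word counting is used only because it is the smallest familiar example.

```python
from itertools import groupby
from operator import itemgetter


def run_map_reduce(records, mapper, reducer):
    """Run mapper, sort/shuffle, and reducer in memory and return the results."""
    # Map phase: every input record may emit any number of (key, value) pairs.
    mapped = [pair for record in records for pair in mapper(record)]
    # Sort / Shuffle phase: bring pairs with the same key next to each other.
    mapped.sort(key=itemgetter(0))
    results = []
    # Reduce phase: the reducer sees one key together with all of its values.
    for key, group in groupby(mapped, key=itemgetter(0)):
        values = [value for _, value in group]
        results.extend(reducer(key, values))
    return results


def word_mapper(line):
    """Emit (word, 1) for each whitespace-separated word."""
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(key, values):
    """Emit the key with the sum of its values."""
    yield key, sum(values)


counts = run_map_reduce(["map reduce", "Map shuffle reduce"], word_mapper, sum_reducer)
```

Only the two small functions change from task to task; the grouping step in the middle stays the same.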


Amazon Elastic MapReduce
       • Hadoop
       • Hadoop Streaming: the Mapper and Reducer can be written in Ruby, Perl, Python, PHP, R, Bash, C++
       • Jobs run on EC2

Example Task
• Mapper
    • Input: Apache log
    • Emits the ID as the key, with a value, to the Reducer
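A Hadoop Streaming mapper for this step might look like the sketch below. It rests on several assumptions that the slides do not confirm: the log is in Apache common/combined format, the user ID appears as a `user_id` query parameter (a hypothetical name), and the emitted value is the access date as YYYY-MM-DD. Streaming mappers read lines from standard input and write tab-separated key/value lines to standard output.

```python
import re
import sys

MONTHS = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04", "May": "05", "Jun": "06",
    "Jul": "07", "Aug": "08", "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}

# Matches e.g. [20/Aug/2010:12:34:56 +0900] "GET /path?query HTTP/1.1"
LOG_PATTERN = re.compile(
    r'\[(?P<day>\d{2})/(?P<month>[A-Za-z]{3})/(?P<year>\d{4}):[^\]]*\]\s+'
    r'"[A-Z]+ (?P<path>\S+)'
)
# Assumption: the user ID is carried in a query parameter named user_id.
USER_ID_PATTERN = re.compile(r"[?&]user_id=(\d+)")


def map_line(line):
    """Return (user_id, 'YYYY-MM-DD') for one log line, or None if it does not match."""
    log_match = LOG_PATTERN.search(line)
    if log_match is None:
        return None
    id_match = USER_ID_PATTERN.search(log_match.group("path"))
    month = MONTHS.get(log_match.group("month"))
    if id_match is None or month is None:
        return None
    date = f"{log_match.group('year')}-{month}-{log_match.group('day')}"
    return id_match.group(1), date


def run(stdin=sys.stdin, stdout=sys.stdout):
    """Streaming entry point: one 'key<TAB>value' line per matching log line."""
    for line in stdin:
        pair = map_line(line)
        if pair is not None:
            stdout.write(f"{pair[0]}\t{pair[1]}\n")
```

When deployed as a script, `run()` would be called at the bottom of the file. Lines that do not match are skipped rather than raising, so one malformed log entry does not fail the whole task.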



• Reducer
    • After sort/shuffle, the Reducer receives the records grouped by ID
    • Aggregates per ID




Reducer
          31758623           2010-08-20
          42346572           2010-09-05,2010-09-06
          31977736           2010-08-11,2010-08-12,2010-08-13,2010-08-14
          14007991           2010-08-16
          35995849           2010-08-12,2010-08-13,2010-08-14
          34246688           2010-08-21,2010-08-22,2010-08-23,2010-08-27
          ...
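Output in the shape listed above (an ID followed by a comma-separated list of dates) could be produced by a streaming reducer along these lines. This is a sketch under assumptions: the input lines are `ID<TAB>date`, they arrive already sorted by ID (which the sort/shuffle step provides), and repeated dates for one ID are collapsed. The function names are hypothetical.

```python
import sys
from itertools import groupby


def parse(line):
    """Split one 'key<TAB>value' line into a (key, value) tuple."""
    key, _, value = line.rstrip("\n").partition("\t")
    return key, value


def reduce_lines(lines):
    """Yield 'ID<TAB>date1,date2,...' for each run of lines sharing an ID."""
    pairs = (parse(line) for line in lines if line.strip())
    # groupby relies on equal keys being adjacent, i.e. on sorted input.
    for user_id, group in groupby(pairs, key=lambda pair: pair[0]):
        dates = sorted({value for _, value in group})
        yield f"{user_id}\t{','.join(dates)}"


def run(stdin=sys.stdin, stdout=sys.stdout):
    """Streaming entry point: read sorted mapper output, write one line per ID."""
    for output_line in reduce_lines(stdin):
        stdout.write(output_line + "\n")
```

Because both mapper and reducer only read and write text streams, the pair can be tried outside Hadoop by piping a log file through the mapper, `sort`, and the reducer.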




                                              PC

Amazon Elastic MapReduce

            • AWS
            • Upload the Mapper and Reducer to S3
                   → s3cmd, S3Fox Organizer, Cyberduck
            • Job OK
Streaming




{Input, Output} Location, Mapper, Reducer: S3

gzip: Hadoop Extra Args
    -jobconf stream.recordreader.compression=gzip

input Location: Extra Args
    -input s3n://(bucket     )/(                )/access_log.*
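The two Extra Args settings above can be assembled programmatically. The sketch below only concatenates the flags quoted on the slide; `build_extra_args` is a hypothetical helper, and the `s3n://example-bucket/logs/access_log.*` location is a made-up placeholder, not the bucket or path used in the talk.

```python
def build_extra_args(gzip_input=False, extra_inputs=()):
    """Return the Extra Args for a streaming job as a list of tokens."""
    args = []
    if gzip_input:
        # Flag quoted on the slide for gzip-compressed input.
        args += ["-jobconf", "stream.recordreader.compression=gzip"]
    for location in extra_inputs:
        # One -input flag per additional input location.
        args += ["-input", location]
    return args


# Hypothetical location; substitute the real bucket and prefix.
EXAMPLE_INPUT = "s3n://example-bucket/logs/access_log.*"
extra_args = " ".join(build_extra_args(gzip_input=True, extra_inputs=[EXAMPLE_INPUT]))
```

Keeping the arguments as a list until the last moment makes it easy to add or drop an input location without editing one long string by hand.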




Debug




• Hadoop
           • MapReduce

Hadoop

            • S3 gzip
            • Hadoop EC2
            • (20 ...)



@ynil
                  MapReduce




               http://nlpyutori.g.hatena.ne.jp/yaruki_nil/20100911/1284089305

MapReduce
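MapReduce is a framework introduced by Google. It takes its name from the Map and Reduce functions, although the way the framework uses them differs from the original Map and Reduce. MapReduce libraries exist for languages such as C++, Java, and Python.

Source: Wikipedia "MapReduce"
http://ja.wikipedia.org/wiki/MapReduce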




cron
         •
                PV, UU                NFS   CSV

         •                            DB
                  →          DB


         •                   PV, UU



A Case Study of Using Amazon Elastic MapReduce in Social Apps