INTRODUCTION TO BIG DATA
Table of Contents
Topics Covered
• What is BIG DATA?
• Characteristics of Big Data
• What is BIG DATA Analysis?
• Traditional vs. Current Analytics Trends
• BIG Data using Hadoop!
• Hadoop History
• Hadoop – High Level Architecture
• Hadoop Variants
• Hadoop Skills
• NOSQL Introduction
• Big Data – Case Studies
2 | Oh! Session - Introduction to Big Data
What is BIG DATA?
Big Data, simply put, is data which is very BIG!
Big data is a new, “ginormous” and scary term. Very, very scary.
No, wait. It is not.
Big data is a term for data sets that are so large or complex that
traditional data processing applications are inadequate.
Examples of Big Data:
SOCIAL MEDIA ACTIVITY – like Facebook, Twitter, LinkedIn, etc.
FINANCIAL TRANSACTIONS – Internet Banking logs, Share Market, etc.
LOCATION TRACKING – Global Positioning System data, etc.
WEB BEHAVIOUR – Internet browsing, Google searches, etc.
Characteristics of BIG DATA
Big data can be described by the following characteristics:
• Volume – the quantity of generated and stored data. Size determines whether data counts as big data.
• Variety – the type and nature of the data.
• Velocity – the speed at which data is generated.
• Variability – inconsistency of the data set.
• Veracity – the quality of captured data, which can vary greatly and affects the accuracy of analysis.
What is BIG DATA ANALYSIS?
Big data analytics is the process of examining large data sets containing a variety of data
types (i.e. Big Data) to uncover hidden patterns, unknown correlations, market trends, customer
preferences and other useful business information.
Benefits of Big Data Analytics
Analytical findings from Big Data can lead to:
• more effective marketing
• new revenue opportunities
• better customer service
• improved operational efficiency
• competitive advantages over rival organizations
• other business benefits
Traditional vs. Current Analytics Trends
Data processing and Analytics: The old way
Traditionally, data processing for analytics followed the creation of modest amounts of structured data via enterprise applications (CRM, ERP, etc.).
The modeled and cleansed data was loaded into an enterprise data warehouse.
The complexity of the data analyzed was limited to relational data, so Teradata, Exadata and Netezza were running the show.
Data processing and Analytics: The new way
Currently, data is growing exponentially, and its variety has grown from text and relational (i.e. structured) data to a mix of structured, semi-structured and unstructured data.
The analytical toolset had to change to handle the unstructured part of the data, which is why technologies like Hadoop, Spark and NoSQL have become popular. They have reduced cost by providing open source systems and resilience through parallel processing.
BIG Data using Hadoop!
Why Hadoop?
The best-known technology for managing structured and unstructured data is Hadoop, an open source, Java-based framework.
It is flexible, scalable, robust, cost-effective and adaptable to upcoming technologies.
Hadoop in Action:
Hadoop is a great framework for advertising companies as well. It keeps track of the millions of
clicks on ads and of how users respond to the ads posted by the big ad agencies.
• Facebook – over 1.3 billion active users; storing, managing and keeping track of all profiles along with the related posts, comments, images, videos, and so on.
• LinkedIn – managing over 1 billion personalized recommendations per week using MapReduce and HDFS.
• Walmart – handling more than 1 million customer transactions per hour.
• Twitter – handling 85 million tweets from users per day.
• Google – managing more than 1 terabyte of data per hour.
• eBay – managing 80 terabytes of data per day and suggesting additional suitable products to its customers.
• Spadac.com – running spatial intelligence and predictive analytics on huge volumes of data to provide actionable intelligence to its customers.
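The MapReduce idea mentioned above can be illustrated with a small sketch. It is plain Python running in one process, not Hadoop itself, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `count_clicks`) and the sample click log are assumptions made up for this illustration. It counts clicks per ad, in the spirit of the advertising example: a map step emits (ad, 1) pairs, a shuffle step groups them by ad, and a reduce step sums each group.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit an (ad_id, 1) pair for every click recorded in one log line."""
    return [(ad_id.lower(), 1) for ad_id in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def count_clicks(lines):
    """Run map, shuffle and reduce in sequence over an iterable of log lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical click log: each line lists the ads clicked in one session.
    click_log = ["ad1 ad2 ad1", "ad3 ad1"]
    print(count_clicks(click_log))  # {'ad1': 3, 'ad2': 1, 'ad3': 1}
```

In a real cluster the map and reduce steps would run in parallel on many nodes over data stored in HDFS; this sketch only shows the shape of the computation.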
Hadoop History!!
Brief Historical Timeline of Hadoop
Hadoop – High Level Architecture
Hadoop Variants
Major variants of Hadoop and their distribution
1. Cloudera Hadoop (CDH)
2. Hortonworks
3. MapR
Hadoop Skills
Big Data – Case Studies
1. 2012 US Presidential Election
• Barack Obama's Big Data won the US election
2. Data Storage
• NetApp
3. Human Sciences
• NextBio
MONGODB
What is MONGODB?
Data in this model is stored inside documents.
Documents are not typically forced to have a schema and are therefore flexible and easy to change.
No joins are required.
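A rough sketch of the document model may help here. It uses plain Python dictionaries as stand-ins for documents and does not call MongoDB or any driver; the `customers` data, the field names other than `_id`, and the `total_items` helper are hypothetical. It shows two documents in one collection with different fields, related data embedded inside a document so that no join is needed, and a new field added to one document without touching the other.

```python
import json

# Hypothetical documents in one collection. The fields differ per document
# because no schema is enforced.
customers = [
    {"_id": 1, "name": "Asha", "orders": [{"item": "book", "qty": 2}]},
    {"_id": 2, "name": "Ravi", "email": "ravi@example.com", "orders": []},
]


def total_items(doc):
    """Sum order quantities straight from the embedded list (no join needed)."""
    return sum(order["qty"] for order in doc.get("orders", []))


# Adding a new field to one document does not require changing the others.
customers[0]["loyalty_tier"] = "gold"

for doc in customers:
    print(json.dumps(doc), "->", total_items(doc))
```

In a relational design the orders would normally live in a separate table and be joined to the customer at query time; embedding them in the document is what makes the join unnecessary.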
MONGODB
Use of HADOOP with MONGODB
MONGODB
Similarities with HADOOP
MONGODB
• Replication possible
• Horizontally scalable
• Master-slave concept
• Can run on commodity hardware
HADOOP
• Replication possible
• Horizontally scalable
• Master-slave concept
• Can run on commodity hardware
MONGODB
Differences with HADOOP
MONGODB
• Data is stored in a database
• Data is serialized
• Data can be written at any time
HADOOP
• Data is stored in a file system
• Data parallelism
• Data is written once
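The last difference, writing once versus writing at any time, can be sketched with two toy classes. Both `WriteOnceStore` and `DocumentStore` are hypothetical names invented for this illustration; neither is the real Hadoop or MongoDB API, and everything is kept in memory. The first class refuses to overwrite an existing path, while the second lets a record be replaced whenever needed.

```python
class WriteOnceStore:
    """Toy file-system-like store: each path can be written only once."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        if path in self._files:
            raise FileExistsError(f"{path} already written; create a new file instead")
        self._files[path] = data

    def read(self, path):
        return self._files[path]


class DocumentStore:
    """Toy database-like store: a record can be replaced at any time."""

    def __init__(self):
        self._docs = {}

    def upsert(self, key, doc):
        self._docs[key] = doc

    def get(self, key):
        return self._docs[key]


if __name__ == "__main__":
    fs = WriteOnceStore()
    fs.write("/logs/day1", "first version")
    try:
        fs.write("/logs/day1", "second version")
    except FileExistsError as err:
        print("write-once store refused:", err)

    db = DocumentStore()
    db.upsert(1, {"status": "new"})
    db.upsert(1, {"status": "updated"})  # an in-place update is allowed
    print(db.get(1))  # {'status': 'updated'}
```

A write-once model suits large batch jobs that scan whole files, whereas in-place updates suit applications that change individual records frequently.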
Thank You
Feel free to send your queries to:
Benoy Daniel Benoy.daniel@axa-tech.com
Bibhusisa Pattanaik Bibhusisa.Pattanaik@axa-tech.com

Introduction to Big Data Analytics

  • 2.  What is BIG DATA?  Characteristics of Big Data  What is BIG DATA Analysis?  Traditional vs. Current Analytics Trends  BIG Data using Hadoop!  Hadoop History  Hadoop – High Level Architecture  Hadoop Variants  Hadoop Skills  NOSQL Introduction  Big Data – Case Studies Topics Covered Table of Contents 2 | Oh! Session - Introduction to Big Data
  • 3. What is BIG DATA? Big Data, simply put, is data which is very BIG! 3 | Oh! Session - Introduction to Big Data Big data is new and “ginormous” & scary – very, very scary term. No, wait. It is not. Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate. Examples of Big Data: SOCIAL MEDIA ACTIVITY – like Facebook, Twitter, LinkedIn, etc. FINANCIAL TRANSACTIONS – Internet Banking logs, Share Market, etc. LOCATION TRACKING – Global Positioning System data, etc. WEB BEHAVIOUR – Internet browsing, Google searches, etc.
  • 4. Characteristics of BIG DATA? Big data can be described by the following characteristics: 4 | Oh! Session - Introduction to Big Data  Volume  The Quantity of generated & stored data. Size determines big data.  Variety  The Type And Nature of the data.  Velocity  The Speed of data generation.  Variability  Inconsistency of the data set  Veracity  The Quality of captured data can vary greatly, affecting accurate analysis.
  • 5. What is BIG DATA ANALYSIS? 5 | Oh! Session - Introduction to Big Data Big data analytics is the process of examining large data sets containing a variety of data types i.e. Big Data – to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. Benefits of Big Data Analytics The analytical findings done on the Big Data can lead to: •more effective marketing •new revenue opportunities •better customer service •improved operational efficiency •competitive advantages over rival organizations •& other business benefits.
  • 6. Traditional vs. Current Analytics Trends 6 | Oh! Session - Introduction to Big Data Data processing and Analytics: The old way Traditionally, data processing analytics followed creation of modest amounts of structured data via enterprise applications (CRM, ERP, etc.) The modeled & cleansed data loaded into an enterprise data warehouse. The extent of complexity of data analyzed was limited to relational data only, thus TERADATA, EXADATA & NETEZZA was running the show. Data processing and Analytics : The New way Currently, data is growing exponentially and the variety has grown from text & relational (i.e. structured) to a mix of structured, semi-structured & un-structured data. The analytical tools-set had to change for handling the un-structured part of data which is why technologies like Hadoop, SPARK, NOSQL have become famous and have reduced the cost by providing open source systems & resilience with parallel processing.
  • 7. BIG Data using Hadoop! Why Hadoop? The most well known technology, which is open source, Java-based framework helping manage structured and unstructured data is Hadoop It is Flexible, Scalable, Robust, Cost effective, adaptive to upcoming technologies. 7 | Oh! Session - Introduction to Big Data Hadoop in Action: Hadoop is a great framework for advertising companies as well. It keeps a good track of the millions of clicks on the ads and how the users are responding to the ads posted by the big Ad agencies!  •Facebook – over 1.3 billion active users – storing, managing & keeping track of all profiles along with the related posts, comments, images, videos, and so on. •LinkedIn – managing over 1 billion personalized recommendations/week using Map Reduce & HDFS features! •Walmart – Helping handle more than 1 million customer transactions/hour •Twitter – Managing and handling 85 million tweets from users/day •Google – Managing more than 1 terabyte of data/hour •eBay – handling and managing 80 terabytes of data/day and suggesting additional suitable products to their customers •Spadac.com – helps run spatial intelligence & predictive analytics on huge volumes of data for providing actionable intelligence to its customers
  • 8. Hadoop History!! Brief Historical Timeline of Hadoop
  • 9. Hadoop – High Level Architecture
  • 10. Hadoop Variants Major Hadoop variants and their distributions: 1. Cloudera Hadoop (CDH) 2. Hortonworks 3. MapR
  • 11. Hadoop Skills
  • 12. Big Data – Case Studies 1. 2012 US Presidential Election: how Barack Obama's use of Big Data helped win the US election 2. Data Storage: NetApp 3. Human Sciences: NextBio
  • 13. MONGODB What is MONGODB? Data in this model is stored inside documents. Documents are typically not forced to follow a schema, so they are flexible and easy to change. No joins are required.
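The document model described on this slide can be illustrated with a small sketch. It deliberately uses plain Python dictionaries rather than a real MongoDB driver, and the `post` and `video` documents, their field names and the `comment_authors` helper are all hypothetical, chosen only for illustration.

```python
import json

# Hypothetical blog-post document: the comments are embedded in the post itself,
# so reading a post together with its comments needs no join against another table.
post = {
    "_id": 1,
    "title": "Introduction to Big Data",
    "tags": ["hadoop", "nosql"],
    "comments": [
        {"user": "alice", "text": "Nice overview"},
        {"user": "bob", "text": "Thanks!"},
    ],
}

# A second document in the same collection may carry different fields,
# because no schema is enforced on the collection.
video = {"_id": 2, "title": "Hadoop in 5 minutes", "duration_seconds": 300}

# A plain list stands in for a collection here (an assumption of this sketch).
collection = [post, video]


def comment_authors(doc):
    """Return the users who commented on a document (empty if it has no comments)."""
    return [comment["user"] for comment in doc.get("comments", [])]


# Changing the shape of one document is just an update to that document;
# the other documents in the collection are unaffected.
post["likes"] = 42

if __name__ == "__main__":
    for doc in collection:
        print(json.dumps(doc), comment_authors(doc))
```

In a relational design the comments would usually live in a separate table and be joined to the posts at query time; here they travel with the document, which is the sense in which no joins are required.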
  • 14. MONGODB Use of HADOOP with MONGODB
  • 15. MONGODB Similarities with HADOOP MONGODB: replication possible; horizontally scalable; master-slave concept; can use commodity hardware. HADOOP: replication possible; horizontally scalable; master-slave concept; can use commodity hardware.
  • 16. MONGODB Differences with HADOOP MONGODB: data is stored in a database; data is serialized; data can be written at any time. HADOOP: data is stored in a file system; data parallelism; data is written once.
  • 17. Thank You Feel Free to drop your queries to: Benoy Daniel Benoy.daniel@axa-tech.com Bibhusisa Pattanaik Bibhusisa.Pattanaik@axa-tech.com

Editor's Notes

  1. 1. Flexible: Only about 20% of the data in organizations is said to be structured; the rest is unstructured, so it is crucial to manage the unstructured data that otherwise goes unattended. Hadoop manages different types of Big Data, whether structured or unstructured, encoded or formatted, or any other type, and makes it useful for the decision-making process. Moreover, Hadoop is simple, relevant and schema-less. Though Hadoop generally supports Java programming, other programming languages can be used to write MapReduce jobs, for example through Hadoop Streaming. Hadoop is most commonly run on Linux, and it can also work on other operating systems such as Windows, BSD and OS X.
     2. Scalable: Hadoop is a scalable platform, in the sense that new nodes can easily be added to the system as and when required, without altering the data formats, how data is loaded or how programs are written, and without modifying existing applications. Hadoop is an open source platform and runs on industry-standard hardware. Hadoop is also fault tolerant: even if a node is lost or goes out of service, the system automatically reallocates the work to another location of the data and continues processing as if nothing had happened.
     3. Robust Ecosystem: Hadoop has a robust and rich ecosystem that is well suited to the analytical needs of developers, web start-ups and other organizations. The Hadoop ecosystem consists of various related projects such as MapReduce, Hive, HBase, ZooKeeper, HCatalog and Apache Pig, which make Hadoop competent to deliver a broad spectrum of services.
     4. Hadoop is getting more "Real-Time": Did you ever wonder how to stream information into a cluster and analyze it in real time? Hadoop has an answer for it; its competencies are getting more and more real-time. Hadoop also provides a standard approach to a wide set of APIs for big data analytics, comprising MapReduce, query languages, database access, and so on.
     6. Cost Effective: On top of these features, Hadoop generates cost benefits by bringing massively parallel computing to commodity servers, resulting in a substantial reduction in the cost per terabyte of storage, which in turn makes it reasonable to model all your data. The basic idea behind Hadoop is to perform cost-effective analysis of data present across the World Wide Web.
     7. Upcoming Technologies using Hadoop: As its capabilities are reinforced, Hadoop is leading to phenomenal technical advancements. For instance, HBase will soon become a vital platform for blob stores (Binary Large Objects) and for lightweight OLTP (Online Transaction Processing). Hadoop has also begun serving as a strong foundation for new-school graph and NoSQL databases, and for better versions of relational databases.
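The notes above refer to MapReduce as the technique behind Hadoop's parallel processing. As a rough, single-process sketch of the idea (not Hadoop's actual API), the word count below separates the work into a map step, a shuffle step that groups values by key, and a reduce step. The function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence over an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "hadoop handles big data"]
    print(word_count(sample))
```

In a real cluster the map calls would run on many nodes, each close to its own block of the input, and the shuffle would move data across the network; because each map call depends only on its own input line, failed work can be rerun elsewhere, which is what makes the fault tolerance described above practical.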