What is Big Data | Hadoop Online Training


We provide the best Hadoop development and Hadoop administration online training. Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. Hadoop training, Hadoop online training, Hadoop training in Bangalore, Hadoop training in Hyderabad, best Hadoop training institutes, Hadoop online training in Chicago, Hadoop training in Mumbai, Hadoop training in Pune, Hadoop training institutes Ameerpet


What is Big Data?
• There is a humongous amount of data available that holds many
meaningful insights; it needs to be analysed.
• Existing Online Transaction Processing (OLTP) and Business
Intelligence (BI) systems do not scale easily in terms of cost, effort,
and manageability.
• It is not just the volume, but also the variety and velocity of data.
• Big data is a term for the challenges we face due to the
exponentially growing volume, variety, and velocity of data.

Shorter Time to React
• Data that enters your organization has some kind of value
for only a limited window of time.
• This window usually shuts well before the data has been
transformed and loaded into a data warehouse for deeper
analysis.
• The higher the volume of data entering your organization per
second, the bigger your challenge.

Data Economics
• Why is volume good?
– No individual record is particularly valuable
– Having every record is incredibly valuable
• Why is the storage decision important?
• How much value can I extract from every byte of data versus
the cost of saving that data?
– If value > cost, keep it online, on a DB or filer
– If cost > value, discard it or archive it on tape (discarding
data is expensive)
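The value-versus-cost rule above can be sketched in a few lines of Python. This is only an illustration: the function name, the per-byte units, and the handling of the tie case (value equal to cost, which the slide does not cover) are assumptions.

```python
def storage_decision(value_per_byte, cost_per_byte):
    """Apply the slide's rule: keep data online only when its value exceeds its cost.

    Both arguments are hypothetical estimates in the same currency unit.
    Assumption: a tie (value == cost) is treated like cost > value.
    """
    if value_per_byte > cost_per_byte:
        return "keep online (DB or filer)"
    return "discard or archive on tape"


# Hypothetical numbers, for illustration only.
print(storage_decision(value_per_byte=0.05, cost_per_byte=0.01))
print(storage_decision(value_per_byte=0.001, cost_per_byte=0.01))
```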

Data Storage
Schema                  | Structured                      | Unstructured
Storage medium          | RDBMS                           | Filers
Storage reliability     | Very reliable                   | Very reliable
Processing ability      | Very reliable                   | Unstructured schema poses challenges
Location of processing  | SQL queries pull data to server | Random means to retrieve sense
Impact of data increase | Cost increases linearly         | Cost increases linearly
Support for Big Data    | No                              | No

Big Data Approach
Big Data refers to technologies that can capture, process,
and analyze data.

NoSQL Database Types
• Key-value store
– Key can be custom or auto-generated
– Value can be a complex object such as XML, BLOB, or JSON
– Popular: DynamoDB, Azure Table Storage, Riak
• Column store
– Data is stored as families of columns; highly scalable,
with a very high-performance architecture
– Examples: HBase, Cassandra, Vertica, and Hypertable

NoSQL Database Types
• Document database
– Designed to store, retrieve, and manage document-oriented
information; expands on the key-value store
– Examples: MongoDB, CouchDB
• Graph database
– Designed for data whose relations are well
represented as graphs, usually with nodes
connected by edges
– Example: Neo4j
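To make the four NoSQL data models above more concrete, here is a minimal sketch that models them with plain Python data structures. It does not use any real database client; the record contents, key formats, and the helper names find_documents and neighbours are all hypothetical.

```python
import json

# Key-value store: the value is an opaque blob looked up only by its key.
kv_store = {"user:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Column store: each row key maps to column families, each holding columns.
column_store = {
    "user:42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"last_login": "2024-01-15"},
    }
}

# Document database: the store can see inside each document and query its fields.
documents = [
    {"_id": 42, "name": "Asha", "city": "Pune", "orders": [{"sku": "A1", "qty": 2}]},
    {"_id": 43, "name": "Ben", "city": "Mumbai", "orders": []},
]

# Graph database: nodes plus labelled edges that connect them.
nodes = {42: {"name": "Asha"}, 43: {"name": "Ben"}}
edges = [(42, "FOLLOWS", 43)]


def find_documents(docs, field, value):
    """Return the documents whose top-level field equals value."""
    return [d for d in docs if d.get(field) == value]


def neighbours(edge_list, node_id, relation):
    """Return the ids of nodes reachable from node_id over the given relation."""
    return [dst for src, rel, dst in edge_list if src == node_id and rel == relation]


print(json.loads(kv_store["user:42"])["city"])        # value must be decoded by the caller
print(column_store["user:42"]["profile"]["name"])     # read one column from one family
print(find_documents(documents, "city", "Pune"))      # query by field
print(neighbours(edges, 42, "FOLLOWS"))               # traverse an edge
```

The point of the sketch is the difference in access pattern: lookup by key only, lookup by row and column family, query by field, and traversal along edges.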

Analytical Database
• An analytical database is a type of database built to store,
manage, and consume big data.
• Optimized for advanced analytics that involve
highly complex queries on terabytes of data, as well as complex
statistical processing, data mining, and NLP (natural language
processing).
• Examples of analytical databases are Vertica (acquired by HP),
Aster Data (acquired by Teradata), Greenplum (acquired by
EMC), and so on.

Preprocess & Store
• Scenario
– Data is generated continuously and in large volumes
– It needs to be pre-processed before being loaded into target systems
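One way to picture this pattern is a small pipeline that cleans a stream of raw records and loads them in batches. The sketch below is an assumption-heavy illustration: the "sensor_id,reading" line format, the function names, and the in-memory list standing in for the target system are all hypothetical.

```python
from itertools import islice


def preprocess(raw_lines):
    """Yield cleaned records, skipping lines that cannot be parsed."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue  # malformed line: wrong number of fields
        sensor_id, reading = parts
        try:
            yield {"sensor_id": sensor_id.strip(), "reading": float(reading)}
        except ValueError:
            continue  # reading is not a number


def batches(records, size):
    """Group an iterable of records into lists of at most `size` items."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def load(target, raw_lines, batch_size=2):
    """Pre-process the raw lines and append them to the target batch by batch."""
    for batch in batches(preprocess(raw_lines), batch_size):
        target.extend(batch)  # stand-in for a bulk write to the target system
    return target


loaded = load([], ["a, 1.5", "bad line", "b,oops", "c,3"])
print(loaded)
```

Because preprocess and batches are generators, records flow through one at a time, so the raw input never has to fit in memory all at once.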

Real Time Actions
• Scenario
– Manage actions to be taken on continuously changing data in real time
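A minimal sketch of acting on changing data as it arrives: a rule that checks each incoming event against a sliding time window and signals when a limit is exceeded. The class name VelocityRule, the card identifiers, and the thresholds are hypothetical, and the sketch assumes events arrive in non-decreasing timestamp order.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Flag a key that produces more than max_events within window_seconds."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # key -> timestamps still inside the window

    def observe(self, key, timestamp):
        """Record one event and return True if the key is now over the limit."""
        recent = self._recent[key]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_events


# Hypothetical stream of (card id, time in seconds) events.
rule = VelocityRule(max_events=2, window_seconds=60)
for card, ts in [("card-1", 0), ("card-1", 10), ("card-1", 20), ("card-1", 200)]:
    if rule.observe(card, ts):
        print(f"take action: {card} exceeded the limit at t={ts}")
```

The decision is made per event, at arrival time, rather than after the data has been loaded into a warehouse.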

Sears – Competes on Big Data
• Sears holds data on over 100 million customers, which it
analyses to make real-time, relevant offers to them.
• The solution was 3 years in the making and included
programming to capture, analyze, and report on
customer activity at an individual level across all 4,000
locations.
• Sears has a 300-node Hadoop cluster that is populated
with over 2 petabytes of structured customer transaction data,
sales data, and supply chain data.
• Results: Sears achieved an active member base in the 8 digits,
exceeding the projected 36-month membership target in 17
months.

Compound Annual Growth Rate
IDC Report Analysis
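The chart for this slide is not reproduced in the transcript. As a reminder of what the metric measures, a compound annual growth rate is conventionally computed as (end value / start value)^(1 / years) - 1. The sketch below applies that standard formula; the function name and the sample figures are hypothetical and are not taken from the IDC report.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Hypothetical figures: a market growing from 100 to 121 units over 2 years.
print(f"{cagr(100, 121, 2):.1%}")
```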


2. WHAT IS BIG DATA?

3. What is Big Data?

4. Three V’s of Big Data

5. Three V’s of Big Data

7. THE CHALLENGE

8. Background

9. Shorter Time to React

10. Data Economics

11. Data Storage

12. BIG DATA’S APPROACH

13. Big Data Approach

14. NoSQL Database Types

15. NoSQL Database Types

16. Analytical Database

17. BIG DATA USE CASE PATTERNS

18. Preprocess & Store

19. Real Time Actions

20. Credit Card Issuer

21. Sears – Competes on Big Data

22. THE FUTURE OF BIG DATA

23. Compound Annual Growth Rate IDC Report Analysis

24. Careers in Big Data

25. THE END Next : Hadoop