Hadoop Trends
Hortonworks
1.
Trends and usage of Apache Hadoop
Eric Baldeschwieler, CEO, Hortonworks
Twitter: @jeric14, @hortonworks
January 2012
© Hortonworks Inc. 2011
2.
Agenda
• Define terms
– What is Hadoop? Why does Hadoop matter?
• What drives Hadoop adoption?
• Observed Trends
Architecting the Future of Big Data
3.
Hortonworks Vision
We believe that by 2015, more than half the world's data will be processed by Apache Hadoop.
How to achieve that vision? Enable ecosystem around enterprise-viable platform.
4.
What is Apache Hadoop?
• Solution for big data
– Deals with complexities of high volume, velocity & variety of data
• Set of open source projects
• Transforms commodity hardware into a service that:
– Stores petabytes of data reliably
– Allows huge distributed computations
• Key attributes:
– Redundant and reliable (no data loss)
– Extremely powerful
– Batch processing centric
– Easy to program distributed apps
– Runs on commodity hardware
One of the best examples of open source driving innovation and creating a market
5.
Hortonworks Data Platform (HDP)
Key Components of “Standard Hadoop” Open Source Stack
• Core Apache Hadoop
– HDFS (Hadoop Distributed File System)
– MapReduce (Distributed Programming Framework)
• Related Hadoop Projects
– Pig (Data Flow)
– Hive (SQL)
– HBase (Columnar NoSQL Store)
– Zookeeper (Coordination)
– HCatalog (Table & Schema Management)
• Open APIs for:
– Data Integration
– Data Movement
– App Job Management
– System Management
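The slide names MapReduce only as a "distributed programming framework". As a rough illustration of the programming model, here is a minimal single-process sketch in Python of the map, shuffle, and reduce phases applied to a word count. Every function name below is hypothetical and chosen for this example; it is not the Hadoop API, and it models none of the distribution, fault tolerance, or HDFS input handling that Hadoop provides.

```python
from collections import defaultdict

# Hypothetical single-process illustration of map -> shuffle -> reduce.
# Real Hadoop MapReduce runs these phases across a cluster; this sketch
# only shows the shape of the computation.

def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle_phase(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

def reduce_phase(grouped, reducer):
    """Apply the reducer once per key to that key's list of values."""
    return {key: reducer(key, values) for key, values in grouped.items()}

def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1

def count_reducer(key, values):
    """Sum the counts collected for one word."""
    return sum(values)

def run_word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = map_phase(lines, word_mapper)
    return reduce_phase(shuffle_phase(pairs), count_reducer)

counts = run_word_count(["Hadoop stores data", "Hadoop processes data"])
```

The application author writes only the mapper and the reducer; grouping by key sits between them, which is the part a framework can parallelize.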
6.
Big Data Trailblazers and Use Cases
[Word cloud: data analytics, analyzing web logs, advertising optimization, machine learning, mail anti-spam, text mining, web search, content optimization, customer trend analysis, ad selection, video & audio processing, data mining, user interest prediction, social media]
7.
Yahoo!, Apache Hadoop & Hortonworks
http://www.wired.com/wiredenterprise/2011/10/how-yahoo-spawned-hadoop
• 2006: Yahoo! embraced Apache Hadoop, an open source platform, to crunch epic amounts of data using an army of dirt-cheap servers
• Hadoop at Yahoo!
– 40K+ Servers
– 170PB Storage
– 5M+ Monthly Jobs
– 1000+ Active Users
• 2011: Yahoo! spun off 22+ engineers into Hortonworks, a company focused on advancing open source Apache Hadoop for the broader market
8.
What drives Hadoop adoption?
9.
Market Drivers for Apache Hadoop
• Business drivers
– High-value projects that require use of more data
– Belief that there is great ROI in mastering big data
• Financial drivers
– Growing cost of data systems as percentage of IT spend
– Cost advantage of commodity hardware + open source
– Enables departmental-level big data strategies
• Technical drivers
– Existing solutions failing under growing requirements
– 3Vs - Volume, velocity, variety
– Proliferation of unstructured data
Gartner predicts 800% data growth over next 5 years
80-90% of data produced today is unstructured
10.
Every Market has Big Data
Digital data is personal, everywhere, increasingly accessible, and will continue to grow exponentially
Source: McKinsey & Company report. Big data: The next frontier for innovation, competition, and productivity. May 2011.
11.
Broader Use Case Opportunities
• Financial Services
– Detect/prevent fraud
– Model and manage risk
– Personalize banking/insurance products
– Compliance, Archival, …
• Healthcare
– Patient monitoring
– Predictive modeling
– Compliance, Archival, text search
– Data driven research
• Retail
– Behavior analysis
– Cross selling, recommendation engines
– Optimize pricing, placement, design
– Optimize inventory and distribution
• Web / Social / Mobile
– Sentiment analysis
– Web log, image, and video analysis
– Personalization
– Billing, Reporting, Network Analysis
• Manufacturing
– Simulation, Analysis, Design
– Improve service via product sensor data
– “Digital factory” for lean manufacturing
• Government
– Detect/prevent fraud
– Security & Intelligence
– Support open data initiatives
12.
Observed Trends
13.
Trend: Agile Data
• The old way
– Operational systems keep only current records, short history
– Analytics systems keep only conformed / cleaned / digested data
– Unstructured data locked away in operational silos
– Archives offline
– Inflexible, new questions require system redesigns
• The new trend
– Keep raw data in Hadoop for a long time
– Able to produce a new analytics view on-demand
– Keep a new copy of data that was previously only in silos
– Can directly do new reports, experiments at low incremental cost
– New products / services can be added very quickly
– Agile outcome justifies new infrastructure
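The "new trend" above amounts to keeping records raw and deriving each analytics view by scanning them when a question comes up. Here is a small Python sketch of that idea, under stated assumptions: the event fields (`user`, `page`, `ms`) and the `build_view` helper are invented for illustration, and an in-memory list stands in for raw files kept in Hadoop.

```python
import json
from collections import Counter

# Hypothetical raw event log, kept exactly as it arrived (one JSON document
# per line). In the scenario on the slide this would be files in Hadoop.
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": 120}',
    '{"user": "u2", "page": "/search", "ms": 340}',
    '{"user": "u1", "page": "/search", "ms": 200}',
]

def build_view(raw_lines, key_fn, value_fn=lambda event: 1):
    """Scan the raw records and sum value_fn(event) for each key_fn(event)."""
    view = Counter()
    for line in raw_lines:
        event = json.loads(line)
        view[key_fn(event)] += value_fn(event)
    return dict(view)

# A view planned from the start: hits per page.
hits_per_page = build_view(RAW_EVENTS, key_fn=lambda e: e["page"])

# A question asked later: total latency per user. Nothing is redesigned;
# the same raw records are simply scanned again with a different key.
latency_per_user = build_view(
    RAW_EVENTS,
    key_fn=lambda e: e["user"],
    value_fn=lambda e: e["ms"],
)
```

Because nothing was discarded at load time, the second view costs one more scan rather than a change to an upstream schema.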
14.
Traditional Enterprise Data Architecture
Data Silos
[Diagram components: Serving Applications (Web Serving, NoSQL, RDBMS, …); Traditional ETL & Message buses; Traditional Data Warehouses, BI & Analytics (EDW, Data Marts, BI / Analytics); Unstructured Systems (Serving Logs, Social Media, Sensor Data, Text Systems, …)]
15.
Agile Data Architecture w/Hadoop
Connecting All of Your Big Data
[Diagram components: Serving Applications (Web Serving, NoSQL, RDBMS, …); Traditional ETL & Message buses; Traditional Data Warehouses, BI & Analytics (EDW, Data Marts, BI / Analytics); EsTsL (s = Store); Custom Analytics; Unstructured Systems (Serving Logs, Social Media, Sensor Data, Text Systems, …)]
16.
Trend: Data driven development
• Limited runtime logic driven by huge lookup tables
• Data computed offline on Hadoop
– Machine learning, other expensive computation offline
– Personalization, classification, fraud, value analysis…
• Application development requires data science
– Huge amounts of actually observed data key to modern services
– Hadoop used as the science platform
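The split described above, with expensive computation offline and only a table lookup at runtime, can be sketched in a few lines of Python. All names and data here are hypothetical: the click log, the "most-clicked category" rule, and the fallback value are stand-ins for whatever model an offline Hadoop job would really compute.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (user_id, content_category) pairs observed earlier.
CLICK_LOG = [
    ("u1", "sports"), ("u1", "sports"), ("u1", "finance"),
    ("u2", "news"), ("u2", "finance"), ("u2", "finance"),
]

DEFAULT_CATEGORY = "news"  # assumed fallback for users with no history

def build_lookup_table(click_log):
    """Offline step: derive each user's most-clicked category from the log."""
    counts = defaultdict(Counter)
    for user, category in click_log:
        counts[user][category] += 1
    return {user: c.most_common(1)[0][0] for user, c in counts.items()}

def choose_category(lookup, user):
    """Online step: one dictionary lookup per request, no model evaluation."""
    return lookup.get(user, DEFAULT_CATEGORY)

lookup = build_lookup_table(CLICK_LOG)      # recomputed periodically, offline
category_for_u1 = choose_category(lookup, "u1")
category_for_new_user = choose_category(lookup, "u3")
```

The serving path stays trivial and fast. Improving the result means changing the offline job and republishing the table, not changing the request-time code.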
17.
CASE STUDY: YAHOO! HOMEPAGE
• Serving Maps
– Users - Interests
• Five Minute Production
• Weekly Categorization models
SCIENCE HADOOP CLUSTER
» Machine learning to build ever better categorization models
» USER BEHAVIOR in, CATEGORIZATION MODELS out (weekly)
PRODUCTION HADOOP CLUSTER
» Identify user interests using Categorization models
» USER BEHAVIOR in, SERVING MAPS out (every 5 minutes)
SERVING SYSTEMS
» Build customized home pages with latest data (thousands / second)
» ENGAGED USERS
Copyright Yahoo 2011
18.
CASE STUDY: YAHOO! HOMEPAGE
Personalized for each visitor
Result: twice the engagement
• Recommended links: +79% clicks vs. randomly selected
• News Interests: +160% clicks vs. one size fits all
• Top Searches: +43% clicks vs. editor selected
19.
Trend: Specialization of Data Systems
• Hadoop does not replace existing systems
– It adds new capabilities to the enterprise
– It can offload things that are not done efficiently in current systems
– Especially in scale out situations
• Specialization of traditional data components
– Use OLTP systems just for transactions
– Use OLAP systems for interactive analysis
• Hadoop has LOTS of bandwidth to storage and CPU
– Pull reporting out of OLTP systems
– Pull ELT out of OLAP systems
20.
Hadoop and OLTP Systems
• MPP Processing of Online Transactions
– Mission critical
– Manages transactions & serves reports
• Hadoop used to Process Reports
– Free up 50+% processing power for transaction processing system
– Significant cost savings due to commodity nature of Hadoop
[Diagram components: Web Sites; Transaction Processing Systems ($$$); Transaction Logs; Reports]
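To make the offloading idea concrete, here is a hedged Python sketch of a report built from an exported transaction log rather than by querying the transaction system. The CSV layout, column names, and the `daily_revenue` helper are assumptions made up for this example. On a real cluster the same aggregation would run as a Hadoop job over log files.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical transaction-log export; the column names are assumptions.
TRANSACTION_LOG = """\
timestamp,account,amount
2012-01-03T09:15:00,acct-1,19.99
2012-01-03T17:40:00,acct-2,5.00
2012-01-04T08:05:00,acct-1,12.50
"""

def daily_revenue(log_text):
    """Aggregate transaction amounts per calendar day from the log copy.

    The transaction system is never queried; only its exported log is read.
    """
    totals = defaultdict(Decimal)  # Decimal() starts each day at 0
    for row in csv.DictReader(io.StringIO(log_text)):
        day = row["timestamp"][:10]  # keep the YYYY-MM-DD prefix
        totals[day] += Decimal(row["amount"])
    return dict(totals)

report = daily_revenue(TRANSACTION_LOG)
```

`Decimal` is used instead of `float` so that currency sums stay exact.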
21.
Hadoop and OLAP Systems
• Fast loading, raw data staging, ELT & long-term archival (The Agile Data Zone)
• Allow analysts to use tools they know (Take advantage of huge ecosystem of BI and Analytics tooling)
[Diagram components: Web, Mobile, Social, Other logs; Hadoop; EDW; Online Archival]
22.
TRENDS: Instrument Clouds of Things
Clouds of things logging to Hadoop
• Things: Websites, Mobile phones, Enterprise devices…
• HDFS + Map-Reduce, or HBase + Analysis
23.
Trend: Many POCs, Few Production Systems
• The problem
– Hadoop is still a young technology
– Hard to find knowledgeable staff
– Integration with existing systems
• Hadoop market is maturing at speed
– Emerging ecosystem of Hadoop platform solutions providers
– Apache Hadoop continues to get better
– Hadoop training and support available from several vendors
24.
Growth in Hadoop Ecosystem
• Hardware vendors, Public Cloud (IAAS, PAAS)
– Storage, Appliances, Preloaded commodity boxes, cloud
• Data Systems
– All the major vendors announced Hadoop plans / products in 2011
• BI, Analytics and ETL
– Hadoop integrations emerging
• Dedicated Hadoop Applications
– Datameer, Karmasphere, Platfora, …
• Systems Integrators
– Regional and Global providers available
25.
Hadoop Continues to Improve
Apache community, including Hortonworks, investing to improve Hadoop:
• Make Hadoop an Open, Extensible, and Enterprise Viable Platform
• Enable More Applications to Run on Apache Hadoop
Roadmap:
• “Hadoop.Now” (Hadoop 1.0)
– Most stable version ever
– HBase, security, WebHDFS
• “Hadoop.Next” (Hadoop 0.23)
– HA, Next-gen HDFS & MapReduce
– Extension & Integration APIs
• “Hadoop.Beyond”
– Platform actively evolving
26.
Hortonworks – Approachable Hadoop
• Apache Hadoop Leadership
– Delivered every major release since 0.1
– Driving innovation across entire stack
– Experience managing world’s largest deployment
– Access to Yahoo’s 1,000+ Hadoop users and 40k+ nodes for testing, QA, etc.
• Business Focus
– Hortonworks Data Platform: provide 100% open source product
– Expert Role-based Training: help customers and partners overcome Hadoop knowledge gaps
– Full Lifecycle Support and Services: help organizations successfully develop and deploy solutions based on Hadoop
Evaluate, Pilot, Production
27.
Trend: Finding More Value Over Time
• Hadoop is usually brought in to solve a specific problem
– Build search indexes for Yahoo
– Manage web site logs for Facebook
– Users using EC2 to do data processing at Amazon
– Simple reporting when existing tools don’t scale
• Once your data is in Hadoop more users find value
• Once you have Hadoop, folks add more data
28.
Thank You! Questions?
Eric Baldeschwieler
@jeric14 @hortonworks