The Big Data Landscape
Entering a New Era of Scale

Convergence of Technology Disrupters Creates Opportunity
Cloud, Mobile, Big Data, Social, Internet of Things
NetApp Confidential - Internal Use Only
Unstructured Data Growth Dominates
[Chart: revenue share by segment. Segments: traditional structured, traditional unstructured, traditional replicated, content depots / public cloud]

• Traditional structured and replicated data mix shift is driven by:
  − Efficiency (deduplication, compression, thin provisioning, SATA)
  − Growth in a new category of storage consumers using cloud / content depots
• Unstructured data (files and objects in traditional storage, plus content depots / cloud) will be the largest storage category by 2014
  − Content depots / cloud expected to be 95% unstructured data
Not Even to The “Peak”
[Figure: hype cycle, visibility over time. Phases: Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity]

40 Zettabytes: estimated size of the digital universe in 2020
5 Billion: smart phones
30 Billion: pieces of new content to Facebook per month
80%: unstructured data

Big Data Is All Data From Everywhere
Fundamentally changes your business

• Transactional Data
• Machine Data
• Social Data
• Enterprise Content
[Images: The Jet way, The Call Center]
Big Data Vendor Landscape
A Lot of Hype and Buzz – Everyone is Jumping In
Funding for Hadoop and NoSQL (source: 451 Research)
[Chart: funding over time, Jan-08 to Nov-11, vertical axis 0 to 400. Rounds labeled: Cloudera series B, C and D; MapR series A and B; 10gen series D; DataStax series B; Neo Technology series A; Opera Solutions series A; Platfora series A; Couchbase series C]

• Market is expected to grow from $3.2 billion in 2010 to $16.9 billion in 2015
• NoSQL: $2 billion per annum by 2015
• Most firms are taking a pragmatic approach
• Big data is in the very early stages of maturity

"The Big Data market is expanding rapidly …
For technology buyers, opportunities exist to
use Big Data technology to improve
operational efficiency and to drive innovation.
Use cases are already present across
industries and geographic regions."
Dan Vesset, Vice President, IDC

• Best practices are not mature
IDC Big Data Survey
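
The market forecast quoted on this slide ($3.2 billion in 2010 to $16.9 billion in 2015) implies a compound annual growth rate that the slide does not spell out. A minimal sketch of that arithmetic follows; the function name `cagr` is ours for illustration and is not taken from the deck or from IDC.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the slide, in billions of US dollars.
market_2010 = 3.2
market_2015 = 16.9

implied_growth = cagr(market_2010, market_2015, 2015 - 2010)
print(f"Implied CAGR 2010-2015: {implied_growth:.1%}")
```

Under these figures the implied rate comes out at a little under 40 percent per year.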
Data Growth Impact on Business
“Big Data” refers to datasets whose size is beyond the ability of typical tools to capture, store, manage and analyze

[Chart labels: Complexity, Speed, Volume; Business Velocity; Inflection Point; Information Becomes a Propellant to Business; Data Becomes a Burden to IT Infrastructure; time axis 2010 to 2020]

Why Should You Care?
It’s the Value of Your Data

• Top line revenue
  – Turn your data assets into business advantage

• 5 Billion Records
• Anywhere, Anytime
• Faster time to market
• 50% Increase in Revenue

• Over 1PB of data
• Growth of 175% YOY
• 90 days of data within 24 hours of a failure

• Bottom line savings
  – Lower the cost of compliance
  – Manage ever-growing data efficiently

NetApp Big Data
Why NetApp?
Practical solutions that solve today’s problems

Get Control: NetApp helps you turn your exploding data from threat to opportunity. Manage your data effectively and affordably.
Break Through: Break through the limits. With NetApp, you can take on even the most massive and complex data projects.
Gain Insight: Turn insight to action. NetApp helps you get to clarity and insight faster and more reliably.

Experience Managing Data at Scale
NetApp’s Largest Customer

100 PB

4 Customers

50 PB

10 Customers
20 PB

50 Customers

10 PB

100 Customers
NetApp Big Data Strategy
Open
Best-of-Breed
Choice

• Best-of-breed storage for Big Data applications
• Create deep integration and value add
• Build on open standards with best-in-class partnerships
• Validate with ecosystem leaders
  – Complete server, network and storage “racks”
  – Delivered via trusted high-value partners

Industry-Leading Storage Innovation
Corporate Data Centers / Cloud Data Centers

Flash Arrays: for ultra-high performance
Clustered Data ONTAP: for shared infrastructure
E-Series: for price-performance at scale
StorageGRID: for web-scale object storage

Big Data Building Blocks
Applications
Big Bandwidth

Big Analytics

Ingest, Process, Stream

Reduce, Analyze, Report

Retain, Distribute

Retain, Distribute
Extract

Big Content
Retain forever, multi-site distribution
Store
Retrieve

Cloud
Private/Public
Analytics Oriented Business Processing
Business Applications
Query-based
Retrieval

Commit

Transaction Processing

Transaction granular data
resilience, recoverability &
protection at line speeds

Memory Ingest
Disk/Flash Tier
Performance
optimized query
service

Realtime Analytics
Federated Database Store
(Build/Buy/Partner)

Persisted
Commit

Data organization
optimized by query
interface

RDBMS (General Purpose DB)
• Data organized to align with schemas
• Fixed consistency model
• Complex queries supported
• Volume-based data management

Columnar DB (Analytics Oriented)
• Data organized in column files
• Tabular interface without rigid schemas
• Fast column scans
• Multiple consistency models
• Transaction-granular data management

Document Store (Transaction Oriented)
• Data organized in data structures in memory
• Schemaless transaction store for structured data
• High transactional performance

K-V Store (Metadata Service Oriented)
• Data organized in key-value pairs
• Suitable for metadata services with CMSs
• Associated with object services
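
The four store types compared on this slide differ mainly in how the same records are laid out. The sketch below illustrates that idea with plain Python structures; it is not tied to any particular database product, and the sample records, `SCHEMA`, and the helper names are all hypothetical.

```python
import json

# Hypothetical sample records used only for illustration.
records = [
    {"id": 1, "user": "ana", "bytes": 120},
    {"id": 2, "user": "ben", "bytes": 300},
    {"id": 3, "user": "ana", "bytes": 80},
]
SCHEMA = ("id", "user", "bytes")


def as_rows(recs):
    """Row-oriented layout: one fixed-schema tuple per record."""
    return [tuple(r[col] for col in SCHEMA) for r in recs]


def as_columns(recs):
    """Column-oriented layout: one list per column, so a scan of one
    column reads only that list."""
    return {col: [r[col] for r in recs] for col in SCHEMA}


def as_documents(recs):
    """Document layout: each record is self-describing and may carry
    fields the others lack."""
    return [dict(r) for r in recs]


def as_key_value(recs):
    """Key-value layout: an opaque serialized value looked up by key."""
    return {f"record:{r['id']}": json.dumps(r) for r in recs}


rows = as_rows(records)
columns = as_columns(records)
documents = as_documents(records)
documents[0]["tags"] = ["video"]  # extra field on one document only
kv = as_key_value(records)

total_bytes = sum(columns["bytes"])  # column scan
print(rows[1], total_bytes, kv["record:3"])
```

The column layout makes the aggregate a scan over a single list, while the key-value layout supports only lookup by key, which matches the trade-offs the slide lists.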
Analytics Technologies to look out for!
Old World

New World
Graph
DBs
(Niche)

Key-Value
Stores
(Content/Object
Service)

Row-oriented
RDBMSs

Document
Stores
(Transaction
Oriented)

Columnar
DBs
(Analytics
Oriented)

Datacenter Multi - Datacenter

Relational DBs
• ACID constrained
• Complete query set
• Limited availability

• High consistency
• Rich query set
• Good availability

• Tuneable consistency
• Limited query set
• Highest/WAN availability
Analytics & Enterprise Apps Environment
Reporting/Dashboard/Visualization
Applications
OLAP

Analytics

ETL

Data Management

ETL

OLAP

OLTP

Storage File Systems
Mobile Devices
Location/GPS
Logs
Sensors
Applications

Other
Data
Sources

Content
Repositories

Shared Storage
Infrastructure

Storage
Data
Management

NFS/sNFS/pNFS
Storage

(All other storage, i.e. internal DAS)

NetApp Confidential – Limited Use

Some problems require an Enterprise Class
Hadoop solution
[Quadrant chart: axes are Compute Power and Storage Capacity]

Enterprise Class Hadoop: packaged ready-to-deploy modular compute-intensive Hadoop cluster
• Compute-intensive applications
• Video, imaging analysis
• Extremely tight service level expectations
• Severe financial consequences if the data analytics application or service runs late

Commodity, Off-the-Shelf Hadoop: values associated with early adopters of Hadoop
• Social media space
• Contributors to Apache
• Strong bias to JBOD
• Skeptical of ALL vendors

Enterprise Class Hadoop: packaged ready-to-deploy modular Hadoop cluster
• The data has intrinsic value ($$$)
• Capacity and compute requirements expanding very fast
• Higher storage performance
• Real human consequences if the system fails (threats, treatments, financial losses)
• System has to allow for asymmetric growth

Enterprise Class Hadoop: packaged ready-to-deploy modular storage-intensive Hadoop cluster
• Storage-intensive applications
• Additional CPUs do not help run time
• Financial ticker data analysis
• Extremely tight service level expectations
• Need deeper storage per DataNode
NetApp Open Solution for Hadoop
• Easy to deploy, manage and scale
• Uses high-performance storage
  – Resilient and compact
  – RAID protection of data
  – Less network congestion
• Raw capacity and density
  – 120TB or 180TB in 4U
  – Fully serviceable storage system
• Reliability
  – Hardware RAID and hot swap prevent job restarts caused by a node going offline after a media failure
  – Reliable metadata (NameNode)

[Diagram labels: HDFS NameNode, Secondary NameNode, FAS2040; MapReduce JobTracker; DataNodes / TaskTracker; E2660; 4 separate shared-nothing partitions]

Enterprise Class Hadoop
NetApp Open Solution for Hadoop
Validated Benefits for the Enterprise

• Improved cluster performance by 62%
• Completed jobs 200% faster under drive failure
• Delivered linear performance scalability as nodes and data grew
• Per-server capacity increase of 1.5x

The NetApp Open Solution for Hadoop improves capacity
and performance efficiency and recoverability compared to
a server-based DAS deployment.
- ESG, 2012
Optimizing Performance and Staying Healthy
Source: Cisco: http://bit.ly/yL54Ts

[Figure labels: Availability and Resiliency; Burst Handling and Queuing; Oversubscription Ratio; Network Overhead; Data Node Network Speed; Network Latency; Useful Work]

Source: Garrett, Brian and Lockner, Julie, “NetApp Open Solution for Hadoop”, ESG Report,
May 2012, http://bit.ly/LyYG0t

DAS vs. NetApp footprint
DAS Option
• Server: 2RU, CPU: 2x8 cores, RAM: 48GB, Disk: 24 TB
• 1 Rack (42RU): 20 servers (320 cores, 960GB RAM, 480TB)
• 6 Racks: 1920 cores, 5.7TB RAM, 2.8 PB storage (120 servers)

NetApp Option
• Server: 1RU, CPU: 2x8 cores, RAM: 48GB, Disk: 2 TB (8 TB max; optional PXE boot, diskless)
• 1 Rack (42RU)
  – CPU and memory: 24 servers (6:1), 384 cores, 1.152TB RAM
  – Storage: 4 E2660, 720TB
• 4 Racks: 1536 cores, 4.6TB RAM, 2.8 PB storage (96 servers)
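
The rack totals on this slide follow from the per-server figures by simple multiplication. The sketch below recomputes them; the `RackConfig` class and its field names are ours for illustration, and decimal units (1 TB = 1000 GB, 1 PB = 1000 TB) are an assumption.

```python
from dataclasses import dataclass


@dataclass
class RackConfig:
    """Per-rack building block; all input figures are taken from the slide."""
    name: str
    servers_per_rack: int
    cores_per_server: int
    ram_gb_per_server: int
    storage_tb_per_rack: int
    racks: int

    def totals(self) -> dict:
        servers = self.servers_per_rack * self.racks
        return {
            "servers": servers,
            "cores": servers * self.cores_per_server,
            "ram_tb": servers * self.ram_gb_per_server / 1000,
            "storage_pb": self.storage_tb_per_rack * self.racks / 1000,
        }


# DAS: 20 servers per rack, 24 TB of disk inside each server, 6 racks.
das = RackConfig("DAS", 20, 16, 48, 20 * 24, 6)
# NetApp: 24 servers per rack plus 4 E2660 arrays (720 TB per rack), 4 racks.
netapp = RackConfig("NetApp", 24, 16, 48, 720, 4)

for cfg in (das, netapp):
    print(cfg.name, cfg.totals())
```

Both options land on the same total storage, with the NetApp option using two fewer racks and 24 fewer servers. The slide's 5.7TB, 4.6TB and 2.8 PB appear to be truncated forms of the computed values.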
Case Study: ASUP NetApp Analytics

[Diagram labels: Gateways, ETL (Extract, Transform, Load), Data Warehouse, Data Marts, Reporting]

• 800K ASUPs every week
• 40% coming over the weekend
• Data needs to be parsed and loaded in 15 minutes
• Only 5% of data goes into the data warehouse; the rest is unstructured, yet it is growing 7-10 TB per month
• No easy way to access this unstructured content

Reporting
• Numerous mining requests are not satisfied currently
• Huge untapped potential of valuable insight

Finally, the incoming load doubles every 16 months!
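
To see what the load figures on this slide mean in practice, the sketch below projects the weekly ASUP count under the stated 16-month doubling period and compares the weekend and weekday daily rates. The function name and the assumption of smooth exponential growth are ours, as is the reading of "weekend" as two days and the rest of the week as five.

```python
def projected_load(current: float, months: float, doubling_months: float = 16) -> float:
    """Load after `months`, assuming it doubles every `doubling_months`."""
    return current * 2 ** (months / doubling_months)


weekly_asups = 800_000  # from the slide

# Projection at whole multiples of the doubling period.
for months in (0, 16, 32, 48):
    print(months, "months:", round(projected_load(weekly_asups, months)), "ASUPs per week")

# 40% of the weekly volume arrives over the weekend (assumed to be 2 days).
weekend_per_day = weekly_asups * 40 // 100 // 2
weekday_per_day = weekly_asups * 60 // 100 // 5
print("per weekend day:", weekend_per_day, "per weekday:", weekday_per_day)
```

Under these assumptions the daily arrival rate on a weekend day is noticeably higher than on a weekday, which is where the 15-minute parse-and-load target is hardest to meet.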
NetApp Proprietary - Limited Use Only

Case Study: NetApp Large-Scale Analytics

CHALLENGE
• 4 weeks to run a query on 24 billion unstructured records
• Impossible to run a query on 240 billion unstructured records

NETAPP SOLUTION
• 10-node Hadoop cluster

BENEFITS
• Time reduced from 4 weeks to 10.5 hours
• Previously impossible, now achievable in just 18 hours
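
The before-and-after figures in this case study can be turned into a speedup factor and an approximate scan rate. The sketch below does that arithmetic; the variable names are ours, and treating the record counts as exact is an assumption. The slide does not say whether the 18-hour run used the same cluster as the 10.5-hour run.

```python
HOURS_PER_WEEK = 7 * 24

# Query over 24 billion records: 4 weeks before, 10.5 hours after.
before_hours = 4 * HOURS_PER_WEEK
after_hours = 10.5
speedup = before_hours / after_hours

# Approximate records processed per second in each reported run.
rate_24b = 24e9 / (after_hours * 3600)
rate_240b = 240e9 / (18 * 3600)

print(f"speedup: {speedup:.0f}x")
print(f"24 billion records:  {rate_24b:,.0f} records/s")
print(f"240 billion records: {rate_240b:,.0f} records/s")
```

The ten-times-larger query took less than twice as long, so the per-second rate implied by the second figure is several times higher than the first.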

Integrated Big Data Solutions and Expertise
• Planning and implementation expertise for Big Data
• Turn-key solution stacks and Big Data services
Big Data System Integrators Solutions Built on NetApp®

Next Steps - Team with the Experts
• Strategic Assessment
  – Business goals
  – Data growth needs
  – Use case discovery (partner delivery)
• Consult
  – Solution architecture and design (NetApp delivery)
• Deploy
  – Installation and implementation (NetApp delivery)
  – Solution implementation (partner delivery)

Support options: global support available from NetApp and partners

Presentation: Exploring the Wider World of Big Data - Vasalis Kapsalis
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 

Kürzlich hochgeladen (20)

Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 

Exploring the Wider World of Big Data - Vasalis Kapsalis

  • 2. Entering a New Era of Scale
  • 3. Convergence of Technology Disrupters Create Opportunity: Cloud, Mobile, Big Data, Social, Internet of Things. NetApp Confidential - Internal Use Only
  • 4. Unstructured Data Growth Dominates
      [Chart: Revenue Share by Segment. Series: Traditional structured; Traditional unstructured; Traditional replicated; Content depots / public cloud]
      • Traditional Structured and Replicated Data mix shift is driven by:
          − Efficiency (Dedup, Compr, Thin Prov, SATA)
          − Growth in new category of storage consumers using cloud / content depots
      • Unstructured Data (files and objects in traditional storage + Content depots / Cloud) will be the largest storage category by 2014
          − Content depots / Cloud expected to be 95% unstructured data
  • 5. Not Even to The “Peak”
      [Figure: hype cycle, visibility over time. Phases: Technology Trigger; Peak of Inflated Expectations; Trough of Disillusionment; Slope of Enlightenment; Plateau of Productivity]
      • 40 Zettabytes: estimated size of the digital universe in 2020
      • 5 Billion: smart phones
      • 30 Billion: pieces of new content to Facebook per month
      • 80%: unstructured data
  • 6. Big Data Is All Data From Everywhere: Fundamentally changes your business
      • Transactional Data
      • Machine Data
      • Social Data
      • Enterprise Content
      [Image labels: The Jet way; The Call Center]
  • 7. Big Data Vendor Landscape: A Lot of Hype and Buzz – Everyone is Jumping In
      [Chart: Funding for Hadoop and NoSQL, Jan-08 to Nov-11 (source: 451 Research). Labeled rounds: Cloudera series B, C and D; MapR series A and B; 10gen series D; DataStax series B; Neo Technology series A; Opera Solutions series A; Platfora series A; Couchbase series C]
      • Market is expected to grow from $3.2 billion in 2010 to $16.9 billion in 2015
      • NoSQL $2Bn PA by 2015
      • Most firms are taking a pragmatic approach
      • Big data is in the very early stages of maturity
      • Best practices are not mature
      "The Big Data market is expanding rapidly … For technology buyers, opportunities exist to use Big Data technology to improve operational efficiency and to drive innovation. Use cases are already present across industries and geographic regions." Dan Vesset, Vice President, IDC
      Source: IDC Big Data Survey
  • 8. Data Growth Impact on Business Complexity
      “Big Data” refers to datasets whose size is beyond the ability of typical tools to capture, store, manage and analyze.
      [Figure labels: Speed; Volume; Business Velocity; Information Becomes a Propellant to Business; Inflection Point; Data Becomes a Burden to IT Infrastructure; time axis 2010 to 2020]
  • 9. Why Should You Care? It’s the Value of Your Data
      • Top line revenue
          − Leverage their data assets into business advantage
          − Customer example: 5 Billion Records; Anywhere, Anytime; Faster time to market; 50% Increase in Revenue
          − Customer example: Over 1PB of data; Growth of 175% YOY; 90 days of data within 24 hours of a failure
      • Bottom Line savings
          − Lower the cost of compliance
          − Manage ever growing data efficiently
  • 11. Why NetApp? Practical solutions that solve today’s problems
      • Get Control: NetApp helps you turn your exploding data from threat to opportunity. Manage your data effectively and affordably.
      • Break Through: Break through the limits. With NetApp, you can take on even the most massive and complex data projects.
      • Gain Insight: Turn insight to action. NetApp helps you get to clarity and insight faster and more reliably.
  • 12. Experience Managing Data at Scale
      [Figure labels, in source order: NetApp’s Largest Customer; 100 PB; 4 Customers; 50 PB; 10 Customers; 20 PB; 50 Customers; 10 PB; 100 Customers]
  • 13. NetApp Big Data Strategy: Open, Best-of-Breed, Choice
      • Best of breed storage for Big Data Applications
      • Create deep integration and value add
      • Build on open standards with best-in-class partnerships
      • Validate with Ecosystem Leaders
          − Complete server, network and storage “Racks”
          − Delivered via trusted high-value partners
  • 14. Industry-Leading Storage Innovation (Corporate Data Centers, Cloud Data Centers)
      • Flash Arrays for ultra-high performance
      • E-Series for price-performance at scale
      • Clustered Data ONTAP for Shared Infrastructure
      • StorageGRID for web scale object storage
  • 15. Big Data Building Blocks
      [Diagram labels: Applications; Big Bandwidth (Ingest, Process, Stream); Big Analytics (Reduce, Analyze, Report); Big Content (Retain forever, multi-site distribution); Retain, Distribute; Extract; Store; Retrieve; Cloud (Private/Public)]
  • 17. Analytics Oriented Business Processing
      [Diagram labels: Business Applications; Query-based Retrieval; Commit; Transaction Processing (transaction granular data resilience, recoverability & protection at line speeds); Memory Ingest; Disk/Flash Tier; Performance optimized query service; Realtime Analytics; Federated Database Store (Build/Buy/Partner); Persisted Commit; Data organization optimized by query interface]
      • General Purpose DB (RDBMS)
          − Data organized to align with schemas
          − Fixed consistency model
          − Complex queries supported
          − Volume based data management
      • Analytics Oriented (Columnar DB)
          − Data organized in column files
          − Tabular interface without rigid schemas
          − Fast column scans
          − Multiple consistency models
          − Transaction granular data management
      • Transaction Oriented (Document Store)
          − Data organized in data structures in memory
          − Schemaless transaction store for structured data
          − High transactional performance
      • Metadata Service Oriented (K-V Store)
          − Data organized in key value pairs
          − Suitable for metadata services with CMS’
          − Associated with object services
  • 18. Analytics Technologies to look out for!
      [Diagram labels: Old World; New World; Datacenter; Multi-Datacenter; Row-oriented RDBMS’; Columnar DBs (Analytics Oriented); Document Stores (Transaction Oriented); Key-Value Stores (Content/Object Service); Graph DBs (Niche)]
      • Relational DBs: ACID constrained; Complete query set; Limited availability
      • High consistency; Rich query set; Good availability
      • Tuneable consistency; Limited query set; Highest/WAN availability
  • 19. Analytics & Enterprise Apps Environment
      [Diagram labels: Reporting/Dashboard/Visualization; Applications; OLAP; Analytics; ETL; Data Management; OLTP; Storage; File Systems; Data sources: Mobile Devices, Location/GPS, Logs, Sensors, Applications, Other Data Sources, Content Repositories; Shared Storage Infrastructure; Storage Data Management; NFS/sNFS/pNFS; Storage (all other storage, i.e. internal DAS)]
      NetApp Confidential – Limited Use
  • 20. Some problems require an Enterprise Class Hadoop solution
      [Diagram axes: Compute Power; Storage Capacity]
      • Commodity, Off the Shelf Hadoop: values associated with early adopters of Hadoop
          − Social Media Space
          − Contributors to Apache
          − Strong bias to JBOD
          − Skeptical of ALL vendors
      • Enterprise Class Hadoop: packaged ready-to-deploy modular Hadoop cluster
          − The data has intrinsic value $$$
          − Capacity and compute requirements expanding very fast
          − Higher storage performance
          − Real human consequences if the system fails (threats, treatments, financial losses)
          − System has to allow for asymmetric growth
      • Enterprise Class Hadoop: packaged ready-to-deploy modular compute intensive Hadoop cluster
          − Compute intensive applications
          − Video, imaging analysis
          − Extremely tight Service Level expectations
          − Severe financial consequences if the data analytic application or service runs late
      • Enterprise Class Hadoop: packaged ready-to-deploy modular storage intensive Hadoop cluster
          − Storage intensive applications
          − Additional CPUs do not help run time
          − Financial ticker data analysis
          − Extremely tight Service Level expectations
          − Need deeper storage per datanode
  • 21. NetApp Open Solution for Hadoop (Enterprise Class Hadoop)
      [Diagram labels: HDFS; NameNode; Secondary NameNode; FAS2040; MapReduce JobTracker; DataNodes / TaskTracker; E2660; 4 separate shared-nothing partitions]
      • Easy to Deploy, Manage and Scale
      • Uses High Performance storage
          − Resilient and Compact
          − RAID Protection of Data
          − Less Network Congestion
      • Raw Capacity and density
          − 120TB or 180TB in 4U
          − Fully serviceable storage system
      • Reliability
          − Hardware RAID and hot swap prevent job restarts caused by a node going offline after a media failure
          − Reliable metadata (NameNode)
  • 22. NetApp Open Solution for Hadoop: Validated Benefits for the Enterprise
      • Improved cluster performance by 62%
      • Completed jobs 200% faster under drive failure
      • Delivered linear performance scalability as nodes, data grew
      • Per-server capacity increase of 1.5x
      "The NetApp Open Solution for Hadoop improves capacity and performance efficiency and recoverability compared to a server-based DAS deployment." ESG, 2012
  • 23. Optimizing Performance and Staying Healthy
      [Figure labels: Availability and Resiliency; Burst Handling and Queuing; Oversubscription Ratio; Network Overhead; Data Node Network Speed; Network Latency; Useful Work]
      Source: Cisco: http://bit.ly/yL54Ts
      Source: Garrett, Brian and Lockner, Julie, “NetApp Open Solution for Hadoop”, ESG Report, May 2012, http://bit.ly/LyYG0t
  • 24. DAS vs. NetApp footprint
      • DAS Option
          − Server: 2RU, CPU: 2x8 cores, RAM: 48GB, Disk: 24 TB
          − 1 Rack (42RU): 20 servers (320 cores, 960GB, 480TB)
          − 6 Racks: 1920 cores, 5.7TB RAM, 2.8 PB Storage (120 servers)
      • NetApp Option
          − Server: 1RU, CPU: 2x8 cores, RAM: 48GB, Disk: 2 TB (8TB max; optional diskless PXE boot)
          − 1 Rack (42RU): CPU and Memory: 24 servers (6:1), 384 cores, 1.152TB; Storage: 4 E2660, 720TB
          − 4 Racks: 1536 cores, 4.6TB RAM, 2.8 PB Storage (96 servers)
  • 25. Case Study: ASUP NetApp Analytics
      [Diagram labels: Gateways; ETL (Extract, Transform, Load); Data Warehouse; Data Mart; Reporting]
      • 800K ASUPs every week
      • 40% coming over the weekend
      • Data needs to be parsed and loaded in 15 minutes
      • Only 5% of data goes into the data warehouse; the rest is unstructured, yet it is growing 7-10 TB per month
      • No easy way to access this unstructured content
      • Numerous mining requests are not satisfied currently
      • Huge untapped potential of valuable insight
      Finally, the incoming load doubles every 16 months!
      NetApp Proprietary - Limited Use Only
  • 26. Case Study: NetApp Large-Scale Analytics
      • Challenge
          − 4 weeks to run a query on 24 billion unstructured records
          − Impossible to run a query on 240 billion unstructured records
      • NetApp Solution
          − 10-node Hadoop Cluster
      • Benefits
          − Time reduced from 4 weeks to 10.5 hours
          − Previously impossible, now achievable in just 18 hours
  • 27. Integrated Big Data Solutions and Expertise
      • Planning and implementation expertise for Big Data
      • Turn-key solution stacks and Big Data services
      [Image labels: Big Data System Integrators; Solutions Built on NetApp®]
  • 28. Next Steps - Team with the Experts
      • Strategic Assessment
          − Business goals
          − Data growth needs
          − Use case discovery (partner delivery)
      • Consult
          − Solution architecture and design (NetApp delivery)
      • Deploy
          − Installation and implementation (NetApp delivery)
          − Solution implementation (partner delivery)
      Support options: Global support available from NetApp and partners
  • 29. NetApp Confidential - Internal Use Only
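
The rack totals on slide 24 and the query times on slide 26 can be re-derived with a few lines of arithmetic. The Python sketch below is an illustration only and is not part of the original deck: the names `RackConfig`, `totals` and `speedup` are hypothetical, the 180 TB per E2660 figure is taken from slide 21, decimal units (1 TB = 1000 GB, 1 PB = 1000 TB) are assumed, and the 2 TB local disks of the NetApp-option servers are left out of the storage total, as the slide appears to do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackConfig:
    """Hypothetical container for the per-rack figures quoted on slide 24."""
    servers_per_rack: int
    cores_per_server: int
    ram_gb_per_server: int
    storage_tb_per_rack: int
    racks: int

    def totals(self) -> dict:
        """Scale the per-rack figures up to the whole deployment."""
        servers = self.servers_per_rack * self.racks
        return {
            "servers": servers,
            "cores": servers * self.cores_per_server,
            # Assumption: decimal units (1 TB = 1000 GB, 1 PB = 1000 TB).
            "ram_tb": servers * self.ram_gb_per_server / 1000,
            "storage_pb": self.storage_tb_per_rack * self.racks / 1000,
        }


# DAS option: 20 servers per rack, each 2x8 cores, 48 GB RAM, 24 TB disk; 6 racks.
das = RackConfig(servers_per_rack=20, cores_per_server=16,
                 ram_gb_per_server=48, storage_tb_per_rack=20 * 24, racks=6)

# NetApp option: 24 servers per rack plus 4 E2660 shelves at 180 TB each; 4 racks.
netapp = RackConfig(servers_per_rack=24, cores_per_server=16,
                    ram_gb_per_server=48, storage_tb_per_rack=4 * 180, racks=4)


def speedup(before_hours: float, after_hours: float) -> float:
    """Ratio of the old run time to the new run time."""
    return before_hours / after_hours


# Slide 26: a query that took 4 weeks now takes 10.5 hours.
query_speedup = speedup(4 * 7 * 24, 10.5)

if __name__ == "__main__":
    print("DAS:", das.totals())
    print("NetApp:", netapp.totals())
    print("Query speedup:", query_speedup)
```

Under these assumptions both layouts come to 2.88 PB of storage, which the slide rounds down to 2.8 PB; the DAS layout needs 120 servers in 6 racks and the NetApp layout 96 servers in 4 racks. The 4-week query dropping to 10.5 hours works out to a 64x reduction in run time.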