According to IDC, Windows Server runs on more than 50% of the servers in the enterprise data center. Hortonworks has worked closely with Microsoft to port Apache Hadoop to Windows, enabling organizations to take advantage of this emerging Big Data technology. Join us in this informative webinar to hear about the new Hortonworks Data Platform for Windows.
In less than an hour, you’ll learn:
- Key capabilities available in Hortonworks Data Platform for Windows
- How HDP for Windows integrates with Microsoft tools
- Key workloads and use cases driving Hadoop adoption today
For the visual thinkers out there, let's expand our model with some concrete examples.

Transactions: ERP, SCM, CRM, and transactional web applications are classic examples of systems processing transactions. The highly structured data in these systems is typically stored in SQL databases.

Interactions are about how people and things interact with each other or with your business. Web logs, user click streams, social interactions and feeds, and user-generated content are classic places to find interaction data.

Observational data tends to come from the "Internet of Things." Sensors for heat, motion, and pressure, along with RFID and GPS chips embedded in things such as mobile devices, ATMs, and even aircraft engines, are just some examples of "things" that output observation data.

Most folks would agree that video is "big" data. The analysis of what's happening in that video (i.e., what you, I, and others are doing in it) may not be "big," but it is valuable and it fits under our umbrella. Moreover, business data feeds and publicly available data sets are also big data, so we should not limit our thinking to just the data that flows through an organization. For example, the mortgage-related data you may already have could benefit from being blended with external data from a source such as Zillow. The government, through efforts like the Open Data Initiative, is making more and more data publicly available. One use case I find interesting is predictive policing, where state and local law enforcement apply analytics to crime databases and other publicly available data to help predict where and when pockets of crime might spring up. These proactive analytics efforts have yielded real reductions in crime.

Anyhow, this is what Big Data means to me; hopefully it makes sense to you.
…an amount that exceeds previous forecasts by 5 ZB and represents 50-fold growth since the beginning of 2010.
At its core, Hadoop is about HDFS and MapReduce, two projects that provide distributed storage and data processing, the underpinnings of Hadoop. In addition to core Hadoop, we must identify and include the requisite "platform services" that are central to any piece of enterprise software: high availability, disaster recovery, security, and so on, which enable use of the technology for a much broader (and mission-critical) problem set. This is accomplished not by introducing new open source projects, but by ensuring that these aspects are addressed within the existing projects.
- HDFS: Self-healing, distributed file system for multi-structured data; breaks files into blocks and stores them redundantly across the cluster
- MapReduce: Framework for running large data processing jobs in parallel across many nodes and combining the results (a minimal sketch follows below)
- YARN: New application management framework that enables Hadoop to go beyond MapReduce apps
- Enterprise-ready platform services: High availability, disaster recovery, snapshots, security, and more
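To make the MapReduce programming model concrete, here is a minimal sketch of the canonical word-count job written against the org.apache.hadoop.mapreduce API (assuming Hadoop 2.x; input and output paths are passed as arguments, and all names here are illustrative). The map phase runs in parallel across HDFS blocks and the reduce phase combines the results, exactly the division of labor described above.

```java
// Minimal MapReduce word-count sketch (illustrative; paths are placeholders).
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: runs in parallel across HDFS blocks, emitting (word, 1) pairs.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: the framework groups pairs by word; we sum the counts.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```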
In summary, by addressing these elements we can provide an Enterprise Hadoop distribution that includes the core services, platform services, data services, and operational services required by the enterprise user. All of this is done in 100% open source, and tested at scale by our team (together with our partner Yahoo!) to bring enterprise process to an open source approach. And finally, this is the distribution endorsed by the ecosystem to ensure interoperability in your environment.
Not only is all of this backed by the architects, developers and operators of Hadoop, it is also supported by a world-class support team. With backgrounds from IBM, Oracle, MySQL and more, the team delivers 24x7 support and very mature support processes to ensure high-quality customer service and responsiveness.
Additionally, we are a leading provider of Hadoop training through Hortonworks University, with courses for both development and operations. If required, we can also provide expert consulting services, either directly or through our System Integrator partners. And for anyone looking to get hands-on with Hadoop, we recently introduced the Sandbox program, which lets users download a full instance of HDP together with guided tutorials covering both development and administration topics.
At Hortonworks today, our focus is very clear: we develop, distribute and support a 100% open source distribution of enterprise Apache Hadoop.
- We employ the core architects, builders and operators of Apache Hadoop and drive the innovation in the open source community.
- We distribute the only 100% open source Enterprise Hadoop distribution: the Hortonworks Data Platform.
- Given our operational expertise running some of the largest Hadoop infrastructure in the world at Yahoo!, our team is uniquely positioned to support you.
- Our approach is also uniquely endorsed by some of the biggest vendors in the IT market:
- Yahoo! is an investor, a customer, and most importantly a development partner. We partner to develop Hadoop, and no distribution of HDP is released without first being tested on Yahoo!'s infrastructure, using the same regression suite they have relied on for years as they grew the largest production cluster in the world.
- Microsoft has partnered with Hortonworks to include HDP both in its off-premises offering on Azure and in its on-premises offering, under the product name HDInsight. This includes integration with Visual Studio for application development and with System Center for operational management of the infrastructure.
- Teradata includes HDP in its products in order to provide the broadest possible range of options for its customers.
Talking points: HDP on Windows, HDP Server on Windows, and HDInsight on Azure.
- For the Microsoft customer who wants to leverage familiar Windows tools such as System Center: work with Hadoop as you would on Linux, and bring your own scripts.
- Cover what they will get and when they will get it.
- Integration with Microsoft tooling; Microsoft customers get choice because the underlying infrastructure bits are the same.
- So get started today.
- A key driver is the ISV application that is vertical in nature and needs the choice to deploy on Windows today.
- Field positioning.
Beyond core and platform services, we must add a set of data services that enable the full data lifecycle: the capabilities to store, process and access data. For example: how do we maintain the consistent metadata required to determine how best to query data stored in HDFS? The answer is a project called Apache HCatalog. Or how do we access data stored in Hadoop from SQL-oriented tools? With projects such as Hive, the de facto standard for accessing data stored in HDFS. All of these are broadly captured under the category of "data services."
- Apache HCatalog: Metadata and table management. A metadata service that enables users to access Hadoop data as a set of tables without needing to be concerned with where or how their data is stored. It enables consistent data sharing and interoperability across data processing tools such as Pig, MapReduce and Hive, as well as deep interoperability and data access with systems such as Teradata and SQL Server.
- Apache Hive: SQL interface for Hadoop. The de facto SQL-like interface for Hadoop, enabling data summarization, ad hoc query, and analysis of large datasets. Connects to Excel, MicroStrategy, PowerPivot, Tableau and other leading BI tools via the Hortonworks Hive ODBC Driver. Hive currently serves batch and non-interactive use cases; in 2013, Hortonworks is working with the Hive community to extend it to interactive query. Cloudera, on the other hand, has chosen to set Hive aside in favor of Cloudera Impala, a Cloudera-controlled technology aimed at the analytics market and focused solely on non-operational, interactive query use cases.
- Apache HBase: NoSQL database for interactive apps. A non-relational, columnar database that gives developers a way to create, read, update, and delete data in Hadoop with the performance interactive applications need. Commonly used to serve "intelligent applications" that predict user behavior, detect shifting usage patterns, or recommend ways for users to engage.
- WebHDFS: Web service interface for HDFS. A scalable REST API that enables easy access to HDFS: move files in and out, delete from HDFS, and perform file and directory functions, leveraging the parallelism of the cluster. Addressed as webhdfs://<HOST>:<HTTP PORT>/PATH. Included in versions 1.0 and 2.0 of Hadoop; created and driven by Hortonworkers. A minimal REST sketch follows this list.
- Talend Open Studio for Big Data: An open source ETL tool available as an optional download with HDP. Intuitive graphical data integration tools for HDFS, Hive, HBase, HCatalog and Pig; Oozie scheduling to manage and stage jobs; connectors for any database, business application or system; and integrated HCatalog storage.
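As an illustration of the WebHDFS REST API mentioned above, here is a minimal sketch that lists a directory over plain HTTP. The host name, port, and path are placeholders (50070 was the default NameNode HTTP port at the time), and op=LISTSTATUS is one of the documented WebHDFS operations; treat the details as an assumption to verify against your cluster's configuration.

```java
// Minimal WebHDFS sketch: list an HDFS directory over the REST API.
// Host, port, and path are placeholders; op=LISTSTATUS returns a JSON
// FileStatuses listing, which we print raw here.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHdfsList {
  public static void main(String[] args) throws Exception {
    URL url = new URL(
        "http://namenode.example.com:50070/webhdfs/v1/user/demo?op=LISTSTATUS");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");

    // Read and print the JSON response body.
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    }
    conn.disconnect();
  }
}
```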
Any data management platform operated at reasonable scale requires a management technology: SQL Server Management Studio for SQL Server, Oracle Enterprise Manager for Oracle Database, and so on. Hadoop is no exception, and for Hadoop that means Apache Ambari, which is increasingly recognized as foundational to the operation of Hadoop infrastructures. It allows users to provision, manage and monitor a cluster and provides a set of tools to visualize and diagnose operational issues. There are other projects in this category (such as Oozie), but Ambari is the most influential.
Apache Ambari: Management and monitoring
- Makes Hadoop clusters easy to operate
- Simplified cluster provisioning with a step-by-step install wizard
- Pre-configured operational metrics for insight into the health of Hadoop services
- Visualization of job and task execution for visibility into performance issues
- Complete RESTful API for integrating with existing operational tools (see the sketch after this list)
- Intuitive user interface that makes controlling a cluster easy and productive
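To show what that RESTful API looks like in practice, here is a minimal sketch that fetches a cluster's service list from Ambari. The host, cluster name, and credentials are placeholders (8080 is Ambari's default port and /api/v1 its documented API root); this is an illustrative sketch, not a definitive integration recipe.

```java
// Minimal Ambari REST sketch: fetch the service list for a cluster.
// Host, port, cluster name, and credentials are placeholders.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.xml.bind.DatatypeConverter;

public class AmbariServices {
  public static void main(String[] args) throws Exception {
    URL url = new URL(
        "http://ambari.example.com:8080/api/v1/clusters/MyCluster/services");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();

    // Ambari uses HTTP Basic authentication; admin/admin is its
    // out-of-the-box default and should be changed in production.
    String creds = DatatypeConverter.printBase64Binary(
        "admin:admin".getBytes("UTF-8"));
    conn.setRequestProperty("Authorization", "Basic " + creds);

    // Print the JSON description of the cluster's services.
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    }
    conn.disconnect();
  }
}
```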
That ecosystem endorsement is why we focus on HDP interoperability across all of these categories:
- Data systems: HDP is endorsed by and embedded with SQL Server, Teradata and more.
- BI tools: HDP is certified for use with the packaged applications you already use, from Microsoft to Tableau, MicroStrategy, Business Objects and more.
- Development tools: For .NET developers, Visual Studio, used to build more than half the custom applications in the world, certifies with HDP so Microsoft app developers can build custom apps with Hadoop. For Java developers, Spring for Apache Hadoop enables quick and easy development of Hadoop-based applications with HDP. A sketch of this style of programmatic access follows this list.
- Operational tools: Integration with System Center and with Teradata Viewpoint.
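To illustrate the kind of SQL-oriented access these BI and development tools build on, here is a minimal sketch that queries HDP through the Hive JDBC driver (HiveServer2). The host, table, and credentials are placeholders and the query is invented for illustration; 10000 is HiveServer2's default port.

```java
// Minimal Hive JDBC sketch: run a SQL-like query against HDP.
// Host, database, table, and credentials are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcQuery {
  public static void main(String[] args) throws Exception {
    // Register the HiveServer2 JDBC driver.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    Connection conn = DriverManager.getConnection(
        "jdbc:hive2://hiveserver.example.com:10000/default", "hive", "");
    Statement stmt = conn.createStatement();

    // Hive compiles this SQL-like query into MapReduce jobs on the cluster.
    ResultSet rs = stmt.executeQuery(
        "SELECT page, COUNT(*) AS hits FROM weblogs GROUP BY page");
    while (rs.next()) {
      System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
    }
    rs.close();
    stmt.close();
    conn.close();
  }
}
```

BI tools such as Excel or Tableau go through the equivalent ODBC path (the Hortonworks Hive ODBC Driver mentioned earlier) rather than writing this code by hand, but the underlying access pattern is the same.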