• Introduction to R
• Applications of R at Microsoft
• R Products at Microsoft
• What’s coming for R at Microsoft
• Q&A
April 6, 2015
“This acquisition will help customers use advanced analytics within Microsoft data platforms.”
INTRODUCTION TO R
• Most widely used data analysis software
• Most powerful statistical programming language
• Create beautiful and unique data visualizations
• Thriving open-source community
• Fills the talent gap
www.revolutionanalytics.com/what-is-r
• 1993: Research project in Auckland, NZ
• 1995: Released as open-source software
• 1997: R core group formed
• 2000: R 1.0.0 released
• 2003: R Foundation formed in Austria
• 2004: First international user conference
• 2007: Revolution Analytics founded
• 2009: New York Times article on R
• 2013: Revolution R Open released
• 2015: Microsoft acquires Revolution Analytics
Photo credit: Robert Gentleman
blog.revolutionanalytics.com/popularity
R Usage Growth
Rexer Data Miner Survey, 2007-2013
• Rexer Data Miner Survey
• IEEE Spectrum, July 2014
#9: R
Language Popularity
IEEE Spectrum Top Programming Languages
New York Times, June 25 2009
(3 hours after Michael Jackson’s death)
R AT MICROSOFT
What happened?
Why did it happen?
What will happen?
How can we make it happen?
Traditional BI Advanced Analytics
• System monitoring & alerting
• Capacity Planning
• TrueSkill Matchmaking System
• Player Churn
• Game design
• In-game purchase optimization
• Fraud detection
• Player communities
MICROSOFT PRODUCTS WITH R
• Enhanced Open Source R distribution
• Compatible with all R-related software
• Multi-threaded for performance
• Focus on reproducibility
• Open source (GPLv2 license)
• Available for Windows, Mac OS X, Ubuntu, Red Hat and OpenSUSE
• Download from mran.revolutionanalytics.com
• Built on latest R engine
• 100% compatible with
• Designed to work with RStudio
• Multithreaded library replaces standard BLAS/LAPACK algorithms
• High-performance algorithms
• Sequential → Parallel
• No need to change any R code
• Included with RRO binary distributions
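As a rough illustration of the "no need to change any R code" point, the sketch below times a dense matrix product in plain base R. The matrix size is a hypothetical value chosen to run quickly, and no speed-up figures are claimed here; the timing depends entirely on which BLAS/LAPACK library the R build is linked against.

```r
# Time a dense matrix product. The R code is the same whichever
# BLAS/LAPACK library the R build is linked against.
set.seed(42)
n <- 500                            # hypothetical size, chosen to run quickly
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

elapsed <- system.time(prod_ab <- a %*% b)[["elapsed"]]
cat(sprintf("%d x %d matrix product took %.3f seconds\n", n, n, elapsed))

# Other linear-algebra calls are routed through BLAS/LAPACK as well.
xtx <- crossprod(a)                 # t(a) %*% a
decomp <- svd(a, nu = 0, nv = 0)    # singular values only
```

Running the same script under two R builds is one way to compare a standard and a multithreaded math library on a given machine.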
More at Revolutions blog
Adapted from http://xkcd.com/234/
CC BY-NC 2.5
• Static CRAN mirror
• Daily CRAN snapshots
mran.revolutionanalytics.com/snapshot
• Easily write and share scripts synced to a specific snapshot
Diagram: CRAN is copied to the CRAN mirror (http://cran.revolutionanalytics.com/); the checkpoint server takes daily snapshots at midnight UTC (http://mran.revolutionanalytics.com/snapshot/); the checkpoint package installs packages from a chosen snapshot:
library(checkpoint)
checkpoint("2014-09-17")
• Easy to use: add 2 lines to the top of each script
• For the package author:
• For a script collaborator:
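The idea behind a dated snapshot can be sketched in base R by pointing the package repository at a fixed date. This is only an approximation of what the checkpoint package does (it also scans the project for the packages it uses and installs them into a separate library). The helper name `snapshot_repo` is hypothetical, and the `<base>/<YYYY-MM-DD>` URL layout is an assumption about the snapshot server.

```r
# Build the URL of a dated CRAN snapshot and use it as the package repository.
# The "<base>/<YYYY-MM-DD>" layout is an assumption about the snapshot server.
snapshot_repo <- function(date,
                          base = "http://mran.revolutionanalytics.com/snapshot") {
  day <- as.Date(date)                  # stops with an error on malformed dates
  paste0(base, "/", format(day, "%Y-%m-%d"))
}

repo <- snapshot_repo("2014-09-17")
options(repos = c(CRAN = repo))         # install.packages() would now use this URL
print(getOption("repos"))
```

Because the date is written into the script, a collaborator who runs it later would resolve packages against the same snapshot rather than against whatever CRAN holds that day.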
• Download Revolution R Open
• Learn about R and RRO
• Daily CRAN snapshots
• Explore Packages
• Explore Task Views
Trends
R FOR BIG DATA
• Toolkits for data scientists and numerical analysts to create custom parallel and distributed algorithms
• Mainly useful for “embarrassingly parallel” problems, where parallel components work with small amounts of data
• Big Data Predictive Analytics is mostly not embarrassingly parallel
Details at projects.revolutionanalytics.com
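A minimal sketch of an embarrassingly parallel problem, using base R's parallel package rather than any specific toolkit from the slide: each task runs an independent Monte Carlo estimate of pi and shares no data with the others. The function name `estimate_pi`, the worker count, and the task sizes are illustrative assumptions.

```r
library(parallel)

# Each task estimates pi from its own random points; tasks share no data.
estimate_pi <- function(n_points) {
  x <- runif(n_points)
  y <- runif(n_points)
  4 * mean(x^2 + y^2 <= 1)              # fraction of points inside the quarter circle
}

cl <- makeCluster(2)                    # two local worker processes
clusterSetRNGStream(cl, iseed = 2015)   # independent random-number streams per worker
estimates <- parLapply(cl, rep(1e5, 8), estimate_pi)
stopCluster(cl)

pi_hat <- mean(unlist(estimates))       # combining results is a simple average
cat("Estimate of pi:", pi_hat, "\n")
```

The only communication is sending each task its size and collecting one number back, which is why this pattern scales easily. Model fitting on large data usually needs more coordination than that.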
Revolution R Enterprise is the only big data, big analytics platform based on open source R, the de facto statistical computing language for modern analytics
Data Step
• Data import: Delimited, Fixed, SAS, SPSS, ODBC
• Variable creation & transformation
• Recode variables
• Factor variables
• Missing value handling
• Sort, Merge, Split
• Aggregate by category (means, sums)
Descriptive Statistics
• Min / Max, Mean, Median (approx.)
• Quantiles (approx.)
• Standard Deviation
• Variance
• Correlation
• Covariance
• Sum of Squares (cross product matrix for set variables)
• Pairwise Cross tabs
• Risk Ratio & Odds Ratio
• Cross-Tabulation of Data (standard tables & long form)
• Marginal Summaries of Cross Tabulations
Statistical Tests
• Chi Square Test
• Kendall Rank Correlation
• Fisher’s Exact Test
• Student’s t-Test
Sampling
• Subsample (observations & variables)
• Random Sampling
Predictive Models
• Sum of Squares (cross product matrix for set variables)
• Multiple Linear Regression
• Generalized Linear Models (GLM): exponential family distributions (binomial, Gaussian, inverse Gaussian, Poisson, Tweedie); standard link functions (cauchit, identity, log, logit, probit); user-defined distributions & link functions
• Covariance & Correlation Matrices
• Logistic Regression
• Classification & Regression Trees
• Predictions/scoring for models
• Residuals for all models
Cluster Analysis
• K-Means
Classification
• Decision Trees
• Decision Forests
• Gradient Boosted Decision Trees
• Naïve Bayes
Simulation
• Simulation (e.g. Monte Carlo)
• Parallel Random Number Generation
Variable Selection
• Stepwise Regression
Combination
• PEMA-R API
• rxDataStep
• rxExec
Slide callouts: New in v7.3; Coming in v7.4
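The slides do not show how these functions handle data too large for memory. One common approach, sketched here as an assumption rather than as ScaleR's actual implementation, is to read the data in chunks and accumulate the cross-product matrices, from which a linear regression can be solved. The simulated data, chunk size, and variable names are all illustrative.

```r
set.seed(1)
n <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n, sd = 0.1)

# Split row indices into chunks, standing in for blocks read from disk.
chunks <- split(seq_len(n), ceiling(seq_len(n) / 250))

xtx <- matrix(0, 3, 3)    # running sum of X'X
xty <- matrix(0, 3, 1)    # running sum of X'y
for (rows in chunks) {
  X <- cbind(1, dat$x1[rows], dat$x2[rows])   # intercept plus two predictors
  y <- dat$y[rows]
  xtx <- xtx + crossprod(X)
  xty <- xty + crossprod(X, y)
}

beta_chunked <- drop(solve(xtx, xty))         # solve the normal equations
beta_lm <- unname(coef(lm(y ~ x1 + x2, data = dat)))
print(rbind(chunked = beta_chunked, lm = beta_lm))
```

Only the small 3 x 3 and 3 x 1 accumulators stay in memory between chunks, and because the per-chunk sums simply add up, the chunks could also be processed on separate workers and combined afterwards.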
• ETL
• Marketing channel data
• Behavioral variables
• Promotional data
• Overlay data
• Exploratory data analysis
• Time-to-event models
• GAM survival models
• Scoring for inference
• Scoring for prediction
• 5 billion scores per day per retailer
CUSTOM DATA FORMAT
CUSTOM VARIABLES (PMML)
R IN THE CLOUD
• Exposing the expertise of data scientists as APIs
• Bringing the utility of data science to applications
• Addressing the Data Science talent gap
Azure: Huge infrastructure scale
19 Regions ONLINE…huge datacenter capacity around the world…and we’re growing
• 100+ datacenters
• One of the top 3 networks in the world (coverage, speed, connections)
• 2x AWS's and 6x Google's number of offered regions
• G Series – Largest VM available in the market – 32 cores, 448 GB RAM, SSD…
Map legend: Operational, Announced
• Central US: Iowa
• West US: California
• North Europe: Ireland
• East US: Virginia
• East US 2: Virginia
• US Gov: Virginia
• North Central US: Illinois
• US Gov: Iowa
• South Central US: Texas
• Brazil South: Sao Paulo
• West Europe: Netherlands
• China North *: Beijing
• China South *: Shanghai
• Japan East: Saitama
• Japan West: Osaka
• India West: TBD
• India East: TBD
• East Asia: Hong Kong
• SE Asia: Singapore
• Australia West: Melbourne
• Australia East: Sydney
* Operated by 21Vianet
http://blog.revolutionanalytics.com/2015/06/r-build-keynote.html/
WHAT’S COMING FOR R AT MICROSOFT
Data Scientist
Interact directly with data
Built-in to SQL Server
Data Developer/DBA
Manage data and analytics together
SQL Server 2016
Built-in in-database analytics
Example Solutions
• Fraud detection
• Sales forecasting
• Warehouse efficiency
• Predictive maintenance
Relational Data
Analytic Library
T-SQL Interface
Extensibility
R Integration
Microsoft Azure
Machine Learning Marketplace
New R scripts
Chart (axes: rows, minutes) comparing R on a server pulling data via SQL with R on a server invoking RRE ScaleR inside the EDW
Thank you
Download Revolution R Open:
mran.revolutionanalytics.com
More at:
blog.revolutionanalytics.com
David Smith
R Community Lead
Revolution Analytics
@revodavid
davidsmi@microsoft.com
More at deployr.revolutionanalytics.com
R at Microsoft

 
Reproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceReproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceRevolution Analytics
 
Reproducibility with Revolution R Open
Reproducibility with Revolution R OpenReproducibility with Revolution R Open
Reproducibility with Revolution R OpenRevolution Analytics
 

Mehr von Revolution Analytics (14)

Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the Cloud
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to Azure
 
R in Minecraft
R in Minecraft R in Minecraft
R in Minecraft
 
The case for R for AI developers
The case for R for AI developersThe case for R for AI developers
The case for R for AI developers
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
R Then and Now
R Then and NowR Then and Now
R Then and Now
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductor
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 final
 
Warranty Predictive Analytics solution
Warranty Predictive Analytics solutionWarranty Predictive Analytics solution
Warranty Predictive Analytics solution
 
Reproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceReproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R Conference
 
Reproducibility with Revolution R Open
Reproducibility with Revolution R OpenReproducibility with Revolution R Open
Reproducibility with Revolution R Open
 
R and Data Science
R and Data ScienceR and Data Science
R and Data Science
 

Kürzlich hochgeladen

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Kürzlich hochgeladen (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

R at Microsoft

  • 2.
      • Introduction to R
      • Applications of R at Microsoft
      • R Products at Microsoft
      • What’s coming for R at Microsoft
      • Q&A
  • 3. April 6, 2015: “This acquisition will help customers use advanced analytics within Microsoft data platforms.”
  • 5.
      • Most widely used data analysis software
      • Most powerful statistical programming language
      • Create beautiful and unique data visualizations
      • Thriving open-source community
      • Fills the talent gap
      • www.revolutionanalytics.com/what-is-r
  • 7.
      • 1993: Research project in Auckland, NZ
      • 1995: Released as open-source software
      • 1997: R core group formed
      • 2000: R 1.0.0 released
      • 2003: R Foundation formed in Austria
      • 2004: First international user conference
      • 2007: Revolution Analytics founded
      • 2009: New York Times article on R
      • 2013: Revolution R Open released
      • 2015: Microsoft acquires Revolution Analytics
      • Photo credit: Robert Gentleman
  • 8.
      • R Usage Growth: Rexer Data Miner Survey, 2007-2013
      • Language Popularity: IEEE Spectrum Top Programming Languages, July 2014 (#9: R)
      • blog.revolutionanalytics.com/popularity
  • 9. New York Times, June 25 2009 (3 hours after Michael Jackson’s death)
  • 11.
      • Traditional BI: What happened? Why did it happen?
      • Advanced Analytics: What will happen? How can we make it happen?
  • 12.
      • System monitoring & alerting
      • Capacity Planning
  • 13.
      • TrueSkill Matchmaking System
      • Player Churn
      • Game design
      • In-game purchase optimization
      • Fraud detection
      • Player communities
  • 15.
      • Enhanced Open Source R distribution
      • Compatible with all R-related software
      • Multi-threaded for performance
      • Focus on reproducibility
      • Open source (GPLv2 license)
      • Available for Windows, Mac OS X, Ubuntu, Red Hat and OpenSUSE
      • Download from mran.revolutionanalytics.com
  • 16.
      • Built on latest R engine
      • 100% compatible with
      • Designed to work with RStudio
  • 17.
      • Multithreaded library replaces standard BLAS/LAPACK algorithms
      • High-performance algorithms
      • Sequential → Parallel
      • No need to change any R code
      • Included with RRO binary distributions
      • More at Revolutions blog
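As a rough illustration of the "no need to change any R code" point, the sketch below times an ordinary matrix product, a cross product, and a linear solve in base R. Nothing in it is specific to RRO: the matrix size `n` and the seed are arbitrary assumptions, and the timings depend entirely on which BLAS/LAPACK library the R build is linked against.

```r
# Plain base-R linear algebra; the same script runs unchanged on any R build.
set.seed(42)                      # arbitrary seed, for repeatable inputs
n <- 500                          # assumed size; increase it to see larger gaps
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

# %*% and crossprod() are handed to whatever BLAS the R build links to.
t_mult  <- system.time(prod_ab <- a %*% b)
t_cross <- system.time(ata <- crossprod(a))

# solve() goes through LAPACK.
t_solve <- system.time(x <- solve(a, b))

# Elapsed wall-clock seconds for each operation.
print(c(multiply  = t_mult[["elapsed"]],
        crossprod = t_cross[["elapsed"]],
        solve     = t_solve[["elapsed"]]))
```

Running the same file under a single-threaded and a multithreaded build is one way to see the effect; the script itself stays the same.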
  • 19.
      • Static CRAN mirror
      • Daily CRAN snapshots: mran.revolutionanalytics.com/snapshot
      • Easily write and share scripts synced to a specific snapshot
      • Diagram labels: CRAN; CRAN mirror (http://cran.revolutionanalytics.com/); checkpoint server (midnight UTC); daily snapshots (http://mran.revolutionanalytics.com/snapshot/); checkpoint package: library(checkpoint), checkpoint("2014-09-17")
  • 20.
      • Easy to use: add 2 lines to the top of each script
      • For the package author:
      • For a script collaborator:
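The two lines in question are the ones shown on slide 19. A minimal sketch of a script header might look like the following. The snapshot date is the one from the slide, and the analysis lines under the header are placeholders. The sketch assumes the checkpoint package is installed and that the snapshot server it contacts is still reachable, which may no longer be the case.

```r
# Line 1: load the checkpoint package (assumed to be installed already).
library(checkpoint)
# Line 2: pin package installs and loads to the CRAN snapshot of this date.
checkpoint("2014-09-17")

# Placeholder analysis code: packages used from here on are resolved
# against the snapshot above rather than against today's CRAN.
library(stats)
summary(lm(dist ~ speed, data = cars))
```

A collaborator who runs the same script with the same date should end up with the same package versions, which is the reproducibility argument the slides make.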
  • 21.
      • Download Revolution R Open
      • Learn about R and RRO
      • Daily CRAN snapshots
      • Explore Packages
      • Explore Task Views
  • 24.
      • Toolkits for data scientists and numerical analysts to create custom parallel and distributed algorithms
      • Mainly useful for “embarrassingly parallel” problems, where parallel components work with small amounts of data
      • Big Data Predictive Analytics mostly not embarrassingly parallel
      • Details at projects.revolutionanalytics.com
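To make "embarrassingly parallel" concrete, here is a small sketch that uses base R's parallel package in place of the toolkits the slide refers to, which are not named here. Each worker runs an independent Monte Carlo estimate of pi from nothing more than a seed, and the results are simply averaged. The function name `estimate_pi`, the worker count, and the sample sizes are illustrative assumptions.

```r
library(parallel)

# Hypothetical task: one independent Monte Carlo estimate of pi.
# Each call needs only a seed and a sample size.
estimate_pi <- function(seed, n = 1e5) {
  set.seed(seed)
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)   # share of points inside the quarter circle, times 4
}

seeds <- 1:8                      # eight independent tasks (assumed)
cl <- makeCluster(2)              # two local worker processes (assumed)
estimates <- parLapply(cl, seeds, estimate_pi)
stopCluster(cl)

# Tasks never communicate, so combining the results is a plain average.
pi_hat <- mean(unlist(estimates))
print(pi_hat)
```

The pattern works because each task carries almost no data. A model fit over one large shared dataset does not split this cleanly, which is the slide's last point.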
  • 25. … is the only big data big analytics platform based on open source R, the de facto statistical computing language for modern analytics
  • 27.
      • Data Step: Data import – Delimited, Fixed, SAS, SPSS, ODBC; Variable creation & transformation; Recode variables; Factor variables; Missing value handling; Sort, Merge, Split; Aggregate by category (means, sums)
      • Descriptive Statistics: Min / Max, Mean, Median (approx.); Quantiles (approx.); Standard Deviation; Variance; Correlation; Covariance; Sum of Squares (cross product matrix for set variables); Pairwise Cross tabs; Risk Ratio & Odds Ratio; Cross-Tabulation of Data (standard tables & long form); Marginal Summaries of Cross Tabulations
      • Statistical Tests: Chi Square Test; Kendall Rank Correlation; Fisher’s Exact Test; Student’s t-Test
      • Sampling: Subsample (observations & variables); Random Sampling
      • Predictive Models: Sum of Squares (cross product matrix for set variables); Multiple Linear Regression; Generalized Linear Models (GLM) with exponential family distributions (binomial, Gaussian, inverse Gaussian, Poisson, Tweedie), standard link functions (cauchit, identity, log, logit, probit), and user-defined distributions & link functions; Covariance & Correlation Matrices; Logistic Regression; Classification & Regression Trees; Predictions/scoring for models; Residuals for all models
      • Cluster Analysis: K-Means
      • Classification: Decision Trees; Decision Forests; Gradient Boosted Decision Trees; Naïve Bayes
      • Simulation: Simulation (e.g. Monte Carlo); Parallel Random Number Generation
      • Variable Selection: Stepwise Regression
      • Combination: PEMA-R API; rxDataStep; rxExec
      • Legend: New in v7.3; Coming in v7.4
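Many of the statistics listed on this slide can be built from sums accumulated one block of rows at a time, which is what makes this style of algorithm workable when the data is larger than memory. The sketch below illustrates the idea in base R for the cross-product matrix and a linear regression derived from it. It is not the ScaleR implementation; `chunk_size` and the simulated data are assumptions made for the example.

```r
set.seed(1)
n <- 10000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n)

chunk_size <- 1000                      # assumed block size
starts <- seq(1, n, by = chunk_size)

xtx <- matrix(0, 3, 3)                  # running sum of X'X
xty <- matrix(0, 3, 1)                  # running sum of X'y
for (s in starts) {
  rows <- s:min(s + chunk_size - 1, n)
  # Only this block of rows is turned into a model matrix at any one time.
  x <- cbind(1, as.matrix(dat[rows, c("x1", "x2")]))
  xtx <- xtx + crossprod(x)
  xty <- xty + crossprod(x, dat$y[rows])
}

# Solve the normal equations from the accumulated cross products.
beta_chunked <- drop(solve(xtx, xty))
beta_lm <- unname(coef(lm(y ~ x1 + x2, data = dat)))
print(rbind(chunked = beta_chunked, lm = beta_lm))
```

The two rows of the printed matrix should agree to within rounding error, since both solve the same least-squares problem.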
  • 29.
      • ETL
      • Marketing channel data
      • Behavioral variables
      • Promotional data
      • Overlay data
      • Exploratory data analysis
      • Time-to-event models
      • GAM survival models
      • Scoring for inference
      • Scoring for prediction
      • 5 billion scores per day per retailer
      • Diagram labels: CUSTOM DATA FORMAT; CUSTOM VARIABLES (PMML)
  • 30. R IN THE CLOUD
  • 31.
      • Exposing the expertise of data scientists as APIs
      • Bringing the utility of data science to applications
      • Addressing the Data Science talent gap
  • 32. Azure: Huge infrastructure scale
      • 19 regions online: huge datacenter capacity around the world, and we’re growing
      • 100+ datacenters
      • One of the top 3 networks in the world (coverage, speed, connections)
      • 2x AWS and 6x Google number of offered regions
      • G Series: largest VM available in the market (32 cores, 448 GB RAM, SSD…)
      • Regions on the map (legend: Operational, Announced): Central US (Iowa), West US (California), North Europe (Ireland), East US (Virginia), East US 2 (Virginia), US Gov Virginia, North Central US (Illinois), US Gov Iowa, South Central US (Texas), Brazil South (Sao Paulo), West Europe (Netherlands), China North* (Beijing), China South* (Shanghai), Japan East (Saitama), Japan West (Osaka), India West (TBD), India East (TBD), East Asia (Hong Kong), SE Asia (Singapore), Australia West (Melbourne), Australia East (Sydney)
      • * Operated by 21Vianet
  • 39. SQL Server 2016: Built-in in-database analytics
      • Data Scientist: Interact directly with data
      • Data Developer/DBA: Manage data and analytics together
      • Built-in to SQL Server
      • Example Solutions: Fraud detection, Sales forecasting, Warehouse efficiency, Predictive maintenance
      • Diagram labels: Relational Data, Analytic Library, T-SQL Interface, Extensibility, R Integration, New R scripts, Microsoft Azure Machine Learning Marketplace
  • 40. Chart (axes: rows, minutes) comparing R on a server pulling data via SQL with R on a server invoking RRE ScaleR inside the EDW
  • 42. Thank you
      • Download Revolution R Open: mran.revolutionanalytics.com
      • More at: blog.revolutionanalytics.com
      • David Smith, R Community Lead, Revolution Analytics
      • @revodavid, davidsmi@microsoft.com

Editor's notes

  1. Xbox: http://blog.revolutionanalytics.com/2014/05/microsoft-uses-r-for-xbox-matchmaking.html Other gaming: http://blog.revolutionanalytics.com/2013/06/how-big-data-and-statistical-modeling-are-changing-video-games.html
  2. Infinite scale, inexpensively. Tons of data from which you actually have to get value. Customers that have a very high expectation of service and connection (Pier 1 is a great example). An influx of new talent to fill a very big gap, which McKinsey says is 300 thousand in the US alone. But the market this new talent is entering is still filled with barriers.
  3. Enterprise readiness; performance architecture; Big Data analytics; data source integration; development tools; deployment tools.
  4. Demographics: consumer, product, market. Actions: web clicks, email clicks, mobile app usage, call center logs, social, search … Outcomes: impressions, touches, orders (retail, online, mobile). Strategic allocation.
  5. Outcome is “buying” instead of “dying”
  6. Over the last few years we’ve truly delivered a huge infrastructure to enable us to grow our services at scale around the globe. Whether it’s our flagship facilities in Quincy, Washington or Boydton, Virginia, or some of the newly announced facilities in Shanghai, Australia and Brazil, it really is key for us to make smart investments around the world to deliver services in a resilient and reliable fashion. A lot of people ask what goes into site selection at Microsoft and how we decide where to place our datacenter investments. There are over thirty-five factors in our site selection criteria, but the top elements are proximity to customers and energy and fiber infrastructure, ensuring that we have the capacity and the growth platforms to be able to grow our services. Another key element is a skilled workforce. We need to ensure that we have the right people to run and operate our datacenters on a day-to-day basis.
  7. Work done in conjunction with a major Teradata user and household name in Silicon Valley. The chart shows the results of moving R algorithm execution inside the Teradata EDW, achieving combined benefits from scaling computation and slashing data movement.