CAN WE AUTOMATE
PREDICTIVE ANALYTICS?
Thomas W. Dinsmore
OPEN DATA SCIENCE CONFERENCE
BOSTON 2015
@opendatasci
Can we automate predictive analytics?
• Buzz about automation
• Degrees of automation
• Some history
• Where we are today
• The last mile
• The impact of automation
When will most expert level data scientist tasks…be automated?
[Chart: responses for Now, Future, Never (values as extracted: 19%, 76%, 5%), with a breakdown by years 1-2, 2-5, 5-10, 10-20, 20-50, >50 (values as extracted: 6%, 8%, 16%, 28%, 14%, 4%)]
Source: kdnuggets.com
“We are convinced the machine can do better than human anesthesiologists”
– Mark Ansermino, Director of Pediatric Anesthesia, University of British Columbia
Levels of Autonomy
• Level 0: Driver completely controls
• Level 1: Individual controls automated
• Level 2: At least two controls automated together
• Level 3: Driver can cede control under certain conditions
• Level 4: Vehicle controls all functions for the entire trip
National Highway Traffic Safety Administration
1995: Unica PRW
• Optimized neural network specification
• 1998: branded as Model One
• Automated model selection
• Now called IBM PredictiveInsight (Enterprise Marketing Management)
Late 1990s: MarketSwitch
• “Fire your SAS programmers!”
• “Russian rocket scientists”
• Bought by Experian
• Automation replaced by services
Late 1990s: KXEN
• Structural risk minimization for model selection
• Original release: rudimentary UI
• Repositioned as easy-to-use tool for marketers
• SAP purchased for $40 million in 2013
SAS and SPSS
SAS Rapid Modeler
• Add-in to SAS Enterprise Miner
• Macros for outlier ID, missing value treatment, variable selection and model selection
• User specifies data set, response measure and depth of search
SPSS Modeler
• Automated data prep features handle missing value treatment, outlier ID, date/time prep, binning, etc.
• Auto Classifier, Auto Numeric and Auto Cluster handle model selection across defined search plan
Open Source
caret
• R package
• Suite of tools to automate model selection
• Includes preprocessing tools for tasks like dummy coding and feature selection
• Supports 40+ R packages, ~200 techniques
MLBase
• Joint project of AMPLab and Brown DMRG
• Develop scalable machine learning platform on Spark
• ML Optimizer translates user spec into a test plan
• Currently in development (alpha release postponed from 2014)
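The idea of translating a user spec into a test plan can be sketched in a few lines of Python. This is only an illustration of the general idea, not the MLBase ML Optimizer or any caret API: the spec layout, the function name build_test_plan, and the technique and parameter names are all assumptions made for the example.

```python
from itertools import product


def build_test_plan(spec):
    """Expand a user spec into a list of concrete experiments (one dict each)."""
    plan = []
    for technique, grid in spec["techniques"].items():
        names = sorted(grid)  # fixed parameter order keeps the plan reproducible
        # One experiment per combination of the listed parameter values.
        for values in product(*(grid[name] for name in names)):
            plan.append({
                "response": spec["response"],
                "technique": technique,
                "params": dict(zip(names, values)),
            })
    return plan


# Hypothetical spec: keys, technique names and parameters are illustrative only.
spec = {
    "response": "churned",
    "techniques": {
        "logistic_regression": {"penalty": [0.1, 1.0]},
        "decision_tree": {"max_depth": [3, 5], "min_leaf": [10, 50]},
    },
}

plan = build_test_plan(spec)
for step in plan:
    print(step)
```

Each entry in the plan is an independent experiment, so a platform could hand the entries to separate workers and compare the results afterwards.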
Startups
DataRobot
• Builds smart test plans
• Seeded with library of Kaggle-winning techniques
• Users can add or extend techniques with R or Python
• Leverages clusters to quickly run large-scale experiments
• User controls depth of automation
• Designed for rapid model deployment and integration
Levels of Autonomy
• Level 0: Analyst completely controls
• Level 1: Individual features automated
• Level 2: At least two features automated together
• Level 3: Analyst can cede control under certain conditions
• Level 4: Platform controls all functions end to end
Predictive Analytics Platforms
Level 4 Automated Analytics
Model Scoring
• Predictive models developed offline
• Models uploaded through PMML
• Scoring built into an automated process
Unsupervised Learning
• Anomaly detection
• Social networks
• Topic modeling or taste profiles for personalization
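As a rough illustration of why anomaly detection can run without an analyst in the loop, the sketch below flags outliers with a modified z-score based on the median absolute deviation (MAD). The slide does not name a method; the MAD rule, the 3.5 threshold, the function name mad_anomalies and the sample data are assumptions chosen for the example.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median: this rule cannot flag anything
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are roughly normal.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Made-up daily order counts with one obvious spike.
daily_orders = [102, 98, 101, 97, 103, 99, 100, 250, 96, 104]
print(mad_anomalies(daily_orders))
```

The rule needs no labels and no model fitting beyond two medians, which is what makes this kind of check easy to embed in an automated process.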
“Data science is 1% science and 99% data.”
Data sources are complex and diverse
Enterprise data:
It’s still a mess.
For good results, analytic methods require specific transformations
• Logistic Regression: dummy code categorical predictors
• Naive Bayes Classifier: bin numeric predictors
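The two transformations named on this slide can be sketched in plain Python. This is a minimal illustration rather than any product's implementation: the helper names dummy_code and bin_numeric, the choice to drop the first level as the reference category, and the sample columns are assumptions.

```python
def dummy_code(values):
    """One-hot encode a categorical column, dropping the first level as the reference."""
    levels = sorted(set(values))
    kept = levels[1:]  # the reference level is represented by all zeros
    return [{f"is_{level}": int(v == level) for level in kept} for v in values]


def bin_numeric(values, edges, labels):
    """Assign each number to the first bin whose upper edge it does not exceed."""
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    binned = []
    for v in values:
        for edge, label in zip(edges, labels):
            if v <= edge:
                binned.append(label)
                break
        else:
            binned.append(labels[-1])  # value lies above the last edge
    return binned


# Made-up columns for illustration.
regions = ["east", "west", "north", "east"]
ages = [23, 41, 67, 35]

print(dummy_code(regions))
print(bin_numeric(ages, edges=[30, 50], labels=["young", "middle", "older"]))
```

Dummy coding gives a regression numeric inputs it can weight, while binning gives a count-based classifier discrete values it can tabulate; which transformation helps depends on the method that consumes the data.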
We can pre-build data source connections
Work with data “as is”
Conventional Wisdom
• For good results, make the data perfect, e.g.:
  • Find and remove anomalies
  • Replace missing data
• Consumes time, but worth it
The Right Way
• Investigate and act on anomalies, but do not remove them
• Use techniques that can handle missing data
• Your predictive model has to work with dirty data; you should too
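One possible reading of working with data “as is” is to keep every record and add indicator fields, so that anomalies and gaps stay visible to the analyst and to the model. The sketch below assumes that reading; the function name flag_as_is, the field names and the expected range are hypothetical.

```python
def flag_as_is(rows, field, low, high):
    """Keep every row; add indicator fields instead of removing or imputing."""
    flagged = []
    for row in rows:
        value = row.get(field)
        out = dict(row)  # copy so the input rows are not mutated
        out[f"{field}_missing"] = int(value is None)
        out[f"{field}_anomaly"] = int(value is not None and not (low <= value <= high))
        flagged.append(out)
    return flagged


# Made-up records for illustration.
records = [
    {"id": 1, "income": 52000},
    {"id": 2, "income": None},       # missing: kept and flagged
    {"id": 3, "income": 9900000},    # outside the expected range: kept and flagged
]

for row in flag_as_is(records, "income", low=0, high=500000):
    print(row)
```

No row is dropped and no value is overwritten, so the same flags can be computed on the dirty data the model will see at scoring time.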
Bring data transformation into the test and learn cycle
The conventional wisdom: Data Marshaling → Data Cleansing → Data Transformation → { Model Training, repeated } → Model Selection
Test and learn covers only the repeated Model Training step
Test and learn: Data Marshaling → Data Cleansing → { Data Transformation and Model Training, repeated together } → Model Selection
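A small sketch can show what it means to put data transformation inside the test and learn cycle: every (transformation, model) pair is trained and scored on a holdout set, and the transformation is selected along with the model. Everything here is an assumption for illustration, including the two transformations, the two toy models, the function name run_test_and_learn and the made-up data.

```python
import math
from itertools import product


def fit_nearest_mean(xs, ys):
    """Model: predict the class whose training mean is closest to x."""
    means = {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
             for c in set(ys)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))


def fit_majority(xs, ys):
    """Baseline model: always predict the most common training class."""
    top = max(set(ys), key=ys.count)
    return lambda x: top


TRANSFORMS = {"raw": lambda x: x, "log": math.log1p}
MODELS = {"nearest_mean": fit_nearest_mean, "majority": fit_majority}


def run_test_and_learn(train, holdout):
    """Score every (transformation, model) pair on the holdout set, best first."""
    results = []
    for (t_name, t), (m_name, fit) in product(TRANSFORMS.items(), MODELS.items()):
        predict = fit([t(x) for x, _ in train], [y for _, y in train])
        hits = sum(predict(t(x)) == y for x, y in holdout)
        results.append((hits / len(holdout), t_name, m_name))
    return sorted(results, reverse=True)


# Made-up (spend, label) pairs with a heavily skewed spend.
train = [(1, 0), (2, 0), (3, 0), (20, 1), (50, 1), (1000, 1)]
holdout = [(2, 0), (4, 0), (30, 1), (400, 1)]

for accuracy, t_name, m_name in run_test_and_learn(train, holdout):
    print(f"{t_name:>3} + {m_name:<12} accuracy={accuracy:.2f}")
```

On this toy data the log transformation changes which pair scores best, which is the point of the slide: a transformation fixed before the cycle starts is never tested against the alternatives.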
“The doctor will see you now.”
How often are results of your analytics used?
[Chart: scale 0% to 100%; response categories Always, Most of the time, Sometimes, Rarely, Never; values as extracted: 1%, 5%, 28%, 50%, 16%]
Source: 2013 Rexer Data Miners Survey
Why your analysis isn’t used
• You do not understand the client’s business problem
• You do not understand the deployment environment
• The client does not understand your work
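Automation lets data scientists spend more time collaborating, less time crunching
From this: [diagram with large blocks for “Wrangle the data” and “Develop models” and small blocks for “Define the problem” and “Explain your work”]
To this: [diagram with large blocks for “Define the problem” and “Explain your work” and small blocks for “Wrangle the data” and “Develop models”]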
Can we automate predictive analytics?
• Buzz about automation
• Degrees of automation
• Some history
• Where we are today
• The last mile
• The impact of automation
• We already have, almost
• The last mile is a steep challenge
• Automation will not replace data scientists; it will make them more effective
Questions
Thank You
The Big Analytics Blog: www.thomaswdinsmore.com
email: thomaswdinsmore@gmail.com
@thomaswdinsmore
CAN WE AUTOMATE
PREDICTIVE ANALYTICS?
Thomas W. Dinsmore
O P E N
D A T A
S C I E N C E
C O N F E R E N C E_
BOSTON 2015
@opendatasci

Weitere ähnliche Inhalte

Was ist angesagt?

Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsHesen Peng
 
Simple math for anomaly detection toufic boubez - metafor software - monito...
Simple math for anomaly detection   toufic boubez - metafor software - monito...Simple math for anomaly detection   toufic boubez - metafor software - monito...
Simple math for anomaly detection toufic boubez - metafor software - monito...tboubez
 
VSSML17 L2. Ensembles and Logistic Regressions
VSSML17 L2. Ensembles and Logistic RegressionsVSSML17 L2. Ensembles and Logistic Regressions
VSSML17 L2. Ensembles and Logistic RegressionsBigML, Inc
 
DIY market segmentation 20170125
DIY market segmentation 20170125DIY market segmentation 20170125
DIY market segmentation 20170125Displayr
 
Pca(principal components analysis)
Pca(principal components analysis)Pca(principal components analysis)
Pca(principal components analysis)kalung0313
 
Module 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationModule 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationSara Hooker
 
DIY Driver Analysis Webinar slides
DIY Driver Analysis Webinar slidesDIY Driver Analysis Webinar slides
DIY Driver Analysis Webinar slidesDisplayr
 
Module 1 introduction to machine learning
Module 1  introduction to machine learningModule 1  introduction to machine learning
Module 1 introduction to machine learningSara Hooker
 
Implement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratchImplement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratchEshanAgarwal4
 
R - what do the numbers mean? #RStats
R - what do the numbers mean? #RStatsR - what do the numbers mean? #RStats
R - what do the numbers mean? #RStatsJen Stirrup
 
MLSEV Virtual. State of the Art in ML
MLSEV Virtual. State of the Art in MLMLSEV Virtual. State of the Art in ML
MLSEV Virtual. State of the Art in MLBigML, Inc
 
Prediction of House Sales Price
Prediction of House Sales PricePrediction of House Sales Price
Prediction of House Sales PriceAnirvan Ghosh
 
Module 5: Decision Trees
Module 5: Decision TreesModule 5: Decision Trees
Module 5: Decision TreesSara Hooker
 
Slides for automate or die (presentation)
Slides for automate or die (presentation)Slides for automate or die (presentation)
Slides for automate or die (presentation)Displayr
 
Module 1.2 data preparation
Module 1.2  data preparationModule 1.2  data preparation
Module 1.2 data preparationSara Hooker
 
Module 3: Linear Regression
Module 3:  Linear RegressionModule 3:  Linear Regression
Module 3: Linear RegressionSara Hooker
 

Was ist angesagt? (20)

L4. Ensembles of Decision Trees
L4. Ensembles of Decision TreesL4. Ensembles of Decision Trees
L4. Ensembles of Decision Trees
 
Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actions
 
Simple math for anomaly detection toufic boubez - metafor software - monito...
Simple math for anomaly detection   toufic boubez - metafor software - monito...Simple math for anomaly detection   toufic boubez - metafor software - monito...
Simple math for anomaly detection toufic boubez - metafor software - monito...
 
VSSML17 L2. Ensembles and Logistic Regressions
VSSML17 L2. Ensembles and Logistic RegressionsVSSML17 L2. Ensembles and Logistic Regressions
VSSML17 L2. Ensembles and Logistic Regressions
 
DIY market segmentation 20170125
DIY market segmentation 20170125DIY market segmentation 20170125
DIY market segmentation 20170125
 
Pca(principal components analysis)
Pca(principal components analysis)Pca(principal components analysis)
Pca(principal components analysis)
 
Module 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationModule 4: Model Selection and Evaluation
Module 4: Model Selection and Evaluation
 
Explore ml day 2
Explore ml day 2Explore ml day 2
Explore ml day 2
 
DIY Driver Analysis Webinar slides
DIY Driver Analysis Webinar slidesDIY Driver Analysis Webinar slides
DIY Driver Analysis Webinar slides
 
Module 1 introduction to machine learning
Module 1  introduction to machine learningModule 1  introduction to machine learning
Module 1 introduction to machine learning
 
Implement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratchImplement principal component analysis (PCA) in python from scratch
Implement principal component analysis (PCA) in python from scratch
 
R - what do the numbers mean? #RStats
R - what do the numbers mean? #RStatsR - what do the numbers mean? #RStats
R - what do the numbers mean? #RStats
 
Borderline Smote
Borderline SmoteBorderline Smote
Borderline Smote
 
MLSEV Virtual. State of the Art in ML
MLSEV Virtual. State of the Art in MLMLSEV Virtual. State of the Art in ML
MLSEV Virtual. State of the Art in ML
 
Prediction of House Sales Price
Prediction of House Sales PricePrediction of House Sales Price
Prediction of House Sales Price
 
Module 5: Decision Trees
Module 5: Decision TreesModule 5: Decision Trees
Module 5: Decision Trees
 
Slides for automate or die (presentation)
Slides for automate or die (presentation)Slides for automate or die (presentation)
Slides for automate or die (presentation)
 
Module 1.2 data preparation
Module 1.2  data preparationModule 1.2  data preparation
Module 1.2 data preparation
 
Explore ML day 1
Explore ML day 1Explore ML day 1
Explore ML day 1
 
Module 3: Linear Regression
Module 3:  Linear RegressionModule 3:  Linear Regression
Module 3: Linear Regression
 

Andere mochten auch

Slideburst #7 - Next Best Action in All Digital Channels
Slideburst #7 - Next Best Action in All Digital ChannelsSlideburst #7 - Next Best Action in All Digital Channels
Slideburst #7 - Next Best Action in All Digital ChannelsPatrik Svensson
 
Big Data and the Next Best Offer
Big Data and the Next Best OfferBig Data and the Next Best Offer
Big Data and the Next Best OfferMichel Bruley
 
Beginning to Spatial Data in SQL Server 2008
Beginning to Spatial Data in SQL Server 2008Beginning to Spatial Data in SQL Server 2008
Beginning to Spatial Data in SQL Server 2008Tobias Koprowski
 
PLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLA
PLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLAPLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLA
PLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLATobias Koprowski
 
Scott Bennett - Shell Game - Whistleblowing Report
Scott Bennett - Shell Game - Whistleblowing ReportScott Bennett - Shell Game - Whistleblowing Report
Scott Bennett - Shell Game - Whistleblowing ReportExopolitics Hungary
 
KoprowskiT_SQLRelayBirmingham_SQLSecurityInTheClouds
KoprowskiT_SQLRelayBirmingham_SQLSecurityInTheCloudsKoprowskiT_SQLRelayBirmingham_SQLSecurityInTheClouds
KoprowskiT_SQLRelayBirmingham_SQLSecurityInTheCloudsTobias Koprowski
 
Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...
Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...
Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...Tomasz Kopacz
 
KoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginnersKoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginnersTobias Koprowski
 
KoprowskiT_SQLAzureLandingInBelfast
KoprowskiT_SQLAzureLandingInBelfastKoprowskiT_SQLAzureLandingInBelfast
KoprowskiT_SQLAzureLandingInBelfastTobias Koprowski
 
KoprowskiT_SQLSatMoscow_2AMaDisaterJustBegan
KoprowskiT_SQLSatMoscow_2AMaDisaterJustBeganKoprowskiT_SQLSatMoscow_2AMaDisaterJustBegan
KoprowskiT_SQLSatMoscow_2AMaDisaterJustBeganTobias Koprowski
 
Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Tobias Koprowski
 
KoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBegan
KoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBeganKoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBegan
KoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBeganTobias Koprowski
 
Wysoka Dostępność SQL Server 2008 w kontekscie umów SLA
Wysoka Dostępność SQL Server 2008 w kontekscie umów SLAWysoka Dostępność SQL Server 2008 w kontekscie umów SLA
Wysoka Dostępność SQL Server 2008 w kontekscie umów SLATobias Koprowski
 
Eventuosity For Event Producers and Service Providers
Eventuosity For Event Producers and Service ProvidersEventuosity For Event Producers and Service Providers
Eventuosity For Event Producers and Service ProvidersJustin Panzer
 
Презентация стратегической игры MatriX Urban
Презентация стратегической игры MatriX UrbanПрезентация стратегической игры MatriX Urban
Презентация стратегической игры MatriX UrbanАндрей Донских
 
Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...
Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...
Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...Tobias Koprowski
 
Cabs, Cassandra, and Hailo
Cabs, Cassandra, and HailoCabs, Cassandra, and Hailo
Cabs, Cassandra, and HailoDave Gardner
 

Andere mochten auch (19)

Das Next Best Offer-Konzept
Das Next Best Offer-KonzeptDas Next Best Offer-Konzept
Das Next Best Offer-Konzept
 
Slideburst #7 - Next Best Action in All Digital Channels
Slideburst #7 - Next Best Action in All Digital ChannelsSlideburst #7 - Next Best Action in All Digital Channels
Slideburst #7 - Next Best Action in All Digital Channels
 
Big Data and the Next Best Offer
Big Data and the Next Best OfferBig Data and the Next Best Offer
Big Data and the Next Best Offer
 
Beginning to Spatial Data in SQL Server 2008
Beginning to Spatial Data in SQL Server 2008Beginning to Spatial Data in SQL Server 2008
Beginning to Spatial Data in SQL Server 2008
 
State of Nation - Feb 2017
State of Nation - Feb 2017State of Nation - Feb 2017
State of Nation - Feb 2017
 
PLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLA
PLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLAPLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLA
PLSSUG Meeting - Wysoka dostepność SQL Server 2008 w kontekscie umów SLA
 
Scott Bennett - Shell Game - Whistleblowing Report
Scott Bennett - Shell Game - Whistleblowing ReportScott Bennett - Shell Game - Whistleblowing Report
Scott Bennett - Shell Game - Whistleblowing Report
 
KoprowskiT_SQLRelayBirmingham_SQLSecurityInTheClouds
KoprowskiT_SQLRelayBirmingham_SQLSecurityInTheCloudsKoprowskiT_SQLRelayBirmingham_SQLSecurityInTheClouds
KoprowskiT_SQLRelayBirmingham_SQLSecurityInTheClouds
 
Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...
Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...
Tomasz Kopacz MTS 2012 Azure - Co i kiedy użyć (IaaS vs paas vshybrid cloud v...
 
KoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginnersKoprowskiT_SQLSatMoscow_WASDforBeginners
KoprowskiT_SQLSatMoscow_WASDforBeginners
 
KoprowskiT_SQLAzureLandingInBelfast
KoprowskiT_SQLAzureLandingInBelfastKoprowskiT_SQLAzureLandingInBelfast
KoprowskiT_SQLAzureLandingInBelfast
 
KoprowskiT_SQLSatMoscow_2AMaDisaterJustBegan
KoprowskiT_SQLSatMoscow_2AMaDisaterJustBeganKoprowskiT_SQLSatMoscow_2AMaDisaterJustBegan
KoprowskiT_SQLSatMoscow_2AMaDisaterJustBegan
 
Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008
 
KoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBegan
KoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBeganKoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBegan
KoprowskiT_PASSEastMidsFEB16_2AMaDisasterJustBegan
 
Wysoka Dostępność SQL Server 2008 w kontekscie umów SLA
Wysoka Dostępność SQL Server 2008 w kontekscie umów SLAWysoka Dostępność SQL Server 2008 w kontekscie umów SLA
Wysoka Dostępność SQL Server 2008 w kontekscie umów SLA
 
Eventuosity For Event Producers and Service Providers
Eventuosity For Event Producers and Service ProvidersEventuosity For Event Producers and Service Providers
Eventuosity For Event Producers and Service Providers
 
Презентация стратегической игры MatriX Urban
Презентация стратегической игры MatriX UrbanПрезентация стратегической игры MatriX Urban
Презентация стратегической игры MatriX Urban
 
Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...
Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...
Virtual Study Beta Exam 71-663 Exchange 2010 Designing And Deploying Messagin...
 
Cabs, Cassandra, and Hailo
Cabs, Cassandra, and HailoCabs, Cassandra, and Hailo
Cabs, Cassandra, and Hailo
 

Ähnlich wie Can We Automate Predictive Analytics

From Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DMFrom Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DMMichał Łopuszyński
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine LearningRandy Shoup
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Roger Barga
 
POWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantPOWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantLynne Thomas
 
Correlation does not mean causation
Correlation does not mean causationCorrelation does not mean causation
Correlation does not mean causationPeter Varhol
 
The Myths + Realities of Machine-Learning Cybersecurity
The Myths + Realities of Machine-Learning CybersecurityThe Myths + Realities of Machine-Learning Cybersecurity
The Myths + Realities of Machine-Learning CybersecurityInterset
 
What is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMWhat is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMProduct School
 
Problem management foundation - Tools
Problem management foundation - ToolsProblem management foundation - Tools
Problem management foundation - ToolsRonald Bartels
 
How to Get the Most Out of Security Tools
How to Get the Most Out of Security ToolsHow to Get the Most Out of Security Tools
How to Get the Most Out of Security ToolsSecurity Innovation
 
Text mining why people need to be part of the process
Text mining   why people need to be part of the processText mining   why people need to be part of the process
Text mining why people need to be part of the processPhilo Janus
 
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)MAHIRA
 
Achieving and Measuring Success with the Security Awareness Maturity Model
Achieving and Measuring Success with  the Security Awareness Maturity ModelAchieving and Measuring Success with  the Security Awareness Maturity Model
Achieving and Measuring Success with the Security Awareness Maturity ModelPriyanka Aash
 
Machine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sectorMachine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sectorRudradeb Mitra
 
What Managers Need to Know about Data Science
What Managers Need to Know about Data ScienceWhat Managers Need to Know about Data Science
What Managers Need to Know about Data ScienceAnnie Flippo
 
The Role of Analytics in Talent Acquisition
The Role of Analytics in Talent AcquisitionThe Role of Analytics in Talent Acquisition
The Role of Analytics in Talent AcquisitionHuman Capital Media
 
Accelerator Innovation Network Event: Session 2
Accelerator Innovation Network Event: Session 2 Accelerator Innovation Network Event: Session 2
Accelerator Innovation Network Event: Session 2 Heather-Fiona Egan
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Ali Alkan
 
Testing Is How You Avoid Looking Stupid
Testing Is How You Avoid Looking StupidTesting Is How You Avoid Looking Stupid
Testing Is How You Avoid Looking StupidSteve Branam
 
Future of data science as a profession
Future of data science as a professionFuture of data science as a profession
Future of data science as a professionJose Quesada
 
Data analytics career path
Data analytics career pathData analytics career path
Data analytics career pathRubikal
 

Ähnlich wie Can We Automate Predictive Analytics (20)

From Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DMFrom Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DM
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine Learning
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
 
POWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantPOWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership Grant
 
Correlation does not mean causation
Correlation does not mean causationCorrelation does not mean causation
Correlation does not mean causation
 
The Myths + Realities of Machine-Learning Cybersecurity
The Myths + Realities of Machine-Learning CybersecurityThe Myths + Realities of Machine-Learning Cybersecurity
The Myths + Realities of Machine-Learning Cybersecurity
 
What is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMWhat is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PM
 
Problem management foundation - Tools
Problem management foundation - ToolsProblem management foundation - Tools
Problem management foundation - Tools
 
How to Get the Most Out of Security Tools
How to Get the Most Out of Security ToolsHow to Get the Most Out of Security Tools
How to Get the Most Out of Security Tools
 
Text mining why people need to be part of the process
Text mining   why people need to be part of the processText mining   why people need to be part of the process
Text mining why people need to be part of the process
 
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
MACHINE LEARNING PRESENTATION (ARTIFICIAL INTELLIGENCE)
 
Achieving and Measuring Success with the Security Awareness Maturity Model
Achieving and Measuring Success with  the Security Awareness Maturity ModelAchieving and Measuring Success with  the Security Awareness Maturity Model
Achieving and Measuring Success with the Security Awareness Maturity Model
 
Machine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sectorMachine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sector
 
What Managers Need to Know about Data Science
What Managers Need to Know about Data ScienceWhat Managers Need to Know about Data Science
What Managers Need to Know about Data Science
 
The Role of Analytics in Talent Acquisition
The Role of Analytics in Talent AcquisitionThe Role of Analytics in Talent Acquisition
The Role of Analytics in Talent Acquisition
 
Accelerator Innovation Network Event: Session 2
Accelerator Innovation Network Event: Session 2 Accelerator Innovation Network Event: Session 2
Accelerator Innovation Network Event: Session 2
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
 
Testing Is How You Avoid Looking Stupid
Testing Is How You Avoid Looking StupidTesting Is How You Avoid Looking Stupid
Testing Is How You Avoid Looking Stupid
 
Future of data science as a profession
Future of data science as a professionFuture of data science as a profession
Future of data science as a profession
 
Data analytics career path
Data analytics career pathData analytics career path
Data analytics career path
 

Mehr von odsc

Understanding the Chief Data Officer
Understanding the Chief Data Officer Understanding the Chief Data Officer
Understanding the Chief Data Officer odsc
 
Machine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge DiscoveryMachine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge Discoveryodsc
 
API Driven Development
API Driven Development API Driven Development
API Driven Development odsc
 
Mobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata AnalysisMobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata Analysisodsc
 
Productionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground UpProductionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground Upodsc
 
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and HiveBig Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hiveodsc
 
Think Breadth, Not Depth
Think Breadth, Not DepthThink Breadth, Not Depth
Think Breadth, Not Depthodsc
 
Data Science at Dow Jones: Monetizing Data, News and Information

Can We Automate Predictive Analytics

  • 1. CAN WE AUTOMATE PREDICTIVE ANALYTICS? Thomas W. Dinsmore. Open Data Science Conference, Boston 2015. @opendatasci
  • 2. Can we automate predictive analytics? • Buzz about automation • Degrees of automation • Some history • Where we are today • The last mile • The impact of automation
  • 5. When will most expert level data scientist tasks…be automated? [Bar charts: responses Now / Future / Never (data labels as extracted: 19%, 76%, 5%); for "Future", years until automation in buckets 1-2, 2-5, 5-10, 10-20, 20-50, >50 (data labels as extracted: 6%, 8%, 16%, 28%, 14%, 4%)] Source: kdnuggets.com
  • 6. “We are convinced the machine can do better than human anesthesiologists” (Mark Ansermino, Director of Pediatric Anesthesia, University of British Columbia)
  • 14. Levels of Autonomy (National Highway Traffic Safety Administration) • Level 0: Driver completely controls • Level 1: Individual controls automated • Level 2: At least two controls automated together • Level 3: Driver can cede control under certain conditions • Level 4: Vehicle controls all functions for the entire trip
  • 15. 1995: Unica PRW • Optimized neural network specification • 1998: branded as Model One • Automated model selection • Now called IBM PredictiveInsight (Enterprise Marketing Management)
  • 16. Late 1990s: MarketSwitch • “Fire your SAS programmers!” • “Russian rocket scientists” • Bought by Experian • Automation replaced by services
  • 17. Late 1990s: KXEN • Structural risk minimization for model selection • Original release: rudimentary UI • Repositioned as easy-to-use tool for marketers • SAP purchased for $40 million in 2013
  • 18. SAS and SPSS. SAS Rapid Modeler: • Add-in to SAS Enterprise Miner • Macros for outlier ID, missing value treatment, variable selection and model selection • User specifies data set, response measure and depth of search. SPSS Modeler: • Automated data prep features handle missing value treatment, outlier ID, date/time prep, binning, etc. • Auto Classifier, Auto Numeric and Auto Cluster handle model selection across defined search plan
  • 19. Open Source. caret: • R package • Suite of tools to automate model selection • Includes preprocessing tools for tasks like dummy coding and feature selection • Supports 40+ R packages, ~200 techniques. MLBase: • Joint project of AMPLab and Brown DMRG • Develop scalable machine learning platform on Spark • ML Optimizer translates user spec into a test plan • Currently in development (alpha release postponed from 2014)
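As a rough illustration of what "automated model selection" means in tools like these, the sketch below scores a set of candidate models by k-fold cross-validation and keeps the one with the lowest error. It is not caret's or MLBase's API. The function names (`fit_mean`, `fit_line`, `cv_mse`, `select_model`) and the toy data are assumptions made up for this example, and it uses only the Python standard library.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate: ordinary least squares with a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def cv_mse(fit, xs, ys, k=5):
    """Mean squared error averaged over k interleaved folds."""
    fold_errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        fold_errors.append(mean((model(xs[i]) - ys[i]) ** 2 for i in test))
    return mean(fold_errors)

def select_model(candidates, xs, ys, k=5):
    """Return (name, score) of the candidate with the lowest CV error."""
    scores = {name: cv_mse(fit, xs, ys, k) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]

# Hypothetical toy data: a linear trend with small alternating noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
best_name, best_score = select_model({"mean": fit_mean, "line": fit_line}, xs, ys)
print(best_name, round(best_score, 3))
```

The "depth of search" a user controls in such tools corresponds here to how many entries the `candidates` dictionary holds and how many folds are run.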
  • 21. DataRobot • Builds smart test plans • Seeded with library of Kaggle-winning techniques • Users can add or extend techniques with R or Python • Leverages clusters to quickly run large-scale experiments • User controls depth of automation • Designed for rapid model deployment and integration
  • 22. Levels of Autonomy: Predictive Analytics Platforms • Level 0: Analyst completely controls • Level 1: Individual features automated • Level 2: At least two features automated together • Level 3: Analyst can cede control under certain conditions • Level 4: Platform controls all functions end to end
  • 23. Level 4 Automated Analytics. Model Scoring: • Predictive models developed offline • Models uploaded through PMML • Scoring built into an automated process. Unsupervised Learning: • Anomaly detection • Social networks • Topic modeling or taste profiles for personalization
  • 24. “Data science is 1% science and 99% data.”
  • 25. Data sources are complex and diverse
  • 27. It’s still a mess.
  • 28. For good results, analytic methods require specific transformations • Logistic Regression: dummy code categorical predictors • Naive Bayes Classifier: bin numeric predictors
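The two transformations named on this slide can be sketched in a few lines. This is a minimal illustration using only the Python standard library. The helper names `dummy_code` and `bin_numeric`, the choice to drop the first level as the reference, and the cut points are assumptions for the example, not anything specified in the talk.

```python
from bisect import bisect_right

def dummy_code(values):
    """One 0/1 indicator per level, dropping the first (sorted) level as the reference."""
    levels = sorted(set(values))
    return [[1 if v == level else 0 for level in levels[1:]] for v in values]

def bin_numeric(values, edges):
    """Map each number to a bin index; `edges` are ascending cut points.

    A value equal to a cut point falls into the higher bin.
    """
    return [bisect_right(edges, v) for v in values]

# Hypothetical inputs for illustration.
colors = ["red", "blue", "green", "blue"]
ages = [18, 35, 70]

coded = dummy_code(colors)          # columns: green, red (blue is the reference)
binned = bin_numeric(ages, [30, 60])
print(coded)
print(binned)
```

Dummy coding gives a regression one coefficient per non-reference level, while binning turns a numeric predictor into the discrete categories that a simple naive Bayes classifier counts over.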
  • 29. We can pre-build data source connections
  • 30. Work with data “as is”. Conventional Wisdom: • For good results, make the data perfect, e.g.: • Find and remove anomalies • Replace missing data • Consumes time, but worth it. The Right Way: • Investigate and act on anomalies, but do not remove them • Use techniques that can handle missing data • Your predictive model has to work with dirty data; you should too
  • 31. Bring data transformation into the test and learn cycle. [Diagram, The Conventional Wisdom: Data Marshaling, Data Cleansing and Data Transformation come first; only Model Training (repeated) and Model Selection sit inside the bracketed Test and Learn loop]
  • 32. Bring data transformation into the test and learn cycle. [Diagram: the same pipeline, with additional Data Transformation boxes placed inside the bracketed Test and Learn loop alongside Model Training and Model Selection]
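One way to read these two diagrams: instead of fixing the transformation before modeling starts, treat it as another option the search can try. The sketch below does this for a single toy model and two candidate transformations, scoring each on held-out rows. Everything here (`TRANSFORMS`, `holdout_mse`, `search`, the even/odd split, the synthetic data) is a hypothetical illustration under those assumptions, written with only the Python standard library.

```python
import math
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares with a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

# Candidate transformations are part of the search space, not a fixed upstream step.
TRANSFORMS = {"identity": lambda x: x, "log": math.log}

def holdout_mse(transform, xs, ys):
    """Transform the predictor, fit on even-indexed rows, score on odd-indexed rows."""
    tx = [transform(x) for x in xs]
    model = fit_line(tx[0::2], ys[0::2])
    return mean((model(x) - y) ** 2 for x, y in zip(tx[1::2], ys[1::2]))

def search(xs, ys):
    """Try every transformation and return (best name, all scores)."""
    scores = {name: holdout_mse(t, xs, ys) for name, t in TRANSFORMS.items()}
    return min(scores, key=scores.get), scores

# Hypothetical data where the response is linear in log(x).
xs = [float(i) for i in range(1, 41)]
ys = [3.0 * math.log(x) + 2.0 for x in xs]
best, scores = search(xs, ys)
print(best)
```

A fuller version would cross every transformation with every candidate model, which is where the cluster-scale experimentation mentioned earlier in the deck becomes relevant.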
  • 33. “The doctor will see you now.”
  • 34. How often are results of your analytics used? [Stacked bar chart, 0% to 100%; response categories: Always, Most of the time, Sometimes, Rarely, Never; data labels as extracted: 1%, 5%, 28%, 50%, 16%] Source: 2013 Rexer Data Miners Survey
  • 35. Why your analysis isn’t used • You do not understand the client’s business problem • You do not understand the deployment environment • The client does not understand your work
  • 36. Automation lets data scientists spend more time collaborating, less time crunching. [Two charts, “From this:” and “To this:”, each dividing time among: Wrangle the data, Define the problem, Explain your work, Develop models]
  • 37. Can we automate predictive analytics? • Buzz about automation • Degrees of automation • Some history • Where we are today • The last mile • The impact of automation. Conclusions: • We already have, almost • The last mile is a steep challenge • Automation will not replace data scientists; it will make them more effective
  • 39. Thank You. Thomas W. Dinsmore. The Big Analytics Blog: www.thomaswdinsmore.com. Email: thomaswdinsmore@gmail.com. @thomaswdinsmore