A Practical Guidance to the Enterprise
Machine Learning Platform Ecosystem
About Us
• Helping great companies become great software companies
• Building software solutions powered by disruptive enterprise software trends
-Machine learning and data science
-Cyber-security
-Enterprise IOT
-Powered by Cloud and Mobile
• Bringing innovation from startups and academic institutions to the enterprise
• Award winning agencies: Inc 500, American Business Awards, International Business Awards
About This Webinar
• Research that brings together big enterprise software trends,
exciting startups and academic research
• Best practices based on real world implementation experience
• No sales pitches
• Cloud vs. On-Premise machine learning
• Cloud machine learning platforms
• Azure machine learning
• AWS machine learning
• Databricks
• Watson developer cloud
• Others…
• On-premise machine learning platforms
• Revolution analytics
• Dato
• Spark MLlib
• TensorFlow
• Others…
Agenda
Enterprise Data Science
“data science”
Modern Machine Learning
• Advances in storage, compute and data science research are
making machine learning part of mainstream technology
platforms
• Big data movement
• Machine learning platforms are optimized with developer-friendly
interfaces
• Platform as a service providers have drastically lowered the
entry point for machine learning applications
• R and Python are leading the charge
Cloud vs. On-Premise
machine learning platforms
Cloud Machine Learning Platforms: Benefits
• Service abstraction layer over the machine learning infrastructure
• Rich visual modeling tools
• Rich monitoring and tracking interfaces
• Combine multiple platforms: R, Python, etc
• Enable programmatic access to ML models
Cloud Machine Learning Platforms: Challenges
• Integration with on-premise data stores
• Extensibility
• Security and privacy
On-Premise Machine Learning Platforms: Benefits
• Control
• Security
• Integration with on-premise data stores
• Integrated with R and Python machine learning frameworks
On-Premise Machine Learning Platforms: Challenges
• Code-based modeling interfaces
• Scalability
• Tightly coupled with Hadoop distributions
• Monitoring and management
• Data quality and curation
Cloud Machine Learning Platforms
• Azure Machine Learning
• AWS machine learning
• Databricks
• Watson developer cloud
The Leaders
Azure Machine Learning
Azure Machine Learning
• Native machine learning capabilities as part of the Azure cloud
• Elastic infrastructure that scales based on the model requirements
• Supports over 30 supervised and unsupervised machine learning
algorithms
• Integration with R and Python machine learning libraries
• Expose machine learning models via programmable interfaces
• Integrated with the Cortana Analytics suite
• Integrated with PowerBI
• Supports both supervised and
unsupervised models
• Integrated with Azure HDInsight
• Large library of models and sample
gallery
• Support for R and Python code
Visual Model Creation
• Visual dashboard to track the
execution of ML models
• Track execution of different steps
within a ML model
• Integrated monitoring experience
with other Azure services
Rich Monitoring and Management Interface
• Expose machine learning models as
Web Services APIs
• Integrate ML Models with Azure API
Gateway
• Retrain and extend models via ML
APIs
Programmatic Access to ML Models
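A model exposed as a web service is called like any other HTTP API. The sketch below only builds such a request and does not send it. The endpoint URL, the API key, the bearer-token header and the `inputs` payload shape are all placeholders chosen for illustration, not the actual Azure Machine Learning contract.

```python
import json
import urllib.request

# Hypothetical endpoint and key: placeholders, not real Azure values.
SCORING_URL = "https://example.invalid/score"
API_KEY = "replace-with-your-key"


def build_scoring_request(rows, url=SCORING_URL, api_key=API_KEY):
    """Build (but do not send) an HTTP POST asking a model service for predictions."""
    body = json.dumps({"inputs": rows}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
        },
        method="POST",
    )


request = build_scoring_request([{"age": 42, "plan": "premium"}])
# Sending is left out on purpose. With a real endpoint it would be:
#   with urllib.request.urlopen(request) as response:
#       predictions = json.load(response)
```

Keeping request construction separate from sending makes the client easy to test without network access.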
AWS Machine Learning
AWS Machine Learning
• Native machine learning service in AWS
• Provide data exploration and visualization tools
• Supports supervised and unsupervised algorithms
• Integrated data transformation models
• APIs for dynamically creating machine learning models
• Programmatic creation of machine
learning models
• Large number of algorithms and recipes
• Data transformation models included in
the language
Sophisticated ML Model Authoring
• Sophisticated monitoring for
evaluating ML models
• Integrated with Amazon CloudWatch
• KPIs that evaluate the efficiency of
ML models
Monitoring ML Model Execution
• Optimized DSL for data
transformation
• Recipes that abstract common
transformations
• Reuse transformation recipes
across ML models
Embedded Data Transformation
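The idea of a reusable transformation recipe can be illustrated in plain Python. This is a minimal sketch of the concept, not the AWS recipe DSL: the step names, field names and model names are invented for the example.

```python
import math


def lowercase(field):
    """Return a step that lowercases one text field."""
    def step(record):
        return {**record, field: record[field].lower()}
    return step


def log_scale(field):
    """Return a step that replaces a non-negative number x with log(1 + x)."""
    def step(record):
        return {**record, field: math.log1p(record[field])}
    return step


def make_recipe(*steps):
    """Chain steps into one reusable transformation."""
    def recipe(record):
        for step in steps:
            record = step(record)
        return record
    return recipe


# One recipe, defined once...
clean_customer = make_recipe(lowercase("country"), log_scale("spend"))

# ...and reused to prepare inputs for two different (hypothetical) models.
churn_input = clean_customer({"country": "US", "spend": 0.0})
upsell_input = clean_customer({"country": "De", "spend": 99.0})
```

Because each step returns a new record instead of mutating its input, the same recipe can be shared across models without side effects.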
Databricks
Databricks Machine Learning
• Scaling Spark machine learning pipelines
• Integrated data visualization tools
• Sophisticated ML monitoring tools
• Combine Python, Scala and R in a single platform
• Implementing machine learning
models using Notebooks
• Publishing notebooks to a
centralized catalog
• Leverage Python, Scala or R to
implement machine learning models
Notebooks Based Authoring
• Integrate data visualization into
machine learning pipelines
• Reuse data visualization
notebooks across applications
• Evaluate the efficiency of
machine learning pipelines using
visualizations
Machine Learning Data Visualization
• Monitor the execution of machine
learning pipelines
• Run machine learning pipelines
manually
• Rapidly modify and deploy machine
learning pipelines
Monitoring and Management
Watson Developer Cloud
• Personality Insights
• Tradeoff Analytics
• Relationship Extraction
• Concept Insights
• Speech to Text
• Text to Speech
• Visual Recognition
• Natural Language Classifier
• Language Identification
• Language Translation
• Question and Answer
• Concept Expansion
• Message Resonance
• AlchemyAPI Services
Large Variety of Cognitive Services
• Access services via REST APIs
• SDKs available for different
languages
• Integration with different
services in the Bluemix
platform
Rich Developer Interfaces
Relationship Extraction Concept Expansion Message Resonance
User Modeling
Complex Algorithms – Simple Interfaces
Other Interesting Platforms
• Microsoft’s Project Oxford https://www.projectoxford.ai/
• BigML https://bigml.com/
On-premise machine
learning platforms
The Leaders
• Revolution Analytics (Microsoft)
• Spark MLlib + SparkR
• Dato
• TensorFlow
• Others: PredictionIO, Scikit-learn…
Revolution Analytics
All of Open Source R plus:
• Big Data scalability
• High-performance analytics
• Development and deployment tools
• Data source connectivity
• Application integration framework
• Multi-platform architecture
• Support, Training and Services
Revolution Analytics (Microsoft)
DistributedR
ScaleR
ConnectR
DeployR
In the Cloud: Amazon AWS
Workstations & Servers: Windows, Red Hat and SUSE Linux
Clustered Systems: IBM Platform LSF, Microsoft HPC
EDW: IBM Netezza, Teradata
Hadoop: Hortonworks, Cloudera
Write Once, Deploy Anywhere
DeployR does not provide any application UI.
Three integration modes embed real-time R results
into existing interfaces:
web app, mobile app, desktop app, BI tool,
Excel, …
RBroker Framework:
Simple, high-performance API for Java, .NET
and JavaScript apps. Supports transactional,
on-demand analytics on a stateless R session.
Client Libraries:
Flexible control of R services from Java,
.NET and JavaScript apps. Also supports
stateful R integrations (e.g., complex GUIs).
DeployR Web Services API:
Integrate R using almost any client language.
Integrate R Scripts Into Third Party Applications
Spark MLlib + SparkR
• It is built on Apache Spark, a fast and
general engine for large-scale data
processing
• Run programs up to 100x faster than Hadoop
MapReduce in memory, or 10x faster on disk.
• Write applications quickly in Java, Scala,
or Python.
Spark MLlib
• Integrated with Spark SQL for data
queries and transformations
• Integrated with Spark GraphX for
graph processing
• Integrated with Spark Streaming for
real time data processing
Beyond Machine Learning
• Run R and machine learning models
using the same infrastructure
• Leverage R scripts from Spark MLlib
models
• Scale R models as part of a Spark
cluster
• Execute R models programmatically
using Java APIs
Spark MLlib + SparkR
Dato
• Makes Python machine learning
enterprise-ready
• Graphlab Create
• Dato Distributed
• Dato Predictive Services
Dato
Principles:
• Get started fast
• Rapidly iterate
• Combine for new apps
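import graphlab as gl
# Load ratings with 'user', 'movie' and 'rating' columns.
data = gl.SFrame.read_csv('my_data.csv')
# Train a recommender on (user, item, rating) triples.
model = gl.recommender.create(data,
                              user_id='user',
                              item_id='movie',
                              target='rating')
# Top 5 recommendations per user.
recommendations = model.recommend(k=5)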
Recommender Image search Sentiment Analysis
Data Matching Auto Tagging Churn Predictor
Click Prediction Product Sentiment Object Detector
Search Ranking Summarization …
Sophisticated ML made easy - Toolkits
TensorFlow
• Powers deep learning capabilities on dozens
of Google’s products
• Interfaces for modeling machine and deep
learning algorithms
• Platform for executing those algorithms
• Scales from mobile devices to a cluster with
thousands of nodes
• Became one of the most popular projects
on GitHub in less than a week
Google’s TensorFlow
• Based on the principle of a dataflow
graph
• Nodes can perform data operations
but also send or receive data
• Python and C++ libraries. NodeJS, Go
and others in the pipeline
TensorFlow Programming Model
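The dataflow-graph principle can be illustrated without TensorFlow itself. The following is a toy sketch in plain Python, not TensorFlow's implementation: the `Node` class and the helper functions are invented for the example. It shows the two phases the model implies: first build a graph of operations, then run it with concrete inputs.

```python
class Node:
    """One operation in the graph; its inputs are other nodes."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

    def run(self, feed):
        # Evaluate the input nodes first, then apply this node's operation.
        values = [node.run(feed) for node in self.inputs]
        return self.op(feed, *values)


def placeholder(name):
    """A node whose value is supplied at run time through `feed`."""
    return Node(lambda feed: feed[name])


def constant(value):
    return Node(lambda feed: value)


def add(a, b):
    return Node(lambda feed, x, y: x + y, a, b)


def multiply(a, b):
    return Node(lambda feed, x, y: x * y, a, b)


# Build the graph for y = 3 * x + 1; nothing is computed yet.
x = placeholder("x")
y = add(multiply(constant(3.0), x), constant(1.0))

# Computation happens only when the graph is run with concrete inputs.
result = y.run({"x": 2.0})
```

Separating graph construction from execution is what lets an engine decide later where and in what order each node runs.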
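# Fragment: assumes tensorflow is imported as tf and that x, y_, y_conv,
# keep_prob, mnist and an interactive session sess are defined earlier.
# Loss: cross-entropy between the labels y_ and the predictions y_conv.
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# Accuracy: share of examples whose predicted class matches the label.
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i%100 == 0:
        # Report training accuracy every 100 steps, with dropout disabled.
        train_accuracy = accuracy.eval(feed_dict={
            x:batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g"%(i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))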
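The loss and accuracy expressions in the training snippet above can be checked by hand on a tiny example. The sketch below recomputes both in plain Python for two made-up examples with three classes. The data and function names are illustrative only.

```python
import math

# Two examples, three classes. Labels are one-hot; predictions are probabilities.
labels = [[1, 0, 0], [0, 1, 0]]
predictions = [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]]


def cross_entropy(y_true, y_pred):
    """Sum of -y * log(p) over all examples and classes."""
    return -sum(
        t * math.log(p)
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )


def accuracy(y_true, y_pred):
    """Fraction of examples whose most probable class matches the label."""
    hits = [
        row_true.index(max(row_true)) == row_pred.index(max(row_pred))
        for row_true, row_pred in zip(y_true, y_pred)
    ]
    return sum(hits) / len(hits)


loss = cross_entropy(labels, predictions)  # equals -(log 0.7 + log 0.4)
acc = accuracy(labels, predictions)
```

Only the probability assigned to the true class contributes to the loss, which is why the second example, predicted with less confidence, dominates it.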
• Scales from a single device to a large
cluster of nodes
• TensorFlow uses a heuristic placement
algorithm to assign the nodes of a graph
to the available devices
• The execution engine assigns tasks for
fault tolerance
• Linear scalability model
TensorFlow Implementation
• TensorFlow includes an engine that
enables the visual representation of
the execution graph
• Visualizations include summary
statistics of the different states of
the model
• The visualization engine is included
in the current open source release
TensorFlow Graph Visualization
Other Interesting Projects
• H2O.ai
• PredictionIO
• Scikit-Learn
• Microsoft’s DMTK
Machine Learning in the Enterprise
•Enable foundational building blocks
-Data quality
-Data discovery
-Functional and integration testing
•Predictions are tempting but classification and clustering are
easier
•Run multiple models at once
•Enable programmatic interfaces to interact with ML models
•Start small, deliver quickly, iterate…
Machine Learning in the Enterprise
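Data quality is listed as a foundational building block. One simple form it can take is a gate that measures missing values before data reaches a model. The sketch below is an illustration under assumptions: the column names, the sample records and the 25% tolerance are all made up.

```python
def missing_rates(rows, columns):
    """Fraction of rows where each column is absent, None, or an empty string."""
    rates = {}
    for column in columns:
        missing = sum(1 for row in rows if row.get(column) in (None, ""))
        rates[column] = missing / len(rows)
    return rates


def failing_columns(rates, max_missing=0.25):
    """Columns whose missing rate exceeds the (assumed) tolerance."""
    return sorted(column for column, rate in rates.items() if rate > max_missing)


# Made-up records standing in for a training extract.
rows = [
    {"customer": "a", "age": 31, "region": "east"},
    {"customer": "b", "age": None, "region": "west"},
    {"customer": "c", "age": None, "region": ""},
    {"customer": "d", "age": 52},
]

rates = missing_rates(rows, ["customer", "age", "region"])
blocked = failing_columns(rates)
```

A check like this can run as part of integration testing, so a degraded data feed is caught before it degrades predictions.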
•Machine learning is becoming one of the most important elements of
modern enterprise solutions
•Innovation in machine learning is happening in both the on-premise
and cloud space
•Cloud machine learning innovators include: Azure ML, AWS ML,
Databricks and IBM Watson
•On-premise machine learning innovators include: Spark MLlib,
Microsoft’s Revolution R, Dato, TensorFlow
•Enterprise machine learning solutions should include elements such
as data quality, data governance, etc
•Start small and use real use cases
Summary
Thanks
jesus.rodriguez@tellago.com
https://twitter.com/jrdothoughts
http://jrodthoughts.com/
https://medium.com/@jrodthoughts
Appendix A: Scikit-Learn
• Extensions to SciPy (Scientific Python) are called SciKits. SciKit-Learn
provides machine learning algorithms.
• Algorithms for supervised & unsupervised learning
• Built on SciPy and Numpy
• Standard Python API interface
• Sits on top of C libraries, LAPACK, LibSVM, and Cython
• Open Source: BSD License (packaged in many Linux distributions)
• Probably the best general ML framework out there.
Scikit-Learn
[Diagram: Raw Data, Load & Transform Data, Feature Extraction, Build Model, Feature Evaluation]
Very Simple Prediction Model
Evaluate
Model
Assess how the model will generalize to an independent data set (e.g.,
data not in the training set).
1. Divide data into training and test splits
2. Fit model on training, predict on test
3. Determine accuracy, precision and recall
4. Repeat k times with different splits, then average the scores (e.g., F1)
            Predicted Class A   Predicted Class B
Actual A    True A              False B             #A
Actual B    False A             True B              #B
            #P(A)               #P(B)               total
Simple Programming Model-Cross Validation (classification)
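The four cross-validation steps above can be sketched in plain Python. This is a minimal illustration rather than scikit-learn's implementation: the threshold "model", the data and the helper names are invented, and the folds are contiguous with no shuffling.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds."""
    fold_size = n // k
    for fold in range(k):
        start = fold * fold_size
        stop = n if fold == k - 1 else start + fold_size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test


def fit_threshold(xs, ys):
    """Toy model: midpoint between the mean of class 0 and the mean of class 1."""
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def precision_recall_f1(actual, predicted):
    """Precision, recall and F1 for class 1, from confusion-matrix counts."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up data: one feature, binary label, classes interleaved so every fold sees both.
xs = [1.0, 6.0, 2.0, 7.0, 3.0, 8.0, 4.0, 9.0]
ys = [0, 1, 0, 1, 0, 1, 0, 1]

f1_scores = []
for train, test in k_fold_indices(len(xs), k=4):
    threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
    predicted = [1 if xs[i] > threshold else 0 for i in test]
    _, _, f1 = precision_recall_f1([ys[i] for i in test], predicted)
    f1_scores.append(f1)

mean_f1 = sum(f1_scores) / len(f1_scores)
```

The model is refit inside the loop for every fold, so each score is measured on data the fitted model has never seen.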
How to evaluate clusters? Visualization (but only in 2D)
Data Visualization
Appendix B: Prediction IO
• Developer friendly machine learning platform
• Completely open source
• Based on Apache Spark
PredictionIO
• PredictionIO platform
A machine learning stack for building, evaluating
and deploying engines with machine learning
algorithms.
• Event Server
An open source machine learning analytics layer for
unifying events from multiple platforms
• Template Gallery
Engine templates for different types of machine
learning applications
A Simple Architecture
• Execute models asynchronously via the event
interface
• Query data programmatically via REST
interface
• Various SDKs provided as part of the platform
Model Execution
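The event and query interfaces both exchange small JSON documents. The sketch below builds one of each without sending them. The field names are modeled on the event format PredictionIO documents, but they should be treated as assumptions here, as should the local ports, the paths, the access key placeholder and the query fields, which depend on the engine template in use.

```python
import json
from datetime import datetime, timezone

# Placeholders for a local deployment; ports, paths and access key are assumptions.
EVENT_URL = "http://localhost:7070/events.json?accessKey=YOUR_ACCESS_KEY"
QUERY_URL = "http://localhost:8000/queries.json"


def rating_event(user_id, item_id, rating, when):
    """Describe 'user rated item' as an event for the event server."""
    return {
        "event": "rate",
        "entityType": "user",
        "entityId": user_id,
        "targetEntityType": "item",
        "targetEntityId": item_id,
        "properties": {"rating": rating},
        "eventTime": when.isoformat(),
    }


def recommendation_query(user_id, count):
    """Ask a deployed engine for `count` results (fields depend on the template)."""
    return {"user": user_id, "num": count}


event_body = json.dumps(
    rating_event("u1", "i42", 4.0, datetime(2015, 12, 1, tzinfo=timezone.utc))
)
query_body = json.dumps(recommendation_query("u1", 5))
```

Events are written asynchronously and picked up at the next training run, while queries are answered synchronously by the deployed engine.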
• Visual model for model creation
• Integrated with a template gallery
• Ability to test and validate engines
Rich Model Creation Interface

Weitere ähnliche Inhalte

Was ist angesagt?

Lessons from Large-Scale Cloud Software at Databricks
Lessons from Large-Scale Cloud Software at DatabricksLessons from Large-Scale Cloud Software at Databricks
Lessons from Large-Scale Cloud Software at DatabricksMatei Zaharia
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOpsDatabricks
 
Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?Guido Schmutz
 
Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS Amazon Web Services
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream AnalyticsMarco Parenzan
 
The Event Mesh: real-time, event-driven, responsive APIs and beyond
The Event Mesh: real-time, event-driven, responsive APIs and beyondThe Event Mesh: real-time, event-driven, responsive APIs and beyond
The Event Mesh: real-time, event-driven, responsive APIs and beyondSolace
 
Migrating Your Data Platform At a High Growth Startup
Migrating Your Data Platform At a High Growth StartupMigrating Your Data Platform At a High Growth Startup
Migrating Your Data Platform At a High Growth StartupDatabricks
 
Mainframe Modernization with Precisely and Microsoft Azure
Mainframe Modernization with Precisely and Microsoft AzureMainframe Modernization with Precisely and Microsoft Azure
Mainframe Modernization with Precisely and Microsoft AzurePrecisely
 
Microservices, DevOps & SRE
Microservices, DevOps & SREMicroservices, DevOps & SRE
Microservices, DevOps & SREAraf Karsh Hamid
 
Overview PowerPlatform PowerApss
Overview PowerPlatform PowerApssOverview PowerPlatform PowerApss
Overview PowerPlatform PowerApssJuan Fabian
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservicesBigstep
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesIvo Andreev
 
The Basics of Getting Started With Microsoft Azure
The Basics of Getting Started With Microsoft AzureThe Basics of Getting Started With Microsoft Azure
The Basics of Getting Started With Microsoft AzureMicrosoft Azure
 
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...Amazon Web Services
 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream AnalyticsMarco Parenzan
 
Building compelling Enterprise Solutions on AWS
Building compelling Enterprise Solutions on AWSBuilding compelling Enterprise Solutions on AWS
Building compelling Enterprise Solutions on AWSAmazon Web Services
 
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...Lucas Jellema
 
Modernizing your Application Architecture with Microservices
Modernizing your Application Architecture with MicroservicesModernizing your Application Architecture with Microservices
Modernizing your Application Architecture with Microservicesconfluent
 

Was ist angesagt? (20)

Lessons from Large-Scale Cloud Software at Databricks
Lessons from Large-Scale Cloud Software at DatabricksLessons from Large-Scale Cloud Software at Databricks
Lessons from Large-Scale Cloud Software at Databricks
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOps
 
Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?
 
Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS Building a Big Data & Analytics Platform using AWS
Building a Big Data & Analytics Platform using AWS
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream Analytics
 
The Event Mesh: real-time, event-driven, responsive APIs and beyond
The Event Mesh: real-time, event-driven, responsive APIs and beyondThe Event Mesh: real-time, event-driven, responsive APIs and beyond
The Event Mesh: real-time, event-driven, responsive APIs and beyond
 
Migrating Your Data Platform At a High Growth Startup
Migrating Your Data Platform At a High Growth StartupMigrating Your Data Platform At a High Growth Startup
Migrating Your Data Platform At a High Growth Startup
 
Mainframe Modernization with Precisely and Microsoft Azure
Mainframe Modernization with Precisely and Microsoft AzureMainframe Modernization with Precisely and Microsoft Azure
Mainframe Modernization with Precisely and Microsoft Azure
 
Microservices, DevOps & SRE
Microservices, DevOps & SREMicroservices, DevOps & SRE
Microservices, DevOps & SRE
 
Overview PowerPlatform PowerApss
Overview PowerPlatform PowerApssOverview PowerPlatform PowerApss
Overview PowerPlatform PowerApss
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
 
The Basics of Getting Started With Microsoft Azure
The Basics of Getting Started With Microsoft AzureThe Basics of Getting Started With Microsoft Azure
The Basics of Getting Started With Microsoft Azure
 
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
 
Azure Stream Analytics
Azure Stream AnalyticsAzure Stream Analytics
Azure Stream Analytics
 
Redington Value Journal - July 2017
Redington Value Journal - July 2017Redington Value Journal - July 2017
Redington Value Journal - July 2017
 
Building compelling Enterprise Solutions on AWS
Building compelling Enterprise Solutions on AWSBuilding compelling Enterprise Solutions on AWS
Building compelling Enterprise Solutions on AWS
 
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
Introducing Apache Kafka and why it is important to Oracle, Java and IT profe...
 
Modernizing your Application Architecture with Microservices
Modernizing your Application Architecture with MicroservicesModernizing your Application Architecture with Microservices
Modernizing your Application Architecture with Microservices
 

Ähnlich wie A practical guidance of the enterprise machine learning

Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAlberto Diaz Martin
 
Introduction to Machine learning and Deep Learning
Introduction to Machine learning and Deep LearningIntroduction to Machine learning and Deep Learning
Introduction to Machine learning and Deep LearningNishan Aryal
 
Making Data Scientists Productive in Azure
Making Data Scientists Productive in AzureMaking Data Scientists Productive in Azure
Making Data Scientists Productive in AzureValdas Maksimavičius
 
Cloud-based Modelling Solutions Empowering Tool Integration
Cloud-based Modelling Solutions Empowering Tool IntegrationCloud-based Modelling Solutions Empowering Tool Integration
Cloud-based Modelling Solutions Empowering Tool IntegrationIstvan Rath
 
With Automated ML, is Everyone an ML Engineer?
With Automated ML, is Everyone an ML Engineer?With Automated ML, is Everyone an ML Engineer?
With Automated ML, is Everyone an ML Engineer?Dan Sullivan, Ph.D.
 
IncQuery Server for Teamwork Cloud - Talk at IW2019
IncQuery Server for Teamwork Cloud - Talk at IW2019IncQuery Server for Teamwork Cloud - Talk at IW2019
IncQuery Server for Teamwork Cloud - Talk at IW2019Istvan Rath
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataQubole
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AIJames Serra
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksAlberto Diaz Martin
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated MLMark Tabladillo
 
A Collaborative Data Science Development Workflow
A Collaborative Data Science Development WorkflowA Collaborative Data Science Development Workflow
A Collaborative Data Science Development WorkflowDatabricks
 
Microsoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the CloudMicrosoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the CloudMark Kromer
 
Cloud-native Data
Cloud-native DataCloud-native Data
Cloud-native Datacornelia davis
 
Cloud-Native-Data with Cornelia Davis
Cloud-Native-Data with Cornelia DavisCloud-Native-Data with Cornelia Davis
Cloud-Native-Data with Cornelia DavisVMware Tanzu
 
Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...
Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...
Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...Rui Quintino
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on DatabricksDataScienceConferenc1
 
Deeplearning and dev ops azure
Deeplearning and dev ops azureDeeplearning and dev ops azure
Deeplearning and dev ops azureVishwas N
 
BBBT Watson Data Platform Presentation
BBBT Watson Data Platform PresentationBBBT Watson Data Platform Presentation
BBBT Watson Data Platform PresentationRitika Gunnar
 

Ähnlich wie A practical guidance of the enterprise machine learning (20)

Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
 
Introduction to Machine learning and Deep Learning
Introduction to Machine learning and Deep LearningIntroduction to Machine learning and Deep Learning
Introduction to Machine learning and Deep Learning
 
Making Data Scientists Productive in Azure
Making Data Scientists Productive in AzureMaking Data Scientists Productive in Azure
Making Data Scientists Productive in Azure
 
Cloud-based Modelling Solutions Empowering Tool Integration
Cloud-based Modelling Solutions Empowering Tool IntegrationCloud-based Modelling Solutions Empowering Tool Integration
Cloud-based Modelling Solutions Empowering Tool Integration
 
With Automated ML, is Everyone an ML Engineer?
With Automated ML, is Everyone an ML Engineer?With Automated ML, is Everyone an ML Engineer?
With Automated ML, is Everyone an ML Engineer?
 
IncQuery Server for Teamwork Cloud - Talk at IW2019
IncQuery Server for Teamwork Cloud - Talk at IW2019IncQuery Server for Teamwork Cloud - Talk at IW2019
IncQuery Server for Teamwork Cloud - Talk at IW2019
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big Data
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 
Machine learning
Machine learningMachine learning
Machine learning
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
 
MLOps in action
MLOps in actionMLOps in action
MLOps in action
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 
A Collaborative Data Science Development Workflow
A Collaborative Data Science Development WorkflowA Collaborative Data Science Development Workflow
A Collaborative Data Science Development Workflow
 
Microsoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the CloudMicrosoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the Cloud
 
Cloud-native Data
Cloud-native DataCloud-native Data
Cloud-native Data
 
Cloud-Native-Data with Cornelia Davis
Cloud-Native-Data with Cornelia DavisCloud-Native-Data with Cornelia Davis
Cloud-Native-Data with Cornelia Davis
 
Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...
Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...
Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL S...
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
 
Deeplearning and dev ops azure
Deeplearning and dev ops azureDeeplearning and dev ops azure
Deeplearning and dev ops azure
 
BBBT Watson Data Platform Presentation
BBBT Watson Data Platform PresentationBBBT Watson Data Platform Presentation
BBBT Watson Data Platform Presentation
 

Mehr von Jesus Rodriguez

The Emergence of DeFi Micro-Primitives
The Emergence of DeFi Micro-PrimitivesThe Emergence of DeFi Micro-Primitives
The Emergence of DeFi Micro-PrimitivesJesus Rodriguez
 
ChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxJesus Rodriguez
 
DeFi Opportunities and Challenges in the Current Crypto Market
DeFi Opportunities and Challenges in the Current Crypto MarketDeFi Opportunities and Challenges in the Current Crypto Market
DeFi Opportunities and Challenges in the Current Crypto MarketJesus Rodriguez
 
MEV Deep Dive .pptx
MEV Deep Dive .pptxMEV Deep Dive .pptx
MEV Deep Dive .pptxJesus Rodriguez
 
Quant in Crypto Land
Quant in Crypto LandQuant in Crypto Land
Quant in Crypto LandJesus Rodriguez
 
The Polygon Blockchain by the Numbers
The Polygon Blockchain by the NumbersThe Polygon Blockchain by the Numbers
The Polygon Blockchain by the NumbersJesus Rodriguez
 
Social Analytics for Cryptocurrencies
Social Analytics for Cryptocurrencies Social Analytics for Cryptocurrencies
Social Analytics for Cryptocurrencies Jesus Rodriguez
 
DeFi Quant Yield-Generating Strategies
DeFi Quant Yield-Generating StrategiesDeFi Quant Yield-Generating Strategies
DeFi Quant Yield-Generating StrategiesJesus Rodriguez
 
High Frequency Trading and DeFi
A practical guidance of the enterprise machine learning

  • 1. A Practical Guidance to the Enterprise Machine Learning Platform Ecosystem
  • 2. About Us • Helping great companies become great software companies • Building software solutions powered by disruptive enterprise software trends -Machine learning and data science -Cyber-security -Enterprise IoT -Powered by Cloud and Mobile • Bringing innovation from startups and academic institutions to the enterprise • Award-winning agencies: Inc 500, American Business Awards, International Business Awards
  • 3. About This Webinar • Research that brings together big enterprise software trends, exciting startups and academic research • Best practices based on real-world implementation experience • No sales pitches
  • 4. Agenda • Cloud vs. on-premise machine learning • Cloud machine learning platforms -Azure Machine Learning -AWS Machine Learning -Databricks -Watson Developer Cloud -Others… • On-premise machine learning platforms -Revolution Analytics -Dato -Spark MLlib -TensorFlow -Others…
  • 8. Modern Machine Learning • Advances in storage, compute and data science research are making machine learning part of mainstream technology platforms • Big data movement • Machine learning platforms are optimized with developer-friendly interfaces • Platform-as-a-service providers have drastically lowered the entry point for machine learning applications • R and Python are leading the charge
  • 9. Cloud vs. On-Premise machine learning platforms
  • 10. Cloud Machine Learning Platforms: Benefits • Service abstraction layer over the machine learning infrastructure • Rich visual modeling tools • Rich monitoring and tracking interfaces • Combine multiple platforms: R, Python, etc • Enable programmatic access to ML models
  • 11. Cloud Machine Learning Platforms: Challenges • Integration with on-premise data stores • Extensibility • Security and privacy
  • 12. On-Premise Machine Learning Platforms: Benefits • Control • Security • Integration with on-premise data stores • Integrated with R and Python machine learning frameworks
  • 13. On-Premise Machine Learning Platforms: Challenges • Code-based modeling interfaces • Scalability • Tightly coupled with Hadoop distributions • Monitoring and management • Data quality and curation
  • 15. The Leaders • Azure Machine Learning • AWS Machine Learning • Databricks • Watson Developer Cloud
  • 17. Azure Machine Learning • Native machine learning capabilities as part of the Azure cloud • Elastic infrastructure that scales based on the model requirements • Supports over 30 supervised and unsupervised machine learning algorithms • Integration with R and Python machine learning libraries • Exposes machine learning models via programmable interfaces • Integrated with the Cortana Analytics suite • Integrated with Power BI
  • 18. Visual Model Creation • Supports both supervised and unsupervised models • Integrated with Azure HDInsight • Large library of models and sample gallery • Support for R and Python code
  • 19. Rich Monitoring and Management Interface • Visual dashboard to track the execution of ML models • Track execution of different steps within an ML model • Integrated monitoring experience with other Azure services
  • 20. Programmatic Access to ML Models • Expose machine learning models as Web Services APIs • Integrate ML Models with Azure API Gateway • Retrain and extend models via ML APIs
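Programmatic access of the kind slide 20 describes usually comes down to posting JSON to a scoring endpoint. The sketch below only builds such a request with the Python standard library and does not send it. The endpoint URL, the `inputs` payload key, the bearer-token header and the feature names are all hypothetical placeholders, not the actual Azure ML request format.

```python
import json
import urllib.request


def build_scoring_request(endpoint, api_key, rows):
    """Build (but do not send) a POST request carrying rows to score as JSON."""
    # The {"inputs": [...]} payload shape is an assumption for illustration.
    body = json.dumps({"inputs": rows}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Hypothetical endpoint, key and features, for illustration only.
request = build_scoring_request(
    "https://example.invalid/models/churn/score",
    "YOUR_API_KEY",
    [{"tenure_months": 12, "monthly_charge": 70.5}],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would be a call to `urllib.request.urlopen(request)` against a real endpoint; the exact payload and authentication scheme depend on the platform.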
  • 22. AWS Machine Learning • Native machine learning service in AWS • Provides data exploration and visualization tools • Supports supervised and unsupervised algorithms • Integrated data transformation models • APIs for dynamically creating machine learning models
  • 23. Sophisticated ML Model Authoring • Programmatic creation of machine learning models • Large number of algorithms and recipes • Data transformation models included in the language
  • 24. Monitoring ML Model Execution • Sophisticated monitoring for evaluating ML models • Integrated with Amazon CloudWatch • KPIs that evaluate the efficiency of ML models
  • 25. Embedded Data Transformation • Optimized DSL for data transformation • Recipes that abstract common transformations • Reuse transformation recipes across ML models
  • 28. Databricks Machine Learning • Scaling Spark machine learning pipelines • Integrated data visualization tools • Sophisticated ML monitoring tools • Combine Python, Scala and R in a single platform
  • 29. Notebook-Based Authoring • Implement machine learning models using notebooks • Publish notebooks to a centralized catalog • Leverage Python, Scala or R to implement machine learning models
  • 30. Machine Learning Data Visualization • Integrate data visualization into machine learning pipelines • Reuse data visualization notebooks across applications • Evaluate the efficiency of machine learning pipelines using visualizations
  • 31. Monitoring and Management • Monitor the execution of machine learning pipelines • Run machine learning pipelines manually • Rapidly modify and deploy machine learning pipelines
  • 33. Large Variety of Cognitive Services • Personality Insights • Tradeoff Analytics • Relationship Extraction • Concept Insights • Speech to Text • Text to Speech • Visual Recognition • Natural Language Classifier • Language Identification • Language Translation • Question and Answer • Concept Expansion • Message Resonance • AlchemyAPI Services
  • 34. Rich Developer Interfaces • Access services via REST APIs • SDKs available for different languages • Integration with different services in the Bluemix platform
  • 35. Complex Algorithms, Simple Interfaces • Relationship Extraction • Concept Expansion • Message Resonance • User Modeling
  • 36. Other Interesting Platforms • Microsoft’s Project Oxford https://www.projectoxford.ai/ • BigML https://bigml.com/
  • 38. The Leaders • Revolution Analytics (Microsoft) • Spark MLlib + SparkR • Dato • TensorFlow • Others: PredictionIO, Scikit-learn…
  • 40. Revolution Analytics (Microsoft) • All of open source R plus: -Big Data scalability -High-performance analytics -Development and deployment tools -Data source connectivity -Application integration framework -Multi-platform architecture -Support, training and services
  • 41. Write Once, Deploy Anywhere • DistributedR, ScaleR, ConnectR, DeployR • In the cloud: Amazon AWS • Workstations and servers: Windows, Red Hat and SUSE Linux • Clustered systems: IBM Platform LSF, Microsoft HPC • EDW: IBM Netezza, Teradata • Hadoop: Hortonworks, Cloudera
  • 42. Integrate R Scripts Into Third-Party Applications • DeployR does not provide any application UI • Three integration modes embed real-time R results into existing interfaces (web app, mobile app, desktop app, BI tool, Excel, …) • RBroker Framework: simple, high-performance API for Java, .NET and JavaScript apps; supports transactional, on-demand analytics on a stateless R session • Client Libraries: flexible control of R services from Java, .NET and JavaScript apps; also supports stateful R integrations (e.g. complex GUIs) • DeployR Web Services API: integrate R using almost any client language
  • 43. Spark MLlib + SparkR
  • 44. Spark MLlib • Built on Apache Spark, a fast and general engine for large-scale data processing • Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk • Write applications quickly in Java, Scala, or Python
  • 45. Beyond Machine Learning • Integrated with Spark SQL for data queries and transformations • Integrated with Spark GraphX for graph processing • Integrated with Spark Streaming for real-time data processing
  • 46. Spark MLlib + SparkR • Run R and machine learning models using the same infrastructure • Leverage R scripts from Spark MLlib models • Scale R models as part of a Spark cluster • Execute R models programmatically using Java APIs
  • 47. Dato
  • 48. Dato • Makes Python machine learning enterprise-ready • GraphLab Create • Dato Distributed • Dato Predictive Services
  • 51. Sophisticated ML made easy - Toolkits • Principles: get started fast, rapidly iterate, combine for new apps • Toolkits: Recommender, Image Search, Sentiment Analysis, Data Matching, Auto Tagging, Churn Predictor, Click Prediction, Product Sentiment, Object Detector, Search Ranking, Summarization, …
        import graphlab as gl
        # Load the ratings data into an SFrame
        data = gl.SFrame.read_csv('my_data.csv')
        # Train a recommender on the user, item and rating columns
        model = gl.recommender.create(data, user_id='user', item_id='movie', target='rating')
        # Top five recommendations per user
        recommendations = model.recommend(k=5)
  • 53. Google’s TensorFlow • Powers deep learning capabilities in dozens of Google’s products • Interfaces for modeling machine learning and deep learning algorithms • Platform for executing those algorithms • Scales from mobile devices to a cluster with thousands of nodes • Became one of the most popular projects on GitHub in less than a week
  • 54. TensorFlow Programming Model • Based on the principle of a dataflow graph • Nodes can perform data operations but also send or receive data • Python and C++ libraries; NodeJS, Go and others in the pipeline
        # Assumes tf, sess, mnist, x, y_, y_conv and keep_prob are defined earlier.
        # Loss, optimizer and accuracy are graph nodes; nothing runs yet.
        cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
        train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        sess.run(tf.initialize_all_variables())
        for i in range(20000):
            batch = mnist.train.next_batch(50)
            if i % 100 == 0:
                # Report training accuracy with dropout disabled
                train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
                print("step %d, training accuracy %g" % (i, train_accuracy))
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
        print("test accuracy %g" % accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
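The dataflow-graph principle on slide 54 can be illustrated without TensorFlow at all. The sketch below is a toy graph evaluator in plain Python. `Node`, `placeholder`, `constant`, `add`, `multiply` and `run` are names invented for this illustration and are not TensorFlow's API. It shows the two phases the slide implies: building the graph first, then feeding values in and running it.

```python
class Node:
    """One operation in a dataflow graph; its inputs are other nodes."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name


def placeholder(name):
    # A node with no operation: its value is supplied at run time.
    return Node(op=None, name=name)


def constant(value):
    return Node(op=lambda: value)


def add(a, b):
    return Node(op=lambda x, y: x + y, inputs=(a, b))


def multiply(a, b):
    return Node(op=lambda x, y: x * y, inputs=(a, b))


def run(node, feed_dict, cache=None):
    """Evaluate a node by first evaluating the nodes it depends on."""
    cache = {} if cache is None else cache
    if node in cache:
        return cache[node]
    if node.op is None:
        value = feed_dict[node.name]
    else:
        value = node.op(*(run(i, feed_dict, cache) for i in node.inputs))
    cache[node] = value
    return value


# Building the graph computes nothing yet.
x = placeholder("x")
w = constant(3.0)
b = constant(1.0)
y = add(multiply(w, x), b)

# Values flow through the graph only when it is run.
print(run(y, {"x": 2.0}))  # 3.0 * 2.0 + 1.0
```

Separating graph construction from execution is what lets an engine decide later where and how each node runs.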
  • 55. TensorFlow Implementation • Scales from a single device to a large cluster of nodes • TensorFlow uses a heuristic placement algorithm to assign the tasks in a graph to the available nodes • The execution engine assigns tasks for fault tolerance • Linear scalability model
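To make "placement based on heuristics" concrete, here is a minimal greedy sketch. It is not TensorFlow's actual placement algorithm. It assumes each task has a known cost and each device a known speed, ignores dependencies and data-transfer time, and assigns every task to the device that would finish it earliest. The task names, device names and numbers are made up.

```python
def place_tasks(task_costs, device_speeds):
    """Greedily assign each task to the device that would finish it earliest.

    task_costs: dict of task name -> estimated amount of work.
    device_speeds: dict of device name -> work units processed per second.
    Returns (placement, busy_until) where placement maps task -> device.
    """
    busy_until = {device: 0.0 for device in device_speeds}
    placement = {}
    # Place the most expensive tasks first so they get the best devices.
    for task in sorted(task_costs, key=task_costs.get, reverse=True):
        def finish_time(device):
            return busy_until[device] + task_costs[task] / device_speeds[device]
        best = min(device_speeds, key=finish_time)
        busy_until[best] = finish_time(best)
        placement[task] = best
    return placement, busy_until


# Hypothetical tasks and devices, for illustration only.
tasks = {"matmul": 8.0, "conv": 6.0, "relu": 1.0, "softmax": 1.0}
devices = {"gpu:0": 4.0, "cpu:0": 1.0}
placement, busy_until = place_tasks(tasks, devices)
print(placement)
print(busy_until)
```

With these numbers the two heavy tasks land on the faster device and the two light ones on the slower device. A production placer would also have to weigh communication cost between devices.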
  • 56. TensorFlow Graph Visualization • TensorFlow includes an engine that enables the visual representation of the execution graph • Visualizations include summary statistics of the different states of the model • The visualization engine is included in the current open source release
  • 57. Other Interesting Projects • H2O.ai • PredictionIO • Scikit-Learn • Microsoft’s DMTK
  • 58. Machine Learning in the Enterprise
  • 59. Machine Learning in the Enterprise • Enable foundational building blocks -Data quality -Data discovery -Functional and integration testing • Predictions are tempting but classification and clustering are easier • Run multiple models at once • Enable programmatic interfaces to interact with ML models • Start small, deliver quickly, iterate…
  • 60. Summary • Machine learning is becoming one of the most important elements of modern enterprise solutions • Innovation in machine learning is happening in both the on-premise and cloud space • Cloud machine learning innovators include Azure ML, AWS ML, Databricks and IBM Watson • On-premise machine learning innovators include Spark MLlib, Microsoft’s Revolution R, Dato and TensorFlow • Enterprise machine learning solutions should include elements such as data quality and data governance • Start small and use real use cases
  • 63. Scikit-Learn • Extensions to SciPy (Scientific Python) are called SciKits; SciKit-Learn provides machine learning algorithms • Algorithms for supervised and unsupervised learning • Built on SciPy and NumPy • Standard Python API interface • Sits on top of C libraries, LAPACK, LibSVM, and Cython • Open source: BSD License (packaged in many Linux distributions) • Probably the best general ML framework out there
  • 64. Very Simple Prediction Model • Load & Transform Data • Raw Data • Feature Extraction • Build Model • Feature Evaluation • Evaluate Model
  • 65. Simple Programming Model: Cross Validation (classification) • Assess how the model will generalize to an independent data set (e.g. data not in the training set) 1. Divide data into training and test splits 2. Fit model on training, predict on test 3. Determine accuracy, precision and recall 4. Repeat k times with different splits, then average the scores (e.g. F1)
                 | Predicted Class A | Predicted Class B | Total
        Actual A | True A            | False B           | #A
        Actual B | False A           | True B            | #B
        Total    | #P(A)             | #P(B)             | total
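The four cross-validation steps above can be sketched in plain Python. This is an illustrative sketch rather than scikit-learn's implementation. The threshold "model", the stratified fold helper and the made-up one-dimensional data are all assumptions chosen to keep the example self-contained. Class A is encoded as label 1 and class B as label 0.

```python
import random


def stratified_folds(ys, k, seed=0):
    """Split indices into k folds that each contain both classes."""
    indices = list(range(len(ys)))
    random.Random(seed).shuffle(indices)
    # Stable sort by label keeps the shuffled order within each class.
    indices.sort(key=lambda i: ys[i])
    return [indices[i::k] for i in range(k)]


def fit_threshold(xs, ys):
    """Toy model: the midpoint between the two class means."""
    mean_a = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    mean_b = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    return (mean_a + mean_b) / 2, mean_a > mean_b


def predict(model, x):
    threshold, a_is_high = model
    return int((x > threshold) == a_is_high)


def scores(actual, predicted):
    """Accuracy, precision, recall and F1 for class A (label 1)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)  # True A
    fp = sum(a == 0 and p == 1 for a, p in pairs)  # False A
    fn = sum(a == 1 and p == 0 for a, p in pairs)  # False B
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


def cross_validate(xs, ys, k=5):
    """Fit on k-1 folds, score on the held-out fold, average over k rounds."""
    folds = stratified_folds(ys, k)
    totals = [0.0, 0.0, 0.0, 0.0]
    for held_out, test in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        model = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        predicted = [predict(model, xs[i]) for i in test]
        fold_scores = scores([ys[i] for i in test], predicted)
        totals = [t + s for t, s in zip(totals, fold_scores)]
    return [t / k for t in totals]


# Made-up, well separated one-dimensional data.
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.2, 2.8, 3.1, 2.9]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
accuracy, precision, recall, f1 = cross_validate(xs, ys, k=5)
print(accuracy, precision, recall, f1)
```

Stratifying the folds keeps both classes in every test split, so precision and recall are always defined. In practice a library routine would normally replace the hand-written helpers.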
  • 66. Data Visualization • How to evaluate clusters? • Visualization (but only in 2D)
  • 68. PredictionIO • Developer-friendly machine learning platform • Completely open source • Based on Apache Spark
  • 69. A Simple Architecture • PredictionIO platform: a machine learning stack for building, evaluating and deploying engines with machine learning algorithms • Event Server: an open source machine learning analytics layer for unifying events from multiple platforms • Template Gallery: engine templates for different types of machine learning applications
  • 70. Model Execution • Execute models asynchronously via the event interface • Query data programmatically via the REST interface • Various SDKs provided as part of the platform
  • 71. Rich Model Creation Interface • Visual model for model creation • Integrated with a template gallery • Ability to test and validate engines