SlideShare a Scribd company logo
1 of 17
Download to read offline
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Building, training, and deploying ML
models with Amazon SageMaker
Emily Webber
Machine Learning Specialist
Amazon Web Services
A I M 3 0 2
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
For this workshop, you need
• A personal laptop
• An AWS account
• Access to Amazon SageMaker, Amazon S3, and Amazon ECS
• Credits will be provided to offset cost of AWS services charged to
your account for this workshop
3
1837 1969
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Put machine learning in the
hands of every developer
Our mission at AWS
Our mission at AWS
Put machine learning in the
hands of every developer
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
The Amazon ML stack:
Broadest & deepest set of capabilities
AI SERVICES
Easily add intelligence to applications without machine learning skills
Vision | Documents | Speech | Language | Chatbots | Forecasting | Recommendations
ML SERVICES
Build, train, and deploy machine learning models quickly and easily
Data labeling | Pre-built algorithms & notebooks | One-click training and deployment
ML FRAMEWORKS &
INFRASTRUCTURE
Flexibility & choice, highest-performing infrastructure
Support for ML frameworks | Compute options purpose built for ML
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker
1 2 3 5
I I I I
Notebook Instances
with 200+ Examples
17 Built-In Algorithms ML Training Service x 5 ML Hosting Service
4
I
Hyper-Parameter
Tuning Service x 2
1
1 + 𝑒−𝑥
6
Batch Transform
• Inferencing
• Data Transformation
7
SDK’s:
• Python
• Spark
8
Documentation &
whitepapers
& blog posts
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Problem Statement
Healthcare insurance fraud is a pressing problem, causing substantial and
increasing costs in medical insurance programs.
Due to large amounts of claims submitted, review of individual claims becomes a
difficult task and encourages the employment of automated pre-payment controls
and better post-payment decision support tools to enable subject matter expert
analysis.
We will demonstrate the application of unsupervised anomalous outlier techniques
on a minimal set of metrics made available in the CMS Medicare inpatient claims
from 2008.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Dataset
CMS Medicare inpatient claims from 2008
• Medicare inpatient claims from 2008
• Each record is an inpatient claim incurred by a 5% sample of Medicare beneficiaries.
• Beneficiary identities are not provided
• Zip-code of facilities where patient was treated are not provided
• The file contains seven (8) variables. One primary key and seven analytic variables
• Data dictionary required to interpret codes in dataset are provided
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Data Variables
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Techniques and Algorithms used in the workshop
• Outlier Detection
• Word Embedding and Word2Vec
• Principal Component Analysis
• Calculate the Mahalanobis distance
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Outlier Detection
“Observation which deviates so much from other observations as to arouse suspicion it was generated
by a different mechanism” —Hawkins(1980)
• Modeling normal objects and outlier effectively. The border between data normality and abnormality
(outliers) is often not clear cut.
• Outlier detection methods could be application specific. Example, in clinical data small deviation could
be an outlier, but, in a marketing application large deviation is required to justify an outlier.
• Noise in data may be present as deviations in attribute values or even as missing values. Noise may
hide an outlier or may flag deviation as an outlier.
• Providing justification for an outlier from understandability point of view may be difficult.
Challenges
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Word Embedding and Word2Vec
Word Embedding
Word2Vec
• Texts converted into numbers and there may be different numerical representations of the
same text
• Many techniques exist. CBOW (Continuous Bag of words) and skip-gram are popular and
effective for large corpus of documents
• Example, CBOW predict the probability of a word given a context
• Shallow neural networks which map word(s) to the target variable which is also a word(s).
• Learn weights which act as word vector representations
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Principal Component Analysis
When
• Do you want to reduce the number of variables, but aren’t able to identify variables to completely
remove from consideration?
• Do you want to ensure your variables are independent of one another?
• Are you comfortable making your independent variables less interpretable?
How
• A measure of how each variable is associated with one another. (Covariance matrix)
• The directions in which our data are dispersed. (Eigenvectors)
• The relative importance of these different directions. (Eigenvalues)
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Isolation of different environment
Notebook Instance
Training
Inference
Amazon SageMaker provides you a clean separation between your Jupyter Notebook Instance, Training
environment instances and Inference environment instances by launching a new stack on new machine to
support below functions.
This is your IDE environment to write scripts and test out your algorithm on a small subset of data. Choose
lower size instances for this work.
This is your training environment that is created when you make a fit function call using SageMaker SDK
from within a Jupyter Notebook. Choose one or multiple lager instances to train your learning algorithm on
a large dataset. The environment is terminated automatically when training completes.
This is your hosting environment for your trained models. You can scale it out automatically by using
automatic scaling feature. You can choose to run smaller or larger size instances to meet your demand for
inference.
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Complete the workshop
https://tinyurl.com/y3ch2kjg
Thank you!
S U M M I T © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

More Related Content

What's hot

Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...Amazon Web Services
 
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...Amazon Web Services
 
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Amazon Web Services
 
[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...
[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...
[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...Amazon Web Services
 
AWS Fargate deep dive - MAD303 - New York AWS Summit
AWS Fargate deep dive - MAD303 - New York AWS SummitAWS Fargate deep dive - MAD303 - New York AWS Summit
AWS Fargate deep dive - MAD303 - New York AWS SummitAmazon Web Services
 
Building enterprise solutions with blockchain technology - SVC217 - New York ...
Building enterprise solutions with blockchain technology - SVC217 - New York ...Building enterprise solutions with blockchain technology - SVC217 - New York ...
Building enterprise solutions with blockchain technology - SVC217 - New York ...Amazon Web Services
 
Increase the value of video using ML and AWS media services - SVC301 - Santa ...
Increase the value of video using ML and AWS media services - SVC301 - Santa ...Increase the value of video using ML and AWS media services - SVC301 - Santa ...
Increase the value of video using ML and AWS media services - SVC301 - Santa ...Amazon Web Services
 
Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...
Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...
Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...Amazon Web Services
 
Building ML platforms in Financial Services with serverless technology - FSV2...
Building ML platforms in Financial Services with serverless technology - FSV2...Building ML platforms in Financial Services with serverless technology - FSV2...
Building ML platforms in Financial Services with serverless technology - FSV2...Amazon Web Services
 
AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...
AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...
AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...Amazon Web Services
 
Using automation to drive continuous-compliance best practices - SEC208 - New...
Using automation to drive continuous-compliance best practices - SEC208 - New...Using automation to drive continuous-compliance best practices - SEC208 - New...
Using automation to drive continuous-compliance best practices - SEC208 - New...Amazon Web Services
 
Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...
Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...
Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...Amazon Web Services
 
Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019
Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019
Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019AWS Summits
 
AWS storage solutions for business-critical applications - STG301 - Chicago A...
AWS storage solutions for business-critical applications - STG301 - Chicago A...AWS storage solutions for business-critical applications - STG301 - Chicago A...
AWS storage solutions for business-critical applications - STG301 - Chicago A...Amazon Web Services
 
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Amazon Web Services
 
Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...
Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...
Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...Amazon Web Services
 
Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...
Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...
Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...Amazon Web Services
 
Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...
Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...
Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...Amazon Web Services
 
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdfAdd intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdfAmazon Web Services
 

What's hot (20)

Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
Get hands-on with AWS DeepRacer and compete in the AWS DeepRacer League - AIM...
 
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...Optimize deep learning training and inferencing using GPU and Amazon SageMake...
Optimize deep learning training and inferencing using GPU and Amazon SageMake...
 
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
 
[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...
[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...
[REPEAT] Optimize your workloads with Amazon EC2 & AMD EPYC - DEM01-R - Santa...
 
AWS Fargate deep dive - MAD303 - New York AWS Summit
AWS Fargate deep dive - MAD303 - New York AWS SummitAWS Fargate deep dive - MAD303 - New York AWS Summit
AWS Fargate deep dive - MAD303 - New York AWS Summit
 
Building enterprise solutions with blockchain technology - SVC217 - New York ...
Building enterprise solutions with blockchain technology - SVC217 - New York ...Building enterprise solutions with blockchain technology - SVC217 - New York ...
Building enterprise solutions with blockchain technology - SVC217 - New York ...
 
Increase the value of video using ML and AWS media services - SVC301 - Santa ...
Increase the value of video using ML and AWS media services - SVC301 - Santa ...Increase the value of video using ML and AWS media services - SVC301 - Santa ...
Increase the value of video using ML and AWS media services - SVC301 - Santa ...
 
Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...
Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...
Build intelligent applications quickly with AWS AI services - AIM301 - New Yo...
 
Building ML platforms in Financial Services with serverless technology - FSV2...
Building ML platforms in Financial Services with serverless technology - FSV2...Building ML platforms in Financial Services with serverless technology - FSV2...
Building ML platforms in Financial Services with serverless technology - FSV2...
 
AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...
AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...
AWS App Mesh: Manage services mesh discovery, recovery, and monitoring - MAD3...
 
Using automation to drive continuous-compliance best practices - SEC208 - New...
Using automation to drive continuous-compliance best practices - SEC208 - New...Using automation to drive continuous-compliance best practices - SEC208 - New...
Using automation to drive continuous-compliance best practices - SEC208 - New...
 
Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...
Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...
Building real-time applications with Amazon ElastiCache - ADB204 - Anaheim AW...
 
HK-AWS-Quick-Start-Workshop
HK-AWS-Quick-Start-WorkshopHK-AWS-Quick-Start-Workshop
HK-AWS-Quick-Start-Workshop
 
Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019
Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019
Frontend and Mobile with AWS Amplify | AWS Summit Tel Aviv 2019
 
AWS storage solutions for business-critical applications - STG301 - Chicago A...
AWS storage solutions for business-critical applications - STG301 - Chicago A...AWS storage solutions for business-critical applications - STG301 - Chicago A...
AWS storage solutions for business-critical applications - STG301 - Chicago A...
 
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
 
Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...
Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...
Amazon SageMaker: ML for Every Developer and Data Scientist - AIM202 - Anahei...
 
Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...
Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...
Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - At...
 
Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...
Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...
Accelerate ML workloads using EC2 accelerated computing - CMP202 - Santa Clar...
 
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdfAdd intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
Add intelligence to applications - AIM205 - Santa Clara AWS Summit.pdf
 

Similar to Build, train, and deploy ML models with Amazon SageMaker - AIM302 - New York AWS Summit

Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Amazon Web Services
 
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...Amazon Web Services Korea
 
Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)Julien SIMON
 
Build-Train-Deploy-Machine-Learning-Models-at-Any-Scale
Build-Train-Deploy-Machine-Learning-Models-at-Any-ScaleBuild-Train-Deploy-Machine-Learning-Models-at-Any-Scale
Build-Train-Deploy-Machine-Learning-Models-at-Any-ScaleAmazon Web Services
 
Integrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an HourIntegrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an HourVMware Tanzu
 
Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...
Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...
Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...Amazon Web Services
 
WhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotWhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotRandall Hunt
 
Introducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksIntroducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksAmazon Web Services
 
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...Amazon Web Services
 
2019 06-12-aws taipei summit-dev day-essential capabilities behind microservices
2019 06-12-aws taipei summit-dev day-essential capabilities behind microservices2019 06-12-aws taipei summit-dev day-essential capabilities behind microservices
2019 06-12-aws taipei summit-dev day-essential capabilities behind microservicesKim Kao
 
Mining Intelligent Insights: AI/ML for Financial Services
Mining Intelligent Insights: AI/ML for Financial ServicesMining Intelligent Insights: AI/ML for Financial Services
Mining Intelligent Insights: AI/ML for Financial ServicesAmazon Web Services LATAM
 
How to Wrangle Data for Machine Learning on AWS
 How to Wrangle Data for Machine Learning on AWS How to Wrangle Data for Machine Learning on AWS
How to Wrangle Data for Machine Learning on AWSAmazon Web Services
 
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...Jonathan Dion
 
AIM102-S_Cognizant_CognizantCognitive
AIM102-S_Cognizant_CognizantCognitiveAIM102-S_Cognizant_CognizantCognitive
AIM102-S_Cognizant_CognizantCognitivePhilipBasford
 
Fraud Prevention and Detection on AWS
Fraud Prevention and Detection on AWSFraud Prevention and Detection on AWS
Fraud Prevention and Detection on AWSAmazon Web Services
 
AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成
AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成
AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成Amazon Web Services
 
Pensi di essere pronto per i microservizi?
Pensi di essere pronto per i microservizi?Pensi di essere pronto per i microservizi?
Pensi di essere pronto per i microservizi?Amazon Web Services
 
The Zen of DataOps – AWS Lake Formation and the Data Supply Chain Pipeline
The Zen of DataOps – AWS Lake Formation and the Data Supply Chain PipelineThe Zen of DataOps – AWS Lake Formation and the Data Supply Chain Pipeline
The Zen of DataOps – AWS Lake Formation and the Data Supply Chain PipelineAmazon Web Services
 

Similar to Build, train, and deploy ML models with Amazon SageMaker - AIM302 - New York AWS Summit (20)

Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
 
Introduction to Sagemaker
Introduction to SagemakerIntroduction to Sagemaker
Introduction to Sagemaker
 
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
Datarobot, 자동화된 분석 적용 시 분석 절차의 변화 및 효용 - 홍운표 데이터 사이언티스트, DataRobot :: AWS Sum...
 
Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)
 
Become a ML developer
Become a ML developerBecome a ML developer
Become a ML developer
 
Build-Train-Deploy-Machine-Learning-Models-at-Any-Scale
Build-Train-Deploy-Machine-Learning-Models-at-Any-ScaleBuild-Train-Deploy-Machine-Learning-Models-at-Any-Scale
Build-Train-Deploy-Machine-Learning-Models-at-Any-Scale
 
Integrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an HourIntegrate Machine Learning into Your Spring Application in Less than an Hour
Integrate Machine Learning into Your Spring Application in Less than an Hour
 
Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...
Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...
Fraud detection using machine learning with Amazon SageMaker - AIM306 - New Y...
 
WhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotWhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter Bot
 
Introducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksIntroducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech Talks
 
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - ...
 
2019 06-12-aws taipei summit-dev day-essential capabilities behind microservices
2019 06-12-aws taipei summit-dev day-essential capabilities behind microservices2019 06-12-aws taipei summit-dev day-essential capabilities behind microservices
2019 06-12-aws taipei summit-dev day-essential capabilities behind microservices
 
Mining Intelligent Insights: AI/ML for Financial Services
Mining Intelligent Insights: AI/ML for Financial ServicesMining Intelligent Insights: AI/ML for Financial Services
Mining Intelligent Insights: AI/ML for Financial Services
 
How to Wrangle Data for Machine Learning on AWS
 How to Wrangle Data for Machine Learning on AWS How to Wrangle Data for Machine Learning on AWS
How to Wrangle Data for Machine Learning on AWS
 
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
AWS Toronto Summit 2019 - AIM302 - Build, train, and deploy ML models with Am...
 
AIM102-S_Cognizant_CognizantCognitive
AIM102-S_Cognizant_CognizantCognitiveAIM102-S_Cognizant_CognizantCognitive
AIM102-S_Cognizant_CognizantCognitive
 
Fraud Prevention and Detection on AWS
Fraud Prevention and Detection on AWSFraud Prevention and Detection on AWS
Fraud Prevention and Detection on AWS
 
AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成
AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成
AI ﹑大數據媒體應用和利用機器學習與 AWS 媒體服務實現自動化內容生成
 
Pensi di essere pronto per i microservizi?
Pensi di essere pronto per i microservizi?Pensi di essere pronto per i microservizi?
Pensi di essere pronto per i microservizi?
 
The Zen of DataOps – AWS Lake Formation and the Data Supply Chain Pipeline
The Zen of DataOps – AWS Lake Formation and the Data Supply Chain PipelineThe Zen of DataOps – AWS Lake Formation and the Data Supply Chain Pipeline
The Zen of DataOps – AWS Lake Formation and the Data Supply Chain Pipeline
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Build, train, and deploy ML models with Amazon SageMaker - AIM302 - New York AWS Summit

  • 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Building, training, and deploying ML models with Amazon SageMaker Emily Webber Machine Learning Specialist Amazon Web Services A I M 3 0 2
  • 2. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T For this workshop, you need • A personal laptop • An AWS account • Access to Amazon SageMaker, Amazon S3, and Amazon ECS • Credits will be provided to offset cost of AWS services charged to your account for this workshop
  • 4.
  • 5. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Put machine learning in the hands of every developer Our mission at AWS Our mission at AWS Put machine learning in the hands of every developer
  • 6. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T The Amazon ML stack: Broadest & deepest set of capabilities AI SERVICES Easily add intelligence to applications without machine learning skills Vision | Documents | Speech | Language | Chatbots | Forecasting | Recommendations ML SERVICES Build, train, and deploy machine learning models quickly and easily Data labeling | Pre-built algorithms & notebooks | One-click training and deployment ML FRAMEWORKS & INFRASTRUCTURE Flexibility & choice, highest-performing infrastructure Support for ML frameworks | Compute options purpose built for ML
  • 7. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker 1 2 3 5 I I I I Notebook Instances with 200+ Examples 17 Built-In Algorithms ML Training Service x 5 ML Hosting Service 4 I Hyper-Parameter Tuning Service x 2 1 1 + 𝑒−𝑥 6 Batch Transform • Inferencing • Data Transformation 7 SDK’s: • Python • Spark 8 Documentation & whitepapers & blog posts
  • 8. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Problem Statement Healthcare insurance fraud is a pressing problem, causing substantial and increasing costs in medical insurance programs. Due to large amounts of claims submitted, review of individual claims becomes a difficult task and encourages the employment of automated pre-payment controls and better post-payment decision support tools to enable subject matter expert analysis. We will demonstrate the application of unsupervised anomalous outlier techniques on a minimal set of metrics made available in the CMS Medicare inpatient claims from 2008.
  • 9. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Dataset CMS Medicare inpatient claims from 2008 • Medicare inpatient claims from 2008 • Each record is an inpatient claim incurred by a 5% sample of Medicare beneficiaries. • Beneficiary identities are not provided • Zip-code of facilities where patient was treated are not provided • The file contains seven (8) variables. One primary key and seven analytic variables • Data dictionary required to interpret codes in dataset are provided
  • 10. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Data Variables
  • 11. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Techniques and Algorithms used in the workshop • Outlier Detection • Word Embedding and Word2Vec • Principal Component Analysis • Calculate the Mahalanobis distance
  • 12. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Outlier Detection “Observation which deviates so much from other observations as to arouse suspicion it was generated by a different mechanism” —Hawkins(1980) • Modeling normal objects and outlier effectively. The border between data normality and abnormality (outliers) is often not clear cut. • Outlier detection methods could be application specific. Example, in clinical data small deviation could be an outlier, but, in a marketing application large deviation is required to justify an outlier. • Noise in data may be present as deviations in attribute values or even as missing values. Noise may hide an outlier or may flag deviation as an outlier. • Providing justification for an outlier from understandability point of view may be difficult. Challenges
  • 13. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Word Embedding and Word2Vec Word Embedding Word2Vec • Texts converted into numbers and there may be different numerical representations of the same text • Many techniques exist. CBOW (Continuous Bag of words) and skip-gram are popular and effective for large corpus of documents • Example, CBOW predict the probability of a word given a context • Shallow neural networks which map word(s) to the target variable which is also a word(s). • Learn weights which act as word vector representations
  • 14. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Principal Component Analysis When • Do you want to reduce the number of variables, but aren’t able to identify variables to completely remove from consideration? • Do you want to ensure your variables are independent of one another? • Are you comfortable making your independent variables less interpretable? How • A measure of how each variable is associated with one another. (Covariance matrix) • The directions in which our data are dispersed. (Eigenvectors) • The relative importance of these different directions. (Eigenvalues)
  • 15. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Isolation of different environment Notebook Instance Training Inference Amazon SageMaker provides you a clean separation between your Jupyter Notebook Instance, Training environment instances and Inference environment instances by launching a new stack on new machine to support below functions. This is your IDE environment to write scripts and test out your algorithm on a small subset of data. Choose lower size instances for this work. This is your training environment that is created when you make a fit function call using SageMaker SDK from within a Jupyter Notebook. Choose one or multiple lager instances to train your learning algorithm on a large dataset. The environment is terminated automatically when training completes. This is your hosting environment for your trained models. You can scale it out automatically by using automatic scaling feature. You can choose to run smaller or larger size instances to meet your demand for inference.
  • 16. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Complete the workshop https://tinyurl.com/y3ch2kjg
  • 17. Thank you! S U M M I T © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.