1StoryStream.ai
From POC to Production in
Minimal Time –
Avoiding Pain in ML Projects
Dr Janet Bastiman
@yssybyl
Watch the video with slide synchronization on InfoQ.com:
https://www.infoq.com/presentations/poc-ml/
Presented at QCon San Francisco
www.qconsf.com
2StoryStream.ai
Project timings
3StoryStream.ai
The world’s leading automotive content platform
About StoryStream
StoryStream is a dedicated automotive content platform, trusted by some of the world’s leading car brands. It was created specifically to help automotive brands provide a more relevant, engaging customer experience, fuelled with authentic content and designed for efficiently scaling content operations across global teams.
The Core StoryStream Benefits
● Grow customer engagement and conversions by up to 25%
● Reduce content creation and management costs by up to 60%
● Provide a more authentic customer experience
● Understand your customer in a deeper way
4StoryStream.ai
5StoryStream.ai
6StoryStream.ai
“[Client] needs this to go live at the end of
the month, I promised them we could
deliver...”
Every salesperson ever
7StoryStream.ai
Project timings
● 35 models = 1050 days (one person, working linearly)
● ~5 years for one person who works Mon-Fri and is allowed holidays :)
● 250 days with parallelisation of tasks and data upfront
● 150 days on worksheet, balanced by an increase in ongoing license
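The arithmetic behind these estimates can be checked with a short sketch. The model count and day totals come from the slide; the figure of 225 working days per year is an assumption made here for illustration, not a number from the talk.

```python
# Figures taken from the slide.
MODELS = 35
TOTAL_DAYS_LINEAR = 1050

# Assumption (not from the talk): working days per year after weekends and holidays.
WORKING_DAYS_PER_YEAR = 225


def speedup(baseline_days: float, planned_days: float) -> float:
    """How many times faster a plan is than the one-person linear baseline."""
    return baseline_days / planned_days


days_per_model = TOTAL_DAYS_LINEAR / MODELS
years_one_person = TOTAL_DAYS_LINEAR / WORKING_DAYS_PER_YEAR

print(f"{days_per_model:.0f} days per model")
print(f"{years_one_person:.1f} years for one person")
print(f"250-day plan: {speedup(TOTAL_DAYS_LINEAR, 250):.1f}x faster")
print(f"150-day plan: {speedup(TOTAL_DAYS_LINEAR, 150):.1f}x faster")
```

Under that assumption the linear estimate works out at 30 days per model and a little under five years, which matches the "~5 years" on the slide.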
8StoryStream.ai
Can you guess what happened next?
9StoryStream.ai
What would it take to get it done in that time?
The Core (2003)
Paramount Pictures
10StoryStream.ai
“They don’t have any data to give us”
11StoryStream.ai
If you are dealing with any critical inferencing, do not take shortcuts.
Do it properly and rigorously, stand up to the company and say no,
and make it clear that the timelines will be longer to get it right.
12StoryStream.ai
Without Data ML is just a Random Result
● Legal public sources
    ● https://github.com/awesomedata/awesome-public-datasets
    ● https://www.kaggle.com/datasets
● Take your own pictures/videos
    ● Access/permission?
    ● Slow and inconsistent
● Scrape the client site with permission
13StoryStream.ai
How much data?
• Vision: about 1000 images per output class, depending on the complexity of the problem
• Time series: at least double the time period over which you are predicting, but be cautious of data becoming irrelevant
• Text: very variable depending on the problem
• This also changes if you already have pre-trained networks that you’re updating
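These rules of thumb can be turned into a quick sanity check before training starts. The sketch below is a minimal illustration: the thresholds are the ones on the slide, while the function names and the example label list are hypothetical.

```python
from collections import Counter

# Rule-of-thumb thresholds from the slide.
MIN_IMAGES_PER_CLASS = 1000
MIN_HISTORY_MULTIPLE = 2  # history should be at least double the forecast horizon


def classes_short_of_data(labels, minimum=MIN_IMAGES_PER_CLASS):
    """Return {class: shortfall} for every class with fewer than `minimum` examples."""
    counts = Counter(labels)
    return {cls: minimum - n for cls, n in counts.items() if n < minimum}


def enough_history(history_days, horizon_days, multiple=MIN_HISTORY_MULTIPLE):
    """True when the available history covers at least `multiple` x the horizon."""
    return history_days >= multiple * horizon_days


# Hypothetical label list, for illustration only.
labels = ["hatchback"] * 1200 + ["suv"] * 400 + ["estate"] * 90
shortfalls = classes_short_of_data(labels)
print(shortfalls)
print(enough_history(history_days=60, horizon_days=30))
```

As the last bullet notes, the image threshold would normally be relaxed when fine-tuning a pre-trained network, so `minimum` is left as a parameter.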
14StoryStream.ai
What do you do with the Data?
● Selection bias
● Random sampling
● Overcoverage
● Undercoverage
● Measurement (Response) error
● Processing errors
● Participation bias
15StoryStream.ai
What do you do with the Data?
Diagram: Photos, Scrape → S3 bucket
● Unique filename
● Source
● Set UUID (if multiple images of same car)
● Date taken
● S3 bucket per vehicle variant
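The metadata listed on this slide could be captured in a small record type like the sketch below. The field names, the `.jpg` suffix and the bucket naming scheme are all assumptions made for illustration; the talk does not show the actual schema.

```python
import uuid
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ImageRecord:
    source: str                     # e.g. "photo" or "scrape"
    variant: str                    # vehicle variant the image belongs to
    date_taken: date
    set_uuid: Optional[str] = None  # shared by multiple images of the same car
    # Unique filename generated per image (hypothetical naming).
    filename: str = field(default_factory=lambda: f"{uuid.uuid4().hex}.jpg")

    @property
    def bucket(self) -> str:
        """One bucket per vehicle variant (hypothetical naming scheme)."""
        return "training-images-" + self.variant.lower().replace(" ", "-")


# Two photos of the same car share a set UUID but get distinct filenames.
car_set = str(uuid.uuid4())
front = ImageRecord("photo", "Example Hatch", date(2019, 6, 1), set_uuid=car_set)
rear = ImageRecord("photo", "Example Hatch", date(2019, 6, 1), set_uuid=car_set)
print(front.bucket, front.filename, rear.filename)
```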
16StoryStream.ai
What do you do with the Data?
Diagram: Photos, Scrape → Car Detector → S3 Bucket → Manual verification
● Extra field for label
● S3 bucket name became mostly irrelevant
17StoryStream.ai
Crowdsource labelling
https://xkcd.com/1897/
19StoryStream.ai
Data Pipeline
Diagram: Data In → Object detector → Images saved → Auxiliary info saved → Temp public access → Extract for Turk → Import of results → Dashboard → Expert clean → Data Ready
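One plausible way to connect the "Import of results" and "Expert clean" stages is to accept a crowdsourced label only when enough workers agree and to queue everything else for an expert. The sketch below assumes majority voting with a 70% agreement threshold; both the rule and the threshold are illustrative assumptions, not details from the talk.

```python
from collections import Counter


def aggregate_votes(votes_by_image, min_agreement=0.7):
    """Split images into accepted labels and a queue for expert review.

    votes_by_image maps an image id to the list of labels given by workers.
    """
    accepted, needs_expert = {}, []
    for image_id, votes in votes_by_image.items():
        if not votes:
            needs_expert.append(image_id)
            continue
        label, count = Counter(votes).most_common(1)[0]
        if count / len(votes) >= min_agreement:
            accepted[image_id] = label
        else:
            needs_expert.append(image_id)
    return accepted, needs_expert


# Hypothetical worker responses.
votes = {
    "img-001": ["saloon", "saloon", "saloon"],
    "img-002": ["suv", "estate", "suv"],
    "img-003": ["coupe", "saloon", "estate"],
}
accepted, needs_expert = aggregate_votes(votes)
print(accepted)
print(needs_expert)
```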
21StoryStream.ai
Transfer Learning
● Use transfer learning: fix most of the weights of a good network and adapt the last few layers
● Fast and easy retraining, and works with smaller data sets in a variety of fields
● (image) https://arxiv.org/abs/1903.02196
● (series) https://arxiv.org/abs/1907.01332
● (audio) https://arxiv.org/abs/1909.07526
Deep Learning for Vision Systems, Mohamed Elgendy
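The idea of fixing most of the weights and adapting only the last layers can be shown on a toy scale without any deep learning framework. In the sketch below a tiny "pre-trained" feature extractor is frozen and only a logistic output layer is trained on top of it. The weights, data and hyperparameters are invented for illustration; a real project would freeze the layers of an actual pre-trained network instead.

```python
import math

# "Pre-trained" feature extractor: frozen, never updated during training.
# Hypothetical weights chosen for illustration.
FROZEN_W = ((1.0, -1.0), (-1.0, 1.0))


def features(x):
    """Frozen layers: a linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, labels, lr=0.5, steps=200):
    """Train only the final logistic layer with batch gradient descent."""
    feats = [features(x) for x in data]  # computed once, since the base is frozen
    w, b = [0.0, 0.0], 0.0
    n = len(feats)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for f, y in zip(feats, labels):
            err = sigmoid(w[0] * f[0] + w[1] * f[1] + b) - y
            grad_w[0] += err * f[0] / n
            grad_w[1] += err * f[1] / n
            grad_b += err / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


def predict(x, w, b):
    f = features(x)
    return 1 if sigmoid(w[0] * f[0] + w[1] * f[1] + b) > 0.5 else 0


# Toy data: class 1 when the first coordinate is larger than the second.
data = [(2, 0), (3, 1), (1, 0), (0, 2), (1, 3), (0, 1)]
labels = [1, 1, 1, 0, 0, 0]
w, b = train_head(data, labels)
accuracy = sum(predict(x, w, b) == y for x, y in zip(data, labels)) / len(data)
print(f"head weights: {w}, training accuracy: {accuracy}")
```

Because the frozen features are computed once and only three parameters are trained, retraining is cheap, which is the property the slide highlights.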
22StoryStream.ai
Unbalanced Data
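The slide text gives only the title, so the following is not the speaker's method. It is a minimal sketch of one common way to handle unbalanced classes: weighting each class by the inverse of its frequency so that rare classes contribute as much to the loss as common ones. The label list is hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (n_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


# Hypothetical 90/10 split between two classes.
labels = ["common"] * 90 + ["rare"] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

With these weights, the 90 common examples and the 10 rare examples carry the same total weight.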
23StoryStream.ai https://www.designhacks.co/products/cognitive-bias-codex-poster
25StoryStream.ai
Stand on the shoulders
of giants…
● For some problems CNNs are robust to noisy labels: a ratio of up to 20 noisy labels to each real label can still give business-level accuracy
    https://arxiv.org/pdf/1705.10694.pdf
● Find the right architecture
    http://www.asimovinstitute.org/neural-network-zoo/
26StoryStream.ai
Go old school
Reduce the dimensionality of the problem and use a Bayesian approach, KNN or SVM
https://xkcd.com/2059/
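As a minimal sketch of the "old school" route, the example below reduces each raw vector to a couple of summary features and then classifies with k-nearest neighbours. The block-averaging reduction, the toy data and the choice of k are assumptions for illustration; the talk does not specify which reduction or classifier settings were used.

```python
import math
from collections import Counter


def reduce_dims(raw, block=2):
    """Hypothetical reduction: average consecutive blocks of raw values."""
    chunks = [raw[i:i + block] for i in range(0, len(raw), block)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def knn_predict(train, query, k=3):
    """Majority label among the k training points nearest to `query`.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy 4-dimensional data reduced to 2 dimensions before classification.
raw_train = [
    ([0, 0, 1, 1], "a"), ([0, 1, 1, 0], "a"), ([1, 0, 0, 1], "a"),
    ([9, 9, 8, 8], "b"), ([8, 9, 9, 8], "b"), ([9, 8, 8, 9], "b"),
]
train = [(reduce_dims(x), label) for x, label in raw_train]

print(knn_predict(train, reduce_dims([1, 1, 0, 0])))
print(knn_predict(train, reduce_dims([8, 8, 9, 9])))
```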
27StoryStream.ai
Choose wisely
28StoryStream.ai
Simplify the problem
Removal of camera artefacts in eye images to make detection easier - Jeffrey De Fauw
http://blog.kaggle.com/2015/08/10/detecting-diabetic-retinopathy-in-eye-images/
Diagram: Image → Specific Vehicle, versus Image → Car? → Make? → Specific Vehicle
Removal of Doppler effect on moving source using fractional octave band shifting, F Mobley
https://asa.scitation.org/doi/pdf/10.1121/2.0000578?class=pdf
$\Delta n = -r \left[ \log_2 \left( 1 - M \cos\theta \sin\varphi \right) \right]$
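The band-shift formula on this slide can be evaluated directly. The slide does not define its symbols, so the sketch below assumes that r is the number of bands per octave, M is the Mach number of the source, and θ and φ are angles in radians; treat those readings as assumptions and check them against the cited paper.

```python
import math


def band_shift(r, mach, theta, phi):
    """Delta n = -r * log2(1 - M * cos(theta) * sin(phi)).

    Assumed meanings (not defined on the slide): r = bands per octave,
    mach = source Mach number, theta and phi = angles in radians.
    """
    return -r * math.log2(1.0 - mach * math.cos(theta) * math.sin(phi))


# A stationary source needs no correction.
print(band_shift(r=3, mach=0.0, theta=0.3, phi=1.0))

# Hypothetical case: third-octave bands, Mach 0.5, angles where cos*sin = 1.
print(band_shift(r=3, mach=0.5, theta=0.0, phi=math.pi / 2))
```

In the second case the argument of the logarithm is 0.5, so the shift is a full octave, which is three bands when r = 3.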
29StoryStream.ai
Get every last drop from what you have
Statistical anatomical modelling for efficient and
personalised spine biomechanical models - I Castro
Mateos PhD thesis
Have a toolkit of augmentation
approaches but choose what’s relevant to
your needs...
30StoryStream.ai
Augmentation - detail
● Flip L/R, U/D
● Rotations
● Reduce or enlarge bounding box coordinates by N%
● Add occlusions
    https://www.umbc.edu/rssipl/people/aplaza/Papers/Journals/2019.GRSL.Occlusion.pdf
● Change hue, saturation and value of colours in the image
    https://arxiv.org/pdf/1902.06543.pdf
● Copypairing - https://arxiv.org/abs/1909.00390#
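A few of the augmentations listed above can be sketched with the standard library alone. The example assumes an image is a list of rows, a pixel is an (r, g, b) tuple of floats in [0, 1], and a bounding box is (x_min, y_min, x_max, y_max); these representations and the function names are illustrative, and a real pipeline would normally use an image library.

```python
import colorsys


def flip_lr(image):
    """Mirror each row (left/right flip)."""
    return [row[::-1] for row in image]


def flip_ud(image):
    """Reverse the row order (up/down flip)."""
    return [list(row) for row in image[::-1]]


def scale_box(box, percent):
    """Grow (+) or shrink (-) a box about its centre by `percent` of its size."""
    x0, y0, x1, y1 = box
    dx = (x1 - x0) * percent / 100 / 2
    dy = (y1 - y0) * percent / 100 / 2
    return (x0 - dx, y0 - dy, x1 + dx, y1 + dy)


def shift_hue(pixel, delta):
    """Rotate the hue of an RGB pixel by `delta` (fraction of a full turn)."""
    h, s, v = colorsys.rgb_to_hsv(*pixel)
    return colorsys.hsv_to_rgb((h + delta) % 1.0, s, v)


image = [[1, 2], [3, 4]]
print(flip_lr(image))
print(flip_ud(image))
print(scale_box((10, 10, 30, 20), 10))
print(shift_hue((1.0, 0.0, 0.0), 1 / 3))
```

When an image is flipped, its bounding boxes have to be flipped with it; that step is left out here to keep the sketch short.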
34StoryStream.ai
Infrastructure
Diagram components: Data In, Data Store, Taxonomy, Classifier Definition, Test Set, DockerHub Setup, Codeship Project, GitHub Setup, Notification (Slack, Email), Template, AWS Image, Scripts, Dashboard
35StoryStream.ai
Cloud Formation
36StoryStream.ai
Automation
Diagram stages: Delete local data, Build container, Get model and key, Run test harness, Validate container, Run container, Report results, Dashboard, Commit, Build new container
37StoryStream.ai
Stack Automation
Diagram: Add new container → Start stack → Run stack test harness → Create docs → Compare results → Better?
Yes: Update CF → Live
No: Human investigation
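The "Better?" decision in this flow could be implemented as a simple gate over test-harness metrics. The sketch below promotes a candidate only if no metric regresses and at least one improves, and otherwise asks for human investigation. The metric names, the comparison rule and the return values are assumptions for illustration; the talk does not describe the actual comparison.

```python
def decide_promotion(live_metrics, candidate_metrics, min_gain=0.0):
    """Return "promote" or "investigate" for a candidate stack.

    Both arguments map metric names to scores where higher is better.
    """
    regressed = [
        name for name, live in live_metrics.items()
        if candidate_metrics.get(name, float("-inf")) < live
    ]
    improved = [
        name for name, live in live_metrics.items()
        if candidate_metrics.get(name, float("-inf")) - live > min_gain
    ]
    if not regressed and improved:
        return "promote"
    return "investigate"


# Hypothetical test-harness results.
live = {"accuracy": 0.91, "recall": 0.88}
print(decide_promotion(live, {"accuracy": 0.93, "recall": 0.88}))
print(decide_promotion(live, {"accuracy": 0.95, "recall": 0.80}))
```

Routing every ambiguous case to a person keeps the automated path conservative, which fits the "No → Human investigation" branch of the diagram.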
38StoryStream.ai
Automatic Documentation
Diagram: LaTeX templates → Pweave → .tex files and images → Run LaTeX → Convert to PDF → Save with model files → If live, save in live docs → Email to team
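The talk used Pweave to turn LaTeX templates into .tex files. As a stand-in that needs no extra dependencies, the sketch below fills a LaTeX template using Python's `string.Template` with `@` as the placeholder delimiter, so that LaTeX's own `$` is left alone. The template text, field names and values are hypothetical.

```python
import string


class TexTemplate(string.Template):
    """Template using "@" for placeholders, leaving LaTeX's "$" untouched."""
    delimiter = "@"


REPORT = TexTemplate(r"""\documentclass{article}
\begin{document}
\section*{Model report: @model_name}
Test accuracy: @accuracy\,\%

Test images: @n_images
\end{document}
""")


def render_report(model_name, accuracy, n_images):
    """Fill the template; accuracy is a fraction between 0 and 1."""
    return REPORT.substitute(
        model_name=model_name,
        accuracy=f"{accuracy * 100:.1f}",
        n_images=n_images,
    )


tex = render_report("make-classifier-v2", 0.93, 1500)
print(tex)
```

The resulting string would be written to a .tex file and compiled to PDF by a LaTeX toolchain before being saved alongside the model files.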
40StoryStream.ai
Did we make it?
● Some really difficult images
● Only expected images were given
● Where it was wrong it was (mostly) sensibly wrong
● Client happy
● Cool automated system
41StoryStream.ai
The Playbook
ai-playbook.com
42StoryStream.ai
Thank You
https://xkcd.com/2191/