SlideShare ist ein Scribd-Unternehmen logo
1 von 19
Downloaden Sie, um offline zu lesen
Machine Learning @ Netflix
(and some lessons learned)
Yves Raimond (@moustaki)
Research/Engineering Manager
Search & Recommendations
Algorithm Engineering
Netflix evolution
Netflix scale
● > 69M members
● > 50 countries
● > 1000 device types
● > 3B hours/month
● 36% of peak US downstream traffic
Recommendations @ Netflix
● Goal: Help members find content
to watch and enjoy to maximize
satisfaction and retention
● Over 80% of what people watch
comes from our recommendations
● Top Picks, Because you Watched,
Trending Now, Row Ordering,
Evidence, Search, Search
Recommendations, Personalized
Genre Rows, ...
▪ Regression (Linear, logistic, elastic net)
▪ SVD and other Matrix Factorizations
▪ Factorization Machines
▪ Restricted Boltzmann Machines
▪ Deep Neural Networks
▪ Markov Models and Graph Algorithms
▪ Clustering
▪ Latent Dirichlet Allocation
▪ Gradient Boosted Decision Trees/Random Forests
▪ Gaussian Processes
▪ …
Models & Algorithms
Some lessons learned
Build the offline experimentation
framework first
When tackling a new problem
● What offline metrics can we compute that capture what online improvements we’
re actually trying to achieve?
● How should the input data to that evaluation be constructed (train, validation,
test)?
● How fast and easy is it to run a full cycle of offline experimentations?
○ Minimize time to first metric
● How replicable is the evaluation? How shareable are the results?
○ Provenance (see Dagobah)
○ Notebooks (see Jupyter, Zeppelin, Spark Notebook)
When tackling an old problem
● Same…
○ Were the metrics designed when first running experimentation in that space still appropriate now?
Think about distribution from the
outermost layers
1. For each combination of hyper-parameter
(e.g. grid search, random search, gaussian processes…)
2. For each subset of the training data
a. Multi-core learning (e.g. HogWild)
b. Distributed learning (e.g. ADMM, distributed L-BFGS, …)
When to use distributed learning?
● The impact of communication overhead when building distributed ML
algorithms is non-trivial
● Is your data big enough that the distribution offsets the communication overhead?
Example: Uncollapsed Gibbs sampler for LDA
(more details here)
Design production code to be
experimentation-friendly
Idea Data
Offline
Modeling
(R, Python,
MATLAB, …)
Iterate
Implement in
production
system (Java,
C++, …)
Missing post-
processing logic
Performance
issues
Actual
outputProduction environment
(A/B test) Code
discrepancies
Final
model
Data
discrepancies
Example development process
Avoid dual implementations
Shared Engine
Experiment
code
Production
code
ProductionExperiment
To be continued...
We’re hiring!
Yves Raimond (@moustaki)

Weitere ähnliche Inhalte

Was ist angesagt?

How to Have Difficult Conversations
How to Have Difficult ConversationsHow to Have Difficult Conversations
How to Have Difficult ConversationsMattan Griffel
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsJustin Basilico
 
10 Ways Your Boss Kills Employee Motivation
10 Ways Your Boss Kills Employee Motivation10 Ways Your Boss Kills Employee Motivation
10 Ways Your Boss Kills Employee MotivationOfficevibe
 
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
 Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se... Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...Sudeep Das, Ph.D.
 
ChatGPT 101 - Vancouver ChatGPT Experts
ChatGPT 101 - Vancouver ChatGPT ExpertsChatGPT 101 - Vancouver ChatGPT Experts
ChatGPT 101 - Vancouver ChatGPT ExpertsAli Tavanayan
 
Productivity Facts Every Employee Should Know
Productivity Facts Every Employee Should KnowProductivity Facts Every Employee Should Know
Productivity Facts Every Employee Should KnowRobert Half
 
17 Things Powerful People Say
17 Things Powerful People Say17 Things Powerful People Say
17 Things Powerful People SayGetSmarter
 
How Game Developers Reach New Customers with Twitch
How Game Developers Reach New Customers with Twitch How Game Developers Reach New Customers with Twitch
How Game Developers Reach New Customers with Twitch Amazon Web Services
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixJustin Basilico
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
Everything to know about ChatGPT
Everything to know about ChatGPTEverything to know about ChatGPT
Everything to know about ChatGPTKnoldus Inc.
 
What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?Windzoon Technologies
 
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit SummitHabits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit SummitHabit Summit
 
Customer First Creating data-driven products with a human touch by Deliveroo ...
Customer First Creating data-driven products with a human touch by Deliveroo ...Customer First Creating data-driven products with a human touch by Deliveroo ...
Customer First Creating data-driven products with a human touch by Deliveroo ...Product School
 
Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...
Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...
Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...Buffer
 
Boring to Bold: Presentation Design Ideas for Non-Designers
Boring to Bold: Presentation Design Ideas for Non-DesignersBoring to Bold: Presentation Design Ideas for Non-Designers
Boring to Bold: Presentation Design Ideas for Non-DesignersMichael Gowin
 
How to Teach and Learn with ChatGPT - BETT 2023
How to Teach and Learn with ChatGPT - BETT 2023How to Teach and Learn with ChatGPT - BETT 2023
How to Teach and Learn with ChatGPT - BETT 2023Dominik Lukes
 

Was ist angesagt? (20)

How to Have Difficult Conversations
How to Have Difficult ConversationsHow to Have Difficult Conversations
How to Have Difficult Conversations
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
10 Ways Your Boss Kills Employee Motivation
10 Ways Your Boss Kills Employee Motivation10 Ways Your Boss Kills Employee Motivation
10 Ways Your Boss Kills Employee Motivation
 
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
 Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se... Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se...
 
ChatGPT 101 - Vancouver ChatGPT Experts
ChatGPT 101 - Vancouver ChatGPT ExpertsChatGPT 101 - Vancouver ChatGPT Experts
ChatGPT 101 - Vancouver ChatGPT Experts
 
Productivity Facts Every Employee Should Know
Productivity Facts Every Employee Should KnowProductivity Facts Every Employee Should Know
Productivity Facts Every Employee Should Know
 
17 Things Powerful People Say
17 Things Powerful People Say17 Things Powerful People Say
17 Things Powerful People Say
 
How Game Developers Reach New Customers with Twitch
How Game Developers Reach New Customers with Twitch How Game Developers Reach New Customers with Twitch
How Game Developers Reach New Customers with Twitch
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
Everything to know about ChatGPT
Everything to know about ChatGPTEverything to know about ChatGPT
Everything to know about ChatGPT
 
The Hierarchy of Engagement
The Hierarchy of EngagementThe Hierarchy of Engagement
The Hierarchy of Engagement
 
What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?
 
Enabling Autonomy
Enabling AutonomyEnabling Autonomy
Enabling Autonomy
 
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit SummitHabits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
Habits at Work - Merci Victoria Grace, Growth, Slack - 2016 Habit Summit
 
Customer First Creating data-driven products with a human touch by Deliveroo ...
Customer First Creating data-driven products with a human touch by Deliveroo ...Customer First Creating data-driven products with a human touch by Deliveroo ...
Customer First Creating data-driven products with a human touch by Deliveroo ...
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...
Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...
Top 10 Learnings Growing to (Almost) $10 Million ARR: Leo's presentation at S...
 
Boring to Bold: Presentation Design Ideas for Non-Designers
Boring to Bold: Presentation Design Ideas for Non-DesignersBoring to Bold: Presentation Design Ideas for Non-Designers
Boring to Bold: Presentation Design Ideas for Non-Designers
 
How to Teach and Learn with ChatGPT - BETT 2023
How to Teach and Learn with ChatGPT - BETT 2023How to Teach and Learn with ChatGPT - BETT 2023
How to Teach and Learn with ChatGPT - BETT 2023
 

Andere mochten auch

Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql TuningChris Adkin
 
Metaprogramming JavaScript
Metaprogramming  JavaScriptMetaprogramming  JavaScript
Metaprogramming JavaScriptdanwrong
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyMike Brittain
 
C the basic concepts
C the basic conceptsC the basic concepts
C the basic conceptsAbhinav Vatsa
 
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItWhy Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItLeading Edge Process Consultants LLC
 
Project Management With Scrum
Project Management With ScrumProject Management With Scrum
Project Management With ScrumTommy Norman
 
A Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerA Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerManas Das
 
Organizational communication
Organizational communicationOrganizational communication
Organizational communicationNingsih SM
 
Capability Maturity Model
Capability Maturity ModelCapability Maturity Model
Capability Maturity ModelUzair Akram
 
Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Abdul Basit
 
Gear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of IndiaGear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of Indiakichu
 
Root cause analysis - tools and process
Root cause analysis - tools and processRoot cause analysis - tools and process
Root cause analysis - tools and processCharles Cotter, PhD
 
Introduction to Cyber Security
Introduction to Cyber SecurityIntroduction to Cyber Security
Introduction to Cyber SecurityStephen Lahanas
 
Object Oriented Analysis and Design
Object Oriented Analysis and DesignObject Oriented Analysis and Design
Object Oriented Analysis and DesignHaitham El-Ghareeb
 
Agile Transformation and Cultural Change
 Agile Transformation and Cultural Change Agile Transformation and Cultural Change
Agile Transformation and Cultural ChangeJohnny Ordóñez
 
Evolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsEvolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsSai praveen Seva
 
An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)Usersnap
 

Andere mochten auch (20)

Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql Tuning
 
Metaprogramming JavaScript
Metaprogramming  JavaScriptMetaprogramming  JavaScript
Metaprogramming JavaScript
 
Capability maturity model
Capability maturity modelCapability maturity model
Capability maturity model
 
Organizational Communication
Organizational CommunicationOrganizational Communication
Organizational Communication
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at Etsy
 
C the basic concepts
C the basic conceptsC the basic concepts
C the basic concepts
 
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItWhy Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
 
Project Management With Scrum
Project Management With ScrumProject Management With Scrum
Project Management With Scrum
 
A Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerA Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For Beginer
 
Organizational communication
Organizational communicationOrganizational communication
Organizational communication
 
Capability Maturity Model
Capability Maturity ModelCapability Maturity Model
Capability Maturity Model
 
Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8
 
Gear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of IndiaGear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of India
 
6 Thinking Hats
6 Thinking Hats6 Thinking Hats
6 Thinking Hats
 
Root cause analysis - tools and process
Root cause analysis - tools and processRoot cause analysis - tools and process
Root cause analysis - tools and process
 
Introduction to Cyber Security
Introduction to Cyber SecurityIntroduction to Cyber Security
Introduction to Cyber Security
 
Object Oriented Analysis and Design
Object Oriented Analysis and DesignObject Oriented Analysis and Design
Object Oriented Analysis and Design
 
Agile Transformation and Cultural Change
 Agile Transformation and Cultural Change Agile Transformation and Cultural Change
Agile Transformation and Cultural Change
 
Evolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsEvolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systems
 
An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)
 

Ähnlich wie Paris ML meetup

Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018HJ van Veen
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistAlexey Zinoviev
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...Dataiku
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...Infoshare
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Brocade
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflowCharmi Chokshi
 
A step towards machine learning at accionlabs
A step towards machine learning at accionlabsA step towards machine learning at accionlabs
A step towards machine learning at accionlabsChetan Khatri
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxIvo Andreev
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroDaniel Marcous
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixJustin Basilico
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Ali Alkan
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or realityAwantik Das
 

Ähnlich wie Paris ML meetup (20)

Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data Scientist
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJSJavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
Unit no_1.pptx
Unit no_1.pptxUnit no_1.pptx
Unit no_1.pptx
 
tensorflow.pptx
tensorflow.pptxtensorflow.pptx
tensorflow.pptx
 
Apache Mahout
Apache MahoutApache Mahout
Apache Mahout
 
A step towards machine learning at accionlabs
A step towards machine learning at accionlabsA step towards machine learning at accionlabs
A step towards machine learning at accionlabs
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
Data science
Data scienceData science
Data science
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
A Kaggle Talk
A Kaggle TalkA Kaggle Talk
A Kaggle Talk
 
machine learning
machine learningmachine learning
machine learning
 

Mehr von Yves Raimond

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsYves Raimond
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsYves Raimond
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learningYves Raimond
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the WorldYves Raimond
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Yves Raimond
 
Utilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCUtilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCYves Raimond
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBCYves Raimond
 
Publishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebPublishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebYves Raimond
 
Linked data and applications
Linked data and applicationsLinked data and applications
Linked data and applicationsYves Raimond
 
Towards a musical Semantic Web
Towards a musical Semantic WebTowards a musical Semantic Web
Towards a musical Semantic WebYves Raimond
 

Mehr von Yves Raimond (11)

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender Systems
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015
 
Utilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCUtilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBC
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBC
 
Publishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebPublishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the Web
 
Linked data and applications
Linked data and applicationsLinked data and applications
Linked data and applications
 
Web of data
Web of dataWeb of data
Web of data
 
Towards a musical Semantic Web
Towards a musical Semantic WebTowards a musical Semantic Web
Towards a musical Semantic Web
 

Kürzlich hochgeladen

CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxRomil Mishra
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Steel Structures - Building technology.pptx
Steel Structures - Building technology.pptxSteel Structures - Building technology.pptx
Steel Structures - Building technology.pptxNikhil Raut
 
Industrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIESIndustrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIESNarmatha D
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - GuideGOPINATHS437943
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 
Internet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxInternet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxVelmuruganTECE
 
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...Amil Baba Dawood bangali
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 

Kürzlich hochgeladen (20)

POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptx
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Steel Structures - Building technology.pptx
Steel Structures - Building technology.pptxSteel Structures - Building technology.pptx
Steel Structures - Building technology.pptx
 
Industrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIESIndustrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIES
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - Guide
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptx
 
Internet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxInternet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptx
 
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 

Paris ML meetup

  • 1.
  • 2. Machine Learning @ Netflix (and some lessons learned) Yves Raimond (@moustaki) Research/Engineering Manager Search & Recommendations Algorithm Engineering
  • 4. Netflix scale ● > 69M members ● > 50 countries ● > 1000 device types ● > 3B hours/month ● 36% of peak US downstream traffic
  • 5. Recommendations @ Netflix ● Goal: Help members find content to watch and enjoy to maximize satisfaction and retention ● Over 80% of what people watch comes from our recommendations ● Top Picks, Because you Watched, Trending Now, Row Ordering, Evidence, Search, Search Recommendations, Personalized Genre Rows, ...
  • 6. ▪ Regression (Linear, logistic, elastic net) ▪ SVD and other Matrix Factorizations ▪ Factorization Machines ▪ Restricted Boltzmann Machines ▪ Deep Neural Networks ▪ Markov Models and Graph Algorithms ▪ Clustering ▪ Latent Dirichlet Allocation ▪ Gradient Boosted Decision Trees/Random Forests ▪ Gaussian Processes ▪ … Models & Algorithms
  • 8. Build the offline experimentation framework first
  • 9. When tackling a new problem ● What offline metrics can we compute that capture what online improvements we’ re actually trying to achieve? ● How should the input data to that evaluation be constructed (train, validation, test)? ● How fast and easy is it to run a full cycle of offline experimentations? ○ Minimize time to first metric ● How replicable is the evaluation? How shareable are the results? ○ Provenance (see Dagobah) ○ Notebooks (see Jupyter, Zeppelin, Spark Notebook)
  • 10. When tackling an old problem ● Same… ○ Were the metrics designed when first running experimentation in that space still appropriate now?
  • 11. Think about distribution from the outermost layers
  • 12. 1. For each combination of hyper-parameter (e.g. grid search, random search, gaussian processes…) 2. For each subset of the training data a. Multi-core learning (e.g. HogWild) b. Distributed learning (e.g. ADMM, distributed L-BFGS, …)
  • 13. When to use distributed learning? ● The impact of communication overhead when building distributed ML algorithms is non-trivial ● Is your data big enough that the distribution offsets the communication overhead?
  • 14. Example: Uncollapsed Gibbs sampler for LDA (more details here)
  • 15. Design production code to be experimentation-friendly
  • 16. Idea Data Offline Modeling (R, Python, MATLAB, …) Iterate Implement in production system (Java, C++, …) Missing post- processing logic Performance issues Actual outputProduction environment (A/B test) Code discrepancies Final model Data discrepancies Example development process
  • 17. Avoid dual implementations Shared Engine Experiment code Production code ProductionExperiment