Boston Machine Learning
Architecting
Recommender
Systems
Algorithm design, user experience, and system architecture
June 2018
James Kirk
Tools for
Recommender
Systems
41 - 53
Tools for building systems
quickly
Anatomy of
Recommender
Systems
3 - 19
System components and
terminology
Evaluating
Recommender
Systems
54 - 58
What makes a good
recommender system?
What We
Missed
59 - 63
Other subjects in
recommender systems
Designing
Recommender
Systems
20 - 31
Design considerations and
frameworks
Example
Recommender
Systems
32 - 40
Real-world recommender
systems and their
architectures
Table of
contents
2
Anatomy of
Recommender
Systems
Recommendation
A recommendation system presents items to users in
a relevant way.
The definition of relevant is product/context-specific.
Recommendation vs Personalization
Personalization
A personalization system presents recommendations
in a way that is relevant to the individual user.
The user expects their experience to change based
on their interactions with the system.
Relevance can still be product/context specific.
Example:
Recommendation
Example:
Personalization
Users
A user in a recommender system is the party that is
receiving and acting on the recommendations.
Sometimes the user is the context, not an actual
person.
Users vs Items
Items
An item in a recommender system is the passive party
that is being recommended to the users.
The line between these two can be blurry.
Example:
Consultant
Matchmaking
(Hypothetical)
Rec Sys #1
Users: Consultants*
Items: Projects
Recommend projects for the consultant to bid on.
Rec Sys #2
Users: Projects
Items: Consultants
Recommend the right consultant for the project.
Rec Sys #3
Users: Enterprises*
Items: Consultants
Recommend consultants for relationship building.
*Personalized
Positive
Hearts, stars, likes, listens, watches, follows,
bids, purchases, hires, reads, views, upvotes…
❤
Negative
Bans, skips, angry-face-reacts, 1-star reviews,
rejections, unfollows, returns, downvotes…
Interactions
Explicit vs Implicit
Explicit actions are those that a user expects
or intends to impact their personalized
experience.
Implicit actions are all other interactions
between users and items.
Interactions
[Diagram: an interactions matrix with rows User 1 through User 4 and columns Item 1 through Item 6.]
Indicator Features
A feature that is unique to every user/item to
allow for direct personalization.
These features allow recommender systems
to learn about every user individually without
being diluted through metadata.
Often one-hot encoded user IDs or just an
identity matrix.
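As a minimal sketch (a hypothetical 4-user system, plain NumPy), indicator features are just one-hot encoded user IDs, i.e. rows of an identity matrix:

```python
import numpy as np

# Hypothetical system with 4 users: indicator features are
# one-hot encoded user IDs, i.e. an identity matrix.
n_users = 4
user_indicators = np.eye(n_users)

# Row u is the indicator feature vector for user u: all zeros
# except a 1 in column u, so the model can learn weights that
# belong to that user alone, undiluted by metadata.
print(user_indicators[2])  # [0. 0. 1. 0.]
```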
Metadata Features
Age, location, language, tags, labels, word
counts, pre-learned embeddings…
Everything that is known about a user/item
before training can be a feature if properly
structured. Should it be?
Often called “side input” or “shared features.”
User/Item Features
[Diagram: a feature matrix with rows User 1 through User 6; columns split into indicator features and metadata features (encoded labels/tags/etc.). Shape: [n_users x n_user_features] or [n_items x n_item_features].]
Representation
A (typically) low-dimensional vector that
encodes the feature information about the
user or item.
Often called “embedding,” “latent user/item,”
or “latent representation.”
Representation size, which is the dimension of
the latent space, is often referred to as
“components.”
Representation Functions
Representation Function
The process that converts user/item features
into representations.
Learning happens here.
Common examples:
1. Matrix factorization
2. Linear kernels
3. Deep nets
4. Word2Vec
5. Autoencoders
6. None! (Pass-through)
Representation Functions
Image: Eric Nyquist
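A linear representation function (example 2 above) can be sketched in a few lines of NumPy; the weight matrix W here is a random illustrative stand-in for what matrix factorization would actually learn:

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_user_features, n_components = 4, 6, 2

# User features: indicator and/or metadata columns.
user_features = rng.random((n_users, n_user_features))

# A linear representation function is a learned weight matrix
# mapping features into the latent space. (When the features
# are pure indicators, this is matrix factorization.)
W = rng.normal(size=(n_user_features, n_components))

user_representations = user_features @ W
print(user_representations.shape)  # (4, 2)
```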
Prediction
A prediction from a recommender system is an
estimate of an item’s relevance to the user.
Predictions can be ranked for relevance.
The predictions are an indirect approximation
of the interactions.
Prediction Functions
Prediction Function
The process that converts user/item
representations into predictions.
Common examples:
1. Dot product
2. Cosine similarity/distance
3. Euclidean similarity/distance
4. Manhattan similarity/distance*
Some systems use deep nets for prediction,
and this can be an assumption-breaker.
*Actually, Manhattan is rare
Prediction Functions
[Diagram: a user vector and an item vector separated by angle Θ in a 2-component (2-dimensional) latent representation space; δ is the Euclidean distance between them.]
Common examples:
1. Dot product = User · Item
2. Cosine similarity = cos(Θ)
3. Euclidean similarity* = -1 * δ, where δ is the Euclidean distance between User and Item
4. Manhattan similarity = -1 * |User - Item|
*There are many methods for expressing Euclidean similarity
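These prediction functions can be computed directly for a hypothetical user/item pair in a 2-component latent space (a NumPy sketch with invented vectors):

```python
import numpy as np

user = np.array([1.0, 2.0])
item = np.array([3.0, 0.5])

# Dot product: sensitive to both angle and magnitude.
dot = user @ item

# Cosine similarity: angle only, no sense of magnitude.
cosine = dot / (np.linalg.norm(user) * np.linalg.norm(item))

# Euclidean similarity: -1 * δ (negated distance).
euclidean_sim = -np.linalg.norm(user - item)

# Manhattan similarity: -1 * |User - Item| (rare in practice).
manhattan_sim = -np.abs(user - item).sum()
```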
Loss Function
The process that converts predictions and
interactions into error for learning.
Common examples:
1. Root-mean-square error (RMSE)
2. Kullback-Leibler divergence (KLD)
3. Alternating least squares* (ALS)
4. Bayesian personalized ranking* (BPR)
5. Weighted approximately ranked
pairwise (WARP)
6. Weighted margin-rank batch (WMRB)
*These are both a loss and representation function
Loss and Learning
Learning-to-rank
Some loss functions learn to approximate
the values in the interactions matrix.
Other loss functions learn to uprank positive
interactions and downrank negative
interactions (and/or non-interacted items) for
that user.
This second category of loss functions are
called learning-to-rank.
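The two categories can be contrasted with a toy NumPy example: a regression-style loss (RMSE) scores how closely predictions match interaction values, while a simple pairwise hinge (an illustrative stand-in for BPR/WARP-style learning-to-rank, not any one library's exact loss) scores whether positives outrank non-positives:

```python
import numpy as np

# Predicted scores for one user over 4 items, and which items
# the user positively interacted with.
scores = np.array([2.0, 0.5, 1.0, -1.0])
positive = np.array([True, False, True, False])

# Regression-style loss: approximate the interaction values.
targets = positive.astype(float)
rmse = np.sqrt(np.mean((scores - targets) ** 2))

# Learning-to-rank-style loss (pairwise hinge): penalize every
# non-positive item scored within a margin of a positive item.
margin = 1.0
pos, neg = scores[positive], scores[~positive]
pairwise_hinge = np.maximum(0.0, margin - (pos[:, None] - neg[None, :])).mean()
```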
[Diagram: user features and item features (the input data) pass through the user and item representation functions to produce user and item representations. The prediction function converts these representations into predicted scores and predicted ranks (the output data). The loss function compares predictions with the interactions to produce the training loss.]
Y = Prediction
p = Prediction function
r = Representation function
X = Features
Ɛ = Loss
s = Loss Function
N = Interactions
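Using the legend above, one plausible way to write the whole pipeline compactly (the original slide renders this as a diagram) is:

```latex
Y = p\big(r_{user}(X_{user}),\; r_{item}(X_{item})\big), \qquad
\varepsilon = s(Y, N)
```

That is, representation functions map features to representations, the prediction function maps representations to predictions, and the loss function maps predictions and interactions to error.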
Designing
Recommender
Systems
Interactions Features Learning
What are our
interaction values?
We must select interaction values based on
what data is available, how meaningful that
data is, and how it interacts with the rest of the
system.
Considerations
❏ What user behaviors do our interactions
represent?
❏ Explicit vs implicit?
❏ Do we allow for negative interactions?
❏ How dense are our interactions?
❏ Can our recommender handle these
interactions?
How does our system
learn?
We must select representation functions that are
appropriate for our features as well as a
prediction function and loss function that will
learn effectively from this data.
Considerations
❏ What representation functions will best
encode the user/item features?
❏ What prediction function will best estimate
relevance?
❏ What loss function will learn from our data
most effectively?
❏ Do these choices scale?
What are our user/item
features?
We must select user/item features from the
data available, ensure that the data is
meaningful to the recommender system, and
ensure that our use of this data is appropriate.
Considerations
❏ Do we use indicator features?
❏ What useful metadata is available?
❏ Does the metadata require feature
engineering?
❏ Do users expect this metadata to impact
their recommendations?
What user behaviors do our
interactions represent?
Interaction values should be an
approximation of the intended effect of the
recommender system on user behavior.
If we want people to purchase, our
interactions should be related to purchases.
If we want people to binge episodes of
shows for longer, our interactions should be
related to the act of binging.
What are our interaction values?
Explicit vs Implicit
When the user gave you this signal, did they
intend/expect it to alter their
recommendations?
Some explicit signals don’t work well as
interactions.
Negative explicit signals should be handled
with simple product logic.
“You might give five stars to Hotel Rwanda and two
stars to Captain America, but you’re much more likely to
watch Captain America.”
-Todd Yellin, Netflix, You May Also Like
What are our interaction values?
Explicit vs Implicit
Does the user know we are using this signal for
recommendation?
Does the user care we are using this signal for
recommendation?
Is it ethical for us to use this signal for
recommendation?
Do we allow negative interactions?
Negative interactions can be valuable statements of what content to avoid.
Negative interactions can be confusing when learning-to-rank.
Not all loss functions accommodate negative interactions.
Which ordering is better? Three example rankings of nine recommendations:
Rank  Ordering A  Ordering B  Ordering C
1     Positive    Positive    Positive
2     Positive    Positive    Positive
3     No-int      Negative    No-int
4     No-int      Negative    Negative
5     No-int      Negative    No-int
6     No-int      No-int      Negative
7     Negative    No-int      No-int
8     Negative    No-int      Negative
9     Negative    No-int      No-int
Confusing?
What are our interaction values?
Do we use indicator
features?
Indicator features allow for powerful
personalization but are as numerous as our
users/items.
Recommenders with user indicators cannot
effectively make recommendations for new
users* (the cold-start problem).
Many users means many indicator features;
this may not scale.
*Vice-versa is true for new items
What are our user/item features?
What useful metadata is
available?
What user/item metadata do we have that is
relevant?
Metadata that is useful but missing can be
requested from users, crowd-sourced, or
inferred with other ML systems.
Does the metadata require
feature engineering?
Pre-processing features can improve
recommender learning.
Some features may be useless/misleading
without feature engineering.
The choice of representation function
impacts the usefulness of feature
engineering.
What are our user/item features?
Do users expect this
metadata to impact their
recommendations?
Is the use of this metadata ethical*?
Users can be surprised when changing
metadata impacts product experience.
*There is a distinction between metadata used in training
and metadata used in evaluation.
What representation
functions will best encode
the user/item features?
Linear kernels are effective if all we have are
indicator features or well-engineered
features. (Matrix factorization)
More complex relationships may lead us to
neural nets. How does their architecture
impact the recommender? (Use of the latent
space)
Can the representation be learned without
interaction? (Auto-encoders, word2vec, etc)
How does our system learn?
What prediction function will
best estimate relevance?
Dot-product prediction accounts for
representation relevance and magnitude.
Cosine prediction optimizes for relevance but
has no sense for magnitude.
Euclidean prediction builds a map of items but
also has no sense for magnitude.
Should items be biased, given our choice?
What loss function will learn
from our data most
effectively?
Do we want to estimate interactions, or
perform learning-to-rank?
Should the loss function accommodate
negative interactions? (RMSE, KLD…)
Should the loss function be sensitive to
interaction magnitude? (RMSE, B-WMRB…)
Tweaking the loss function can dramatically
change how recommendations feel.
How does our system learn?
Sparse vs Dense vs Sampled
Some implementations of loss functions only
account for user/item pairs with interactions.
These same loss functions can be written to
compare every possible user/item pair. These
predictions and losses are dense, and they can
be expensive.
Some of the most effective and efficient loss
functions learn by comparing pairs with
interactions against sampled pairs.* (WARP,
WMRB)
* There are many methods for sampling candidate pairs
Example: WMRB
WMRB approximates positive item rank
against a random sample and upranks
positive items through a hinge loss.
How does our system learn?
x = User
y = Positive item
y' = Non-positive item
Y = All items
Z = Random sample of non-positive items
p = Prediction function
(The formula applies a hinge loss over the randomly sampled non-positive items.)
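A NumPy sketch of this idea for one user and one positive item; the sample size, unit margin, and log form are illustrative assumptions rather than the exact WMRB formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Scores p(x, y) for one user x over all items Y (random stand-ins).
scores = rng.normal(size=100)
positive_item = 7          # y: an item this user interacted with
n_items = scores.shape[0]  # |Y|

# Z: random sample of non-positive items.
candidates = np.setdiff1d(np.arange(n_items), [positive_item])
Z = rng.choice(candidates, size=10, replace=False)

# Hinge over the sample: how badly sampled items score within
# a unit margin of the positive item.
hinge = np.maximum(0.0, 1.0 - scores[positive_item] + scores[Z])

# Scale the sampled sum up to estimate the positive item's
# margin rank over the full catalogue, then take a log loss.
rank_estimate = (n_items / len(Z)) * hinge.sum()
loss = np.log(1.0 + rank_estimate)
```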
Example: Balancing WMRB
If we notice an undue popularity bias, we can
balance this by accounting for interaction
magnitudes and popularity.
How does our system learn?
x = User
y = Positive item
X = All users
p = Prediction function
n = Interaction magnitude for pair (user, item)
Balancing
Factor
We can think about a recommender system
architecture as a set of top-level decisions.
When designing recommender systems, we
are evaluating the tradeoffs between these
decisions and the relationships between
these choices.
A Framework for Recommender Systems
Interactions ?
User Features ?
User Representation ?
Item Features ?
Item Representation ?
Prediction ?
Learning ?
Example
Recommender
Systems
A collaborative filter learns representations
from interactions and uses these to make
personalized recommendations, often
through matrix factorization.
Pure collaborative filters are metadata-naïve.
Example: Collaborative Filter
Interactions *
(Positive only?)
User Features Indicator
User Representation Linear
Item Features Indicator
Item Representation Linear
Prediction *
(Dot-product for MF)
Learning ALS, BPR, SVD, PCA, NMF...
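A minimal pure-NumPy sketch of this architecture: indicator features are implicit in the row indexing of U and V, the representations are learned here by plain gradient descent on squared error (an illustrative stand-in for ALS/SVD-style solvers), and prediction is the dot product:

```python
import numpy as np

rng = np.random.default_rng(0)

# Interactions matrix: 4 users x 6 items (1 = positive interaction).
N = np.array([
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1],
], dtype=float)

n_users, n_items, k = 4, 6, 2
U = rng.normal(scale=0.1, size=(n_users, k))   # user representations
V = rng.normal(scale=0.1, size=(n_items, k))   # item representations

# Gradient descent on squared reconstruction error.
for _ in range(500):
    E = U @ V.T - N
    U, V = U - 0.05 * (E @ V), V - 0.05 * (E.T @ U)

# Dot-product prediction function: rank items per user by score.
predictions = U @ V.T
```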
A content-based recommender learns the
item features to which a user is affined.
Purely content-based systems perform no
transfer learning between users.
This allows easy rec-splanation.
This requires clean item metadata.
Example: Content-based Recommender
Interactions *
User Features Indicator
User Representation Linear
Item Features Metadata
Item Representation None
(n_components = n_item_features)
Prediction Dot-product
Learning *
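A deliberately simplified sketch of the idea: here the user "representation" is just the average metadata of interacted items rather than a learned linear kernel, and the item representation function is None (the metadata passes through):

```python
import numpy as np

# Item metadata features (rows: 4 items; columns: hypothetical
# tags, e.g. ["comedy", "drama", "subtitled"]).
item_features = np.array([
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
], dtype=float)

# One user's positive interactions: items 0 and 1.
interacted = np.array([0, 1])

# Simplified "learning": the user's profile is the mean metadata
# of interacted items (a real system would learn a linear kernel
# from the interactions instead).
user_profile = item_features[interacted].mean(axis=0)  # [1.0, 0.0, 0.5]

# Dot-product prediction against the raw item metadata
# (n_components = n_item_features).
scores = item_features @ user_profile
```

Because the scores are tied directly to named tags, rec-splanation is easy: each item's score decomposes into the tags it shares with the user's profile.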
A hybrid recommender system learns
representations for both user and item
metadata and indicators, if available.
This opens a lot of options for us.
Example: Hybrid Recommender System
Interactions *
User Features *
User Representation *
Item Features *
Item Representation *
Prediction *
Learning *
We can build a hybrid recommender system
to recommend personalized products based
on past purchases.
Example: Purchase Recommendations
Interactions Purchases
User Features Indicator
User Representation Linear
Item Features Indicator + Metadata
Item Representation *
Prediction Dot-product
Learning *
We can use the pre-trained purchase
recommender’s representations to provide
recommendations in a new context.
In this system, the “user” is the context item,
not the person using our product.
Example: “You May Also Like” (YMAL)
Interactions X
User Features Context Item Repr
User Representation None
Item Features All Item Reprs
Item Representation None
Prediction Dot-product, Cosine?
Learning X
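A sketch of the YMAL lookup, assuming item representations have already been learned by the purchase recommender (random stand-ins here): the context item's representation plays the role of the "user", and cosine similarity ranks all other items.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-learned item representations (hypothetical: 50 items in an
# 8-dimensional latent space, learned by the purchase recommender).
item_reprs = rng.normal(size=(50, 8))

def ymal(context_item, k=5):
    """'You May Also Like': treat the context item's representation
    as the user and score all items by cosine similarity."""
    u = item_reprs[context_item]
    norms = np.linalg.norm(item_reprs, axis=1) * np.linalg.norm(u)
    sims = item_reprs @ u / norms
    sims[context_item] = -np.inf  # don't recommend the item itself
    return np.argsort(-sims)[:k]

similar = ymal(3)
```

No learning happens in this step; it only reuses representations learned upstream.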
We can take the output of the YMAL
recommender and re-rank the items based
on the customer’s representation.
This system does not learn. The learning’s
already been done.
Example: Personalized “You May Also Like”
Interactions X
User Features User Reprs
User Representation None
Item Features Similar Item Reprs
Item Representation None
Prediction Dot-product
Learning X
Example: Personalized “You May Also Like”
Purchase
Recommender
System
“YMAL”
Recommender
System
“YMAL”
Personalization
System
Step 1:
Learn to personalize
purchasing
recommendations
Step 2:
Use previous learning to
calculate the most similar
items
Step 3:
Personalize the similar
items by re-ranking
OR
Contextualize purchase
recommendations by
limiting the item set
Example: YouTube (Covington, Adams, Sargin)
Interactions Watches + Searches
User Features Geography, Age, Gender...
User Representation Deep net
Item Features
Pre-learned embeddings,
language, previous impressions...
Item Representation Deep net
Prediction Deep net
Learning Sampled Cross-Entropy
Tools for
Recommender
Systems
Implicit
Interactions *
User Features Indicator
User Representation Linear
Item Features Indicator
Item Representation Linear
Prediction Dot-product
Learning ALS, BPR
Implicit is a Python collaborative filter toolkit
that uses matrix factorization to learn
representations.
Includes factorization classes for ALS and
BPR.
Made by Ben Frederickson.
MIT License
Scikit-Learn
Interactions *
User Features Indicator
User Representation Linear
Item Features Indicator
Item Representation Linear
Prediction Dot-product
Learning SVD, PCA, NMF...
Scikit-learn is a Python machine learning
toolkit with many tools for feature
engineering and machine learning.
The decomposition package contains some
classes that can be used for matrix
factorization recommender systems like SVD,
PCA, NMF...
Maintained by volunteers.
BSD license
LightFM
Interactions *
User Features *
User Representation Linear
Item Features *
Item Representation Linear
Prediction Dot-product
Learning Logistic, BPR, WARP
LightFM is a Python hybrid recommender
system that uses matrix factorization to learn
representations.
Made by Lyst - a fashion shopping website.
Apache-2.0 license
TensorRec is a Python hybrid recommender
system framework for developing whole
recommender systems quickly.
Representation functions, prediction
functions, and loss functions can be
customized using TensorFlow or Keras.
Made by James Kirk.
Apache-2.0 license
TensorRec
Interactions *
User Features *
User Representation Linear, Deep nets, None...
Item Features *
Item Representation Linear, Deep nets, None...
Prediction
Dot-product, Cosine,
Euclidean...
Learning RMSE, KLD, WMRB...
Hey, that’s me
Annoy is a tool for fast similarity search
written in C++ with Python bindings.
Useful for building systems to serve
recommendations from pre-learned
representations.
Made by Spotify.
Apache-2.0 license
ANNOY (Approximate Nearest Neighbors Oh Yeah)
Interactions X
User Features X
User Representation X
Item Features X
Item Representation X
Prediction
Cosine, Euclidean,
Manhattan, Hamming
Learning X
Faiss is a tool for fast similarity search
written in C++ with Python bindings.
Useful for building systems to serve
recommendations from pre-learned
representations.
Allows item biases.
Made by Facebook.
BSD license
FAISS (Facebook AI Similarity Search)
Interactions X
User Features X
User Representation X
Item Features X
Item Representation X
Prediction Dot-product, Euclidean
Learning X
NMSLib is a tool for fast similarity search
written in C++ with Python bindings.
Useful for building systems to serve
recommendations from pre-learned
representations.
Made by Bilegsaikhan Naidan, Leonid
Boytsov, Yury Malkov, David Novak, Ben
Frederickson.
Apache-2.0 license, with some
MIT and GNU components
NMSLib (Non-Metric Space Library)
Interactions X
User Features X
User Representation X
Item Features X
Item Representation X
Prediction Cosine, Euclidean
Learning X
We can build a hybrid recommender system
to recommend personalized news articles
based on past reading.
Requirements:
1. We have to learn the tastes of
individual users.
2. We know users’ home location with
low resolution (country/state).
3. Articles are ephemeral. All items are
cold-start items.
4. We can vectorize article contents and
tagged categories. (politics, sports…)
5. We have to serve production-scale
user traffic.
6. We don’t have to do rec-splanation.
Example: News Article Recommendation
Interactions Clicks, page dwells...
User Features
Indicator +
vectorized locations
User Representation Linear
Item Features
TF-IDF of contents +
vectorized categories
Item Representation Deep net
Prediction Cosine
Learning Balanced WMRB
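The item-feature side of this design (requirement 4) can be sketched with scikit-learn; the articles and category tags below are invented examples:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer

articles = [
    "senate passes budget bill after long debate",
    "team wins championship in overtime thriller",
]
categories = [["politics"], ["sports"]]

# TF-IDF of article contents.
tfidf = TfidfVectorizer()
content_features = tfidf.fit_transform(articles).toarray()

# One-hot ("vectorized") tagged categories.
mlb = MultiLabelBinarizer()
category_features = mlb.fit_transform(categories)

# Concatenate into the item feature matrix fed to the recommender.
# Because items are pure metadata (no indicators), brand-new
# articles can be featurized the same way at serving time.
item_features = np.hstack([content_features, category_features])
```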
Example: News Article Recommendation
Daily Model Training
Scikit-learn
Feature
Transformation
TensorRec
Recommender
System
Annoy
Ranking
Step 1:
Vectorize historical
article contents and
metadata
Step 2:
Use vectorized article
features to learn user
representations and train
a deep net for article
representation
Step 3:
Build Annoy indices
Scikit-learn
Feature
Transformation
TensorRec
Recommender
System
Annoy
Ranking
Step 1:
Vectorize new article
contents and metadata
Step 2:
Use trained deep net to
calculate new article
representation
Step 3:
Rebuild Annoy indices
with the new article
Example: News Article Recommendation
Handling New Articles
Database
Representation
Storage
Annoy
Ranking
Step 1:
Retrieve the user
representation from the
database
Step 2:
Find most relevant
articles for the user
Example: News Article Recommendation
Serving User Traffic
Example: MovieLens with TensorRec
Interactions Movie ratings
User Features Indicator
User Representation Linear
Item Features Indicator + Movie Tags
Item Representation Linear
Prediction Dot-product
Learning Balanced WMRB
Evaluating
Recommender
Systems
Offline Evaluation
Many metrics are available for offline
evaluation by comparing predictions against
known interactions.
Most measure ranking accuracy; others
measure novelty, diversity, and coverage.
Precision@K, Recall@K, NDCG@K…
Precision@K: “What percentage of the top K
items were positively interacted?”
Recall@K: “What percentage of users’
positively interacted items were in the top K
results?”
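Both metrics are a few lines of Python (the ranked list and positive set below are invented):

```python
def precision_at_k(recommended, positives, k):
    """Fraction of the top-k recommended items the user
    positively interacted with."""
    hits = len(set(recommended[:k]) & set(positives))
    return hits / k

def recall_at_k(recommended, positives, k):
    """Fraction of the user's positive items that appear
    in the top-k recommendations."""
    hits = len(set(recommended[:k]) & set(positives))
    return hits / len(positives)

recommended = [4, 1, 9, 2, 7]  # ranked item IDs for one user
positives = [1, 2, 5, 8]       # items the user interacted with

p5 = precision_at_k(recommended, positives, 5)  # 2 hits / 5 = 0.4
r5 = recall_at_k(recommended, positives, 5)     # 2 hits / 4 = 0.5
```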
What makes a good recommender system?
Offline Pitfalls
Many offline metrics don’t represent fairness
of performance between users or items.
These metrics can be useful for
hyperparameter optimization, but often fail to
evaluate the “feel” of recommendations.
It is hard to use offline metrics to state that
one recommender system is better than
another.
Example: Offline Pitfalls
Three recommendation results for two users.
User 1 has 5 positive interactions.
User 2 has 2 positive interactions.
The third recommendation system is the
most broadly effective, and probably the
“best.” Precision fails to identify that, but
recall does.
You can concoct similar pitfalls for recall or
NDCG.
What makes a good recommender system?
[Table: top-5 results for Users 1 and 2 under three systems; “T” marks a positively interacted item.]
System 1: User 1 hits 4 of 5 positives, User 2 hits 1 of 2. P@5: 0.5, R@5: 0.65
System 2: User 1 hits 5 of 5, User 2 hits 0 of 2. P@5: 0.5, R@5: 0.5
System 3: User 1 hits 3 of 5, User 2 hits 2 of 2. P@5: 0.5, R@5: 0.8
Online Evaluation
When rolling-out a new recommender
system, the truest test is an A/B test with an
existing system.
The most effective feedback comes from
user interviewing and monitoring the user
behaviors the system is intended to drive.
If there is no existing system, do phased
roll-outs with quant/qual feedback.*
User interviewing is the only way to evaluate
the “feel” of recommendations.
*Fellow employees make great, but biased, guinea pigs
What makes a good recommender system?
Feel?
“I already own a crib, why would I need
another?”
Missing item filtering based on metadata?
“These songs are excellent, but I already
know these bands.”
Maybe we should target discovery?
“I’ve watched Captain America twenty
times, but that doesn’t mean I only want to
watch Marvel movies. What about the
sitcoms I watch?”
Maybe we’re oversimplifying the user’s
representation?
All Algorithms Are Biased
There are biases innate in the data we use,
the way users interact with our products, and
the way our algorithms learn.
Controlling for this is not as simple as setting
biased=False.
When designing these systems, we have a
responsibility to, at the least, understand the
biases in our products.
You wouldn’t ship a product without tests.
You shouldn’t ship a RecSys without
examining bias.
Algorithmic Bias and Fairness
Understanding Fairness
There are many definitions of fairness.
Some cross-section recommender
performance by user and item metadata.
C-fairness
Is recommendation recall significantly lower
for customers in Massachusetts?
P-fairness
Are movies with female leads recommended
less often than in the natural distribution of
movie watching?
Missing metadata? Crowdsource it, but be
careful with sensitive metadata.
What We Missed
Sequence-based models
In what order do our users interact with
our items?
Mixture-of-tastes models
Is one representation per user enough
for users with diverse tastes?
Rec-splanation
How do system design choices impact
interpretability?
Attention models
Can we learn more nuance in user
representations than just a single vector?
Graphical models
Can we map relationships between
users, items, and their attributes?
Cold-start problems
How do we make recommendations for
brand-new users?
Wait, is it “recommender systems”
or “recommendation systems?”
¯\_(ツ)_/¯
Thank you!
Questions?
James Kirk
@jiminy_kirket
/jkirk12
@jameskirk1
/jfkirk

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation SystemsTrieu Nguyen
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectiveJustin Basilico
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringViet-Trung TRAN
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender systemStanley Wang
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Alexandros Karatzoglou
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsJustin Basilico
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systemsNAVER Engineering
 
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Xavier Amatriain
 
Recommendation system
Recommendation system Recommendation system
Recommendation system Vikrant Arya
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Balázs Hidasi
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation SystemAnamta Sayyed
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender SystemsT212
 
Movie lens recommender systems
Movie lens recommender systemsMovie lens recommender systems
Movie lens recommender systemsKapil Garg
 
Interactive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and SpotifyInteractive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and SpotifyChris Johnson
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation SystemsRobin Reni
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectiveXavier Amatriain
 
Recommendation system
Recommendation systemRecommendation system
Recommendation systemAkshat Thakar
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systemsFalitokiniaina Rabearison
 

Was ist angesagt? (20)

Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
 
Recommender system
Recommender systemRecommender system
Recommender system
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systems
 
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Movie lens recommender systems
Movie lens recommender systemsMovie lens recommender systems
Movie lens recommender systems
 
Interactive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and SpotifyInteractive Recommender Systems with Netflix and Spotify
Interactive Recommender Systems with Netflix and Spotify
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
 
Recommendation system
Recommendation systemRecommendation system
Recommendation system
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems
 

Ähnlich wie Boston ML - Architecting Recommender Systems

Movie Recommender System Using Artificial Intelligence (Shrutika Oswal)
Colleges yvonne van_laarhoven (Digital Power)
Modelling Personalization (Bogo Vatovec)
Usability in product development (Ravi Shyam)
Collaborative Filtering Recommendation System (Milind Gokhale)
Essay On Mkt Research (Sara Harris)
Towards Responsible AI - KC.pptx (Luis775803)
IRJET- Opinion Mining and Sentiment Analysis for Online Review (IRJET Journal)
A Novel Jewellery Recommendation System using Machine Learning and Natural La... (IRJET Journal)
The subtle art of recommendation (Simon Belak)
Artifact Facet Ranking and It’sApplications (MangaiK4)
Design process design rules (Preeti Mishra)
Introduction to MaxDiff Scaling of Importance - Parametric Marketing Slides (QuestionPro)
IRJET- Hybrid Book Recommendation System (IRJET Journal)
data-science-lifecycle-ebook.pdf (Danilo Cardona)
How to write use cases
This Chapter Will Describe About The Software Requirements... (Anita Strong)
Different Methodologies For Testing Web Application Testing (Rachel Davis)
Prompt-Based Techniques for Addressing the Initial Data Scarcity in Personali... (IRJET Journal)
Personalized recommendation for cold start users (IRJET Journal)



Boston ML - Architecting Recommender Systems

  • 1. Boston Machine Learning Architecting Recommender Systems Algorithm design, user experience, and system architecture June 2018 James Kirk
  • 2. Tools for Recommender Systems 41 - 53 Tools for building systems quickly Anatomy of Recommender Systems 3 - 19 System components and terminology Evaluating Recommender Systems 54 - 58 What makes a good recommender system? What We Missed 59 - 63 Other subjects in recommender systems Designing Recommender Systems 20 - 31 Design considerations and frameworks Example Recommender Systems 32 - 40 Real-world recommender systems and their architectures Table of contents 2
  • 4. Recommendation A recommendation system presents items to users in a relevant way. The definition of relevant is product/context-specific. Recommendation vs Personalization Personalization A personalization system presents recommendations in a way that is relevant to the individual user. The user expects their experience to change based on their interactions with the system. Relevance can still be product/context specific.
  • 7. Users A user in a recommender system is the party that is receiving and acting on the recommendations. Sometimes the user is the context, not an actual person. Users vs Items Items An item in a recommender system is the passive party that is being recommended to the users. The line between these two can be blurry.
  • 8. Example: Consultant Matchmaking (Hypothetical) *Personalized Rec Sys #1 Users: Consultants* Items: Projects Recommend projects for the consultant to bid on. Rec Sys #2 Users: Projects Items: Consultants Recommend the right consultant for the project. Rec Sys #3 Users: Enterprises* Items: Consultants Recommend consultants for relationship building.
  • 9. Positive Hearts, stars, likes, listens, watches, follows, bids, purchases, hires, reads, views, upvotes… ❤ Negative Bans, skips, angry-face-reacts, 1-star reviews, rejections, unfollows, returns, downvotes… Interactions Explicit vs Implicit Explicit actions are those that a user expects or intends to impact their personalized experience. Implicit actions are all other interactions between users and items.
  • 10. Interactions User 1 User 2 User 3 User 4 Item 1 Item 2 Item 3 Item 4 Item 5 Item 6
  • 11. Indicator Features A feature that is unique to every user/item to allow for direct personalization. These features allow recommender systems to learn about every user individually without being diluted through metadata. Often one-hot encoded user IDs or just an identity matrix. Metadata Features Age, location, language, tags, labels, word counts, pre-learned embeddings… Everything that is known about a user/item before training can be a feature if properly structured. Should it be? Often called “side input” or “shared features.” User/Item Features
  • 12. User/Item Features Indicator Features Metadata Features Encoded Labels/Tags/etc. [n_users x n_user_features] or [n_items x n_item_features] User 1 User 2 User 3 User 4 User 5 User 6
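The feature layout on this slide can be sketched in plain Python: a one-hot indicator block (effectively an identity matrix, one column per user) concatenated with encoded metadata columns, giving a [n_users x n_user_features] matrix. The user IDs and metadata values below are made up for illustration.

```python
# Assemble user features: one-hot indicator columns plus metadata columns.
# All names and values here are illustrative, not from the talk.
users = ["u1", "u2", "u3"]
metadata = {            # hypothetical per-user metadata, already encoded
    "u1": [25, 1, 0],   # age, lang_en, lang_de
    "u2": [31, 0, 1],
    "u3": [19, 1, 0],
}

def build_features(user_ids, meta):
    rows = []
    for idx, uid in enumerate(user_ids):
        # Indicator block: 1 only in this user's own column (an identity matrix).
        indicator = [1 if j == idx else 0 for j in range(len(user_ids))]
        rows.append(indicator + meta[uid])
    return rows

features = build_features(users, metadata)
print(features[0])  # [1, 0, 0, 25, 1, 0]
```

The same construction applies to item features, with one indicator column per item.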
  • 13. Representation A (typically) low-dimensional vector that encodes the feature information about the user or item. Often called “embedding,” “latent user/item,” or “latent representation.” Representation size, which is the dimension of the latent space, is often referred to as “components.” Representation Functions Representation Function The process that converts user/item features into representations. Learning happens here. Common examples: 1. Matrix factorization 2. Linear kernels 3. Deep nets 4. Word2Vec 5. Autoencoders 6. None! (Pass-through)
  • 15. Prediction A prediction from a recommender system is an estimate of an item’s relevance to the user. Predictions can be ranked for relevance. The predictions are an indirect approximation of the interactions. Prediction Functions Prediction Function The process that converts user/item representations into predictions. Common examples: 1. Dot product 2. Cosine similarity/distance 3. Euclidean similarity/distance 4. Manhattan similarity/distance* Some systems use deep nets for prediction, and this can be an assumption-breaker. *Actually, Manhattan is rare
  • 16. Prediction Functions User Item Θ 2-Component Latent Representation Space (2-Dimensional) Common examples: 1. Dot product = User · Item 2. Cosine similarity = cos(Θ) 3. Euclidean similarity* = ( -1 * δ ) 4. Manhattan similarity = ( -1 * |User - Item| ) *There are many methods for expressing euclidean similarity δ
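A minimal sketch of the prediction functions listed above, assuming 2-component latent representations. The user and item vectors are made-up values, and the Euclidean variant shown (negated distance) is just one of the many ways to express Euclidean similarity.

```python
import math

# Illustrative 2-component latent representations (values are made up).
user = [0.8, 0.3]
item = [0.5, 0.9]

def dot_product(u, v):
    # Accounts for both direction (relevance) and magnitude.
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    # Direction only: magnitudes are normalized away.
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot_product(u, v) / norm

def euclidean_similarity(u, v):
    # Negated distance, so that closer pairs score higher.
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

print(dot_product(user, item))         # 0.8*0.5 + 0.3*0.9 ≈ 0.67
print(cosine_similarity(user, item))   # in (-1, 1]
print(euclidean_similarity(user, item))  # ≤ 0
```

Manhattan similarity (negated L1 distance) would follow the same pattern with `abs(a - b)` in place of the squared difference.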
  • 17. Loss Function The process that converts predictions and interactions into error for learning. Common examples: 1. Root-mean-square error (RMSE) 2. Kullback-Leibler divergence (KLD) 3. Alternating least squares* (ALS) 4. Bayesian personalized ranking* (BPR) 5. Weighted approximately ranked pairwise (WARP) 6. Weighted margin-rank batch (WMRB) *These are both a loss and representation function Loss and Learning Learning-to-rank Some loss functions learn to approximate the values in the interactions matrix. Other loss functions learn to uprank positive interactions and downrank negative interactions (and/or non-interacted items) for that user. This second category of loss functions is called learning-to-rank.
  • 18. User Features Item Features Interactions User Representation Item Representation User Representation Function Item Representation Function Prediction Function Predicted Scores Predicted Ranks Training Loss Loss Function InputData Output Data
  • 19. Notation: Y = prediction, p = prediction function, r = representation function, X = features, Ɛ = loss, s = loss function, N = interactions
  • 21. Interactions Features Learning What are our interaction values? We must select interaction values based on what data is available, how meaningful that data is, and how it interacts with the rest of the system. Considerations ❏ What user behaviors do our interactions represent? ❏ Explicit vs implicit? ❏ Do we allow for negative interactions? ❏ How dense are our interactions? ❏ Can our recommender handle these interactions? How does our system learn? We must select representation functions that are appropriate for our features as well as a prediction function and loss function that will learn effectively from this data. Considerations ❏ What representation functions will best encode the user/item features? ❏ What prediction function will best estimate relevance? ❏ What loss function will learn from our data most effectively? ❏ Do these choices scale? What are our user/item features? We must select user/item features from the data available, ensure that the data is meaningful to the recommender system, and ensure that our use of this data is appropriate. Considerations ❏ Do we use indicator features? ❏ What useful metadata is available? ❏ Does the metadata require feature engineering? ❏ Do users expect this metadata to impact their recommendations?
  • 22. What user behaviors do our interactions represent? Interaction values should be an approximation of the intended effect of the recommender system on user behavior. If we want people to purchase, our interactions should be related to purchases. If we want people to binge episodes of shows for longer, our interactions should be related to the act of binging. What are our interaction values? Explicit vs Implicit When the user gave you this signal, did they intend/expect it to alter their recommendations? Some explicit signals don’t work well as interactions. Negative explicit signals should be handled with simple product logic. “You might give five stars to Hotel Rwanda and two stars to Captain America, but you’re much more likely to watch Captain America.” -Todd Yellin, Netflix, You May Also Like
  • 23. What are our interaction values? Explicit vs Implicit Does the user know we are using this signal for recommendation? Does the user care we are using this signal for recommendation? Is it ethical for us to use this signal for recommendation?
  • 24. 1. Positive Positive Positive 2. Positive Positive Positive 3. No-int Negative No-int 4. No-int Negative Negative 5. No-int Negative No-int 6. No-int No-int Negative 7. Negative No-int No-int 8. Negative No-int Negative 9. Negative No-int No-int Confusing?Do we allow negative interactions? Negative interactions can be valuable statements of what content to avoid. Negative interactions can be confusing when learning-to-rank. Not all loss functions accommodate negative interactions. What are our interaction values? Which ordering is better?
  • 25. Do we use indicator features? Indicator features allow for powerful personalization but are as numerous as our users/items. Recommenders with user indicators can not effectively make recommendations for new users* (the cold-start problem). Many users means many indicator features -- this may not scale. *Vice-versa is true for new items What are our user/item features? What useful metadata is available? What user/item metadata do we have that is relevant? Metadata that is useful but missing can be requested from users, crowd-sourced, or inferred with other ML systems.
  • 26. Does the metadata require feature engineering? Pre-processing features can improve recommender learning. Some features may be useless/misleading without feature engineering. The choice of representation function impacts the usefulness of feature engineering. What are our user/item features? Do users expect this metadata to impact their recommendations? Is the use of this metadata ethical*? Users can be surprised when changing metadata impacts product experience. *There is a distinction between metadata used in training and metadata used in evaluation.
  • 27. What representation functions will best encode the user/item features? Linear kernels are effective if all we have are indicator features or well-engineered features. (Matrix factorization) More complex relationships may lead us to neural nets. How does their architecture impact the recommender? (Use of the latent space) Can the representation be learned without interaction? (Auto-encoders, word2vec, etc) How does our system learn? What prediction function will best estimate relevance? Dot-product prediction accounts for representation relevance and magnitude. Cosine prediction optimizes for relevance but has no sense for magnitude. Euclidean prediction builds a map of items but also has no sense for magnitude. Should items be biased, given our choice?
  • 28. What loss function will learn from our data most effectively? Do we want to estimate interactions, or perform learning-to-rank? Should the loss function accommodate negative interactions? (RMSE, KLD…) Should the loss function be sensitive to interaction magnitude? (RMSE, B-WMRB…) Tweaking the loss function can dramatically change how recommendations feel. How does our system learn? Sparse vs Dense vs Sampled Some implementations of loss functions only account for user/item pairs with interactions. These same loss functions can be written to compare every possible user/item pair. These predictions and losses are dense, and they can be expensive. Some of the most effective and efficient loss functions learn by comparing pairs with interactions against sampled pairs.* (WARP, WMRB) * There are many methods for sampling candidate pairs
  • 29. Example: WMRB WMRB approximates positive item rank against a random sample and upranks positive items through a hinge loss. How does our system learn? Notation: x = user, y = positive item, y’ = non-positive item, Y = all items, Z = random sample of non-positive items, p = prediction function. (Equation callouts: hinge term, random-sampling term.)
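The sampling-and-hinge idea can be sketched as follows. This is an illustrative approximation of the WMRB approach described above, not the exact published formulation; the scores, sample size, and scaling are made up for the sketch.

```python
import math
import random

random.seed(0)

def wmrb_loss(pos_score, all_neg_scores, sample_size):
    # Score only a random sample of non-positive items, not all of them.
    sampled = random.sample(all_neg_scores, sample_size)
    # Hinge on the margin: a sampled item contributes when it scores within
    # 1.0 of the positive item, i.e. the positive item is not clearly upranked.
    margin_sum = sum(max(0.0, 1.0 - pos_score + s) for s in sampled)
    # Scale the sampled sum up to approximate the rank over the full item set.
    approx_rank = (len(all_neg_scores) / sample_size) * margin_sum
    # Log dampens the penalty for very poorly ranked positives.
    return math.log(1.0 + approx_rank)

# Hypothetical predicted scores for 1000 non-positive items.
neg_scores = [random.gauss(0.0, 1.0) for _ in range(1000)]

# A well-ranked positive item (high score) incurs much less loss
# than a poorly ranked one (low score).
print(wmrb_loss(pos_score=3.0, all_neg_scores=neg_scores, sample_size=50))
print(wmrb_loss(pos_score=-1.0, all_neg_scores=neg_scores, sample_size=50))
```

In a real system the positive score and negative scores would come from the prediction function p, and the gradient of this loss would drive the representation learning.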
  • 30. Example: Balancing WMRB If we notice an undue popularity bias, we can balance this by accounting for interaction magnitudes and popularity. How does our system learn? x = User y = Positive item X = All users p = Prediction function n = Interaction magnitude for pair (user, item) Balancing Factor
  • 31. We can think about a recommender system architecture as a set of top-level decisions. When designing recommender systems, we are evaluating the tradeoffs between these decisions and the relationships between these choices. A Framework for Recommender Systems Interactions ? User Features ? User Representation ? Item Features ? Item Representation ? Prediction ? Learning ?
  • 33. A collaborative filter learns representations from interactions and uses these to make personalized recommendations, often through matrix factorization. Pure collaborative filters are metadata-naïve. Example: Collaborative Filter Interactions * (Positive only?) User Features Indicator User Representation Linear Item Features Indicator Item Representation Linear Prediction * (Dot-product for MF) Learning ALS, BPR, SVD, PCA, NMF...
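A collaborative filter of this shape can be sketched end-to-end in a few lines: learn linear (matrix-factorization) representations from a toy interactions matrix with plain SGD, then predict with a dot product. The interaction values, learning rate, and epoch count below are illustrative, not from any production system.

```python
import random

random.seed(42)

# Toy interactions matrix; 0 means "no interaction observed".
interactions = [
    [5, 4, 1, 0],
    [4, 5, 0, 1],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
]
n_users, n_items, n_components = 4, 4, 2

# Indicator features + linear representation function collapses to
# one learned latent vector per user and per item.
users = [[random.uniform(0.0, 0.1) for _ in range(n_components)] for _ in range(n_users)]
items = [[random.uniform(0.0, 0.1) for _ in range(n_components)] for _ in range(n_items)]

lr = 0.05
for _ in range(2000):
    for u in range(n_users):
        for i in range(n_items):
            if interactions[u][i] == 0:
                continue  # only fit observed user/item pairs
            pred = sum(users[u][k] * items[i][k] for k in range(n_components))
            err = interactions[u][i] - pred
            for k in range(n_components):
                u_k = users[u][k]
                users[u][k] += lr * err * items[i][k]
                items[i][k] += lr * err * u_k

def score(u, i):
    # Dot-product prediction from the learned representations.
    return sum(users[u][k] * items[i][k] for k in range(n_components))

# User 0 should score its own cluster's items far above item 3.
print(round(score(0, 0), 2), round(score(0, 3), 2))
```

Real implementations (ALS, BPR, SVD, ...) replace this naive SGD loop with far more efficient and robust solvers, but the moving parts are the same.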
  • 34. A content-based recommender learns the item features to which a user is affined. Purely content-based systems perform no transfer learning between users. This allows easy rec-splanation. This requires clean item metadata. Example: Content-based Recommender Interactions * User Features Indicator User Representation Linear Item Features Metadata Item Representation None (n_components = n_item_features) Prediction Dot-product Learning *
  • 35. A hybrid recommender system learns representations for both user and item metadata and indicators, if available. This opens a lot of options for us. Example: Hybrid Recommender System Interactions * User Features * User Representation * Item Features * Item Representation * Prediction * Learning *
  • 36. We can build a hybrid recommender system to recommend personalized products based on past purchases. Example: Purchase Recommendations Interactions Purchases User Features Indicator User Representation Linear Item Features Indicator + Metadata Item Representation * Prediction Dot-product Learning *
  • 37. We can use the pre-trained purchase recommender’s representations to provide recommendations in a new context. In this system, the “user” is the context item, not the person using our product. Example: “You May Also Like” (YMAL) Interactions X User Features Context Item Repr User Representation None Item Features All Item Reprs Item Representation None Prediction Dot-product, Cosine? Learning X
  • 38. We can take the output of the YMAL recommender and re-rank the items based on the customer’s representation. This system does not learn. The learning’s already been done. Example: Personalized “You May Also Like” Interactions X User Features User Reprs User Representation None Item Features Similar Item Reprs Item Representation None Prediction Dot-product Learning X
  • 39. Example: Personalized “You May Also Like” Purchase Recommender System “YMAL” Recommender System “YMAL” Personalization System Step 1: Learn to personalize purchasing recommendations Step 2: Use previous learning to calculate the most similar items Step 3: Personalize the similar items by re-ranking OR Contextualize purchase recommendations by limiting the item set
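Step 3 of this flow can be sketched directly: score the pre-learned candidate representations from the similarity stage against the customer's representation, then sort. No learning happens here; the item names and vectors below are hypothetical.

```python
# Pre-learned customer representation (from the purchase recommender).
user_repr = [0.9, 0.1]

# Candidate items produced by the "YMAL" similarity stage (made-up vectors).
candidates = {
    "item_a": [0.2, 0.8],
    "item_b": [0.7, 0.3],
    "item_c": [0.5, 0.5],
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Re-rank by dot product against the customer's representation.
ranked = sorted(candidates, key=lambda name: dot(user_repr, candidates[name]), reverse=True)
print(ranked)  # ['item_b', 'item_c', 'item_a']
```

The same scoring step works in the other direction of the flow: limiting the item set first contextualizes the purchase recommendations instead.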
  • 40. Example: YouTube (Covington, Adams, Sargin) Interactions Watches + Searches User Features Geography, Age, Gender... User Representation Deep net Item Features Pre-learned embeddings, language, previous impressions... Item Representation Deep net Prediction Deep net Learning Sampled Cross-Entropy
  • 42. Implicit Interactions * User Features Indicator User Representation Linear Item Features Indicator Item Representation Linear Prediction Dot-product Learning ALS, BPR Implicit is a Python collaborative filter toolkit that uses matrix factorization to learn representations. Includes factorization classes for ALS and BPR. Made by Ben Frederickson. MIT License
  • 43. Scikit-Learn Interactions * User Features Indicator User Representation Linear Item Features Indicator Item Representation Linear Prediction Dot-product Learning SVD, PCA, NMF... Scikit-learn is a Python machine learning toolkit with many tools for feature engineering and machine learning. The decomposition package contains some classes that can be used for matrix factorization recommender systems like SVD, PCA, NMF... Maintained by volunteers. BSD license
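As a sketch of using scikit-learn's decomposition package this way, the snippet below factorizes a toy interactions matrix with NMF into user and item representations and recovers dot-product scores; the data and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy interactions matrix (made-up values; 0 = no interaction).
interactions = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
], dtype=float)

model = NMF(n_components=2, init="random", random_state=0, max_iter=500)
user_repr = model.fit_transform(interactions)  # [n_users x n_components]
item_repr = model.components_.T                # [n_items x n_components]

# Dot-product prediction approximately reconstructs the interaction scores.
scores = user_repr @ item_repr.T
print(np.round(scores, 1))
```

Note that plain NMF treats the zeros as observed values rather than missing ones, which is one reason dedicated recommender libraries exist.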
  • 44. LightFM Interactions * User Features * User Representation Linear Item Features * Item Representation Linear Prediction Dot-product Learning Logistic, BPR, WARP LightFM is a Python hybrid recommender system that uses matrix factorization to learn representations. Made by Lyst - a fashion shopping website. Apache-2.0 license
  • 45. TensorRec is a Python hybrid recommender system framework for developing whole recommender systems quickly. Representation functions, prediction functions, and loss functions can be customized using TensorFlow or Keras. Made by James Kirk. Apache-2.0 license TensorRec Interactions * User Features * User Representation Linear, Deep nets, None... Item Features * Item Representation Linear, Deep nets, None... Prediction Dot-product, Cosine, Euclidean... Learning RMSE, KLD, WMRB... Hey, that’s me
  • 46. Annoy is a tool for fast similarity search written in C++ with Python bindings. Useful for building systems to serve recommendations from pre-learned representations. Made by Spotify. Apache-2.0 license ANNOY (Approximate Nearest Neighbors Oh Yeah) Interactions X User Features X User Representation X Item Features X Item Representation X Prediction Cosine, Euclidean, Manhattan, Hamming Learning X
  • 47. Faiss is a tool for fast similarity search written in C++ with Python bindings. Useful for building systems to serve recommendations from pre-learned representations. Allows item biases. Made by Facebook. BSD license FAISS (Facebook AI Similarity Search) Interactions X User Features X User Representation X Item Features X Item Representation X Prediction Dot-product, Euclidean Learning X
  • 48. NMSLib is a tool for fast similarity search written in C++ with Python bindings. Useful for building systems to serve recommendations from pre-learned representations. Made by Bilegsaikhan Naidan, Leonid Boytsov, Yury Malkov, David Novak, Ben Frederickson. Apache-2.0 license, with some MIT and GNU components NMSLib (Non-Metric Space Library) Interactions X User Features X User Representation X Item Features X Item Representation X Prediction Cosine, Euclidean Learning X
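All three of these libraries serve the same role: fast, approximate versions of the exact search below. This brute-force cosine nearest-neighbor sketch in plain Python is what they trade exactness for speed on; the index contents are illustrative.

```python
import math

# Exact (brute-force) cosine nearest neighbors over pre-learned item
# representations -- the computation Annoy/Faiss/NMSLib approximate.

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u)) *
           math.sqrt(sum(b * b for b in v)))
    return num / den

def nearest(query, index, n=2):
    """index: dict of item_id -> representation vector."""
    return sorted(index, key=lambda i: cosine(query, index[i]),
                  reverse=True)[:n]

index = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
nearest([1.0, 0.1], index, n=2)
# "a" is closest to the query, then "b"
```

Brute force is O(items) per query; the ANN libraries build tree or graph indices so that serving stays fast at production item counts.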
  • 49. We can build a hybrid recommender system to recommend personalized news articles based on past reading. Requirements: 1. We have to learn the tastes of individual users. 2. We know users’ home location with low resolution (country/state). 3. Articles are ephemeral. All items are cold-start items. 4. We can vectorize article contents and tagged categories. (politics, sports…) 5. We have to serve production-scale user traffic. 6. We don’t have to do rec-splanation. Example: News Article Recommendation Interactions Clicks, page dwells... User Features Indicator + vectorized locations User Representation Linear Item Features TF-IDF of contents + vectorized categories Item Representation Deep net Prediction Cosine Learning Balanced WMRB
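Requirement 4, vectorizing article contents, is the TF-IDF step in the pipeline above. In practice you would use scikit-learn's `TfidfVectorizer`; this hand-rolled sketch (with a made-up two-article corpus) just shows what the weighting does.

```python
import math
from collections import Counter

# Hand-rolled TF-IDF over tokenized articles (hypothetical corpus):
# term frequency in the document, discounted by how many documents
# contain the term.
docs = [["election", "senate", "vote"],
        ["game", "score", "vote"]]

def tfidf(doc, corpus):
    n = len(corpus)
    tf = Counter(doc)
    return {t: (tf[t] / len(doc)) *
               math.log(n / sum(1 for d in corpus if t in d))
            for t in tf}

vec = tfidf(docs[0], docs)
# "vote" appears in every document, so its idf (and weight) is 0;
# "election" is distinctive, so it gets positive weight
```

The resulting sparse vectors, concatenated with vectorized category tags, become the item features fed to the deep item-representation net.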
  • 50. Example: News Article Recommendation Daily Model Training Scikit-learn Feature Transformation TensorRec Recommender System Annoy Ranking Step 1: Vectorize historical article contents and metadata Step 2: Use vectorized article features to learn user representations and train a deep net for article representation Step 3: Build Annoy indices
  • 51. Scikit-learn Feature Transformation TensorRec Recommender System Annoy Ranking Step 1: Vectorize new article contents and metadata Step 2: Use trained deep net to calculate new article representation Step 3: Rebuild Annoy indices with the new article Example: News Article Recommendation Handling New Articles
  • 52. Database Representation Storage Annoy Ranking Step 1: Retrieve the user representation from the database Step 2: Find most relevant articles for the user Example: News Article Recommendation Serving User Traffic
  • 53. Example: MovieLens with TensorRec Interactions Movie ratings User Features Indicator User Representation Linear Item Features Indicator + Movie Tags Item Representation Linear Prediction Dot-product Learning Balanced WMRB
  • 55. Offline Evaluation Many metrics are available for offline evaluation by comparing predictions against known interactions: Precision@K, Recall@K, NDCG@K… Others measure novelty, diversity, and coverage. Precision@K: “What percentage of the top K recommended items were positively interacted with?” Recall@K: “What percentage of a user’s positively interacted items were in the top K results?” What makes a good recommender system? Offline Pitfalls Many offline metrics don’t represent fairness of performance across users or items. These metrics can be useful for hyperparameter optimization, but often fail to evaluate the “feel” of recommendations. It is hard to use offline metrics alone to state that one recommender system is better than another.
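The two definitions above are a few lines each. This sketch assumes a ranked recommendation list and a set of positively-interacted items; the example data is made up.

```python
# Minimal Precision@K / Recall@K over one user's results.
# `recommended` is ranked best-first; `relevant` is the set of items
# the user positively interacted with.

def precision_at_k(recommended, relevant, k):
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

def recall_at_k(recommended, relevant, k):
    if not relevant:
        return 0.0
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / len(relevant)

recs = ["a", "b", "c", "d", "e"]
liked = {"a", "c", "x"}
precision_at_k(recs, liked, 5)  # 2 of the top 5 were liked -> 0.4
recall_at_k(recs, liked, 5)     # 2 of 3 liked items appear -> ~0.667
```

System-level numbers are usually the mean of these per-user scores, which is exactly where the fairness pitfalls on the next slide come from.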
  • 56. Example: Offline Pitfalls Three recommendation results for two users. User 1 has 5 positive interactions; User 2 has 2. The third recommendation system spreads good results across both users and is probably the “best.” Precision fails to identify that, but recall does: all three systems score P@5 = 0.5, while their R@5 scores are 0.65, 0.5, and 0.8. You can concoct similar pitfalls for recall or NDCG. What makes a good recommender system?
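The slide's numbers can be reproduced from one hypothetical hit pattern (the per-user hit counts below are assumptions chosen to match the stated P@5 and R@5 values, not data from the deck).

```python
# Hypothetical hit pattern: for each system, (hits in top 5, total
# positives) per user. User 1 has 5 positives, User 2 has 2.
systems = {
    1: [(4, 5), (1, 2)],
    2: [(5, 5), (0, 2)],
    3: [(3, 5), (2, 2)],
}

def mean_p_at_5(users):
    return sum(hits / 5 for hits, _ in users) / len(users)

def mean_r_at_5(users):
    return sum(hits / total for hits, total in users) / len(users)

{s: (mean_p_at_5(u), mean_r_at_5(u)) for s, u in systems.items()}
# Every system averages P@5 = 0.5, but mean R@5 separates them:
# 0.65, 0.5, 0.8 -- only recall rewards system 3 for serving both users
```

Because User 2 has only 2 positives, precision caps their contribution at 2/5 no matter how well the system serves them; recall normalizes by each user's positives and so surfaces the difference.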
  • 57. Online Evaluation When rolling out a new recommender system, the truest test is an A/B test against an existing system. The most effective feedback comes from user interviewing and monitoring the user behaviors the system is intended to drive. If there is no existing system, do phased roll-outs with quant/qual feedback.* User interviewing is the only way to evaluate the “feel” of recommendations. *Fellow employees make great, but biased, guinea pigs What makes a good recommender system? Feel? “I already own a crib, why would I need another?” Maybe we’re missing item filtering based on metadata? “These songs are excellent, but I already know these bands.” Maybe we should target discovery? “I’ve watched Captain America twenty times, but that doesn’t mean I only want to watch Marvel movies. What about the sitcoms I watch?” Maybe we’re oversimplifying the user’s representation?
  • 58. All Algorithms Are Biased There are biases innate in the data we use, the way users interact with our products, and the way our algorithms learn. Controlling for this is not as simple as setting biased=False. When designing these systems, we have a responsibility to, at the least, understand the biases in our products. You wouldn’t ship a product without tests. You shouldn’t ship a RecSys without examining bias. Algorithmic Bias and Fairness Understanding Fairness There are many definitions of fairness. Some cross-section recommender performance by user and item metadata. C-fairness: Is recommendation recall significantly lower for customers in Massachusetts? P-fairness: Are movies with female leads recommended less often than in the natural distribution of movie watching? Missing metadata? Crowdsource it, but be careful with sensitive metadata.
  • 60. What We Missed Sequence-based models: In what order do our users interact with our items? Mixture-of-tastes models: Is one representation per user enough for users with diverse tastes? Rec-splanation: How do system design choices impact interpretability? Attention models: Can we learn more nuanced user representations than just a vector? Graphical models: Can we map relationships between users, items, and their attributes? Cold-start problems: How do we make recommendations for brand-new users?
  • 61. Wait, is it “recommender systems” or “recommendation systems?”
  • 62. Wait, is it “recommender systems” or “recommendation systems?” ¯\_(ツ)_/¯