Scoring Metrics for Classification Models

© 2019 KNIME AG. All rights reserved.
KNIME: Maarit.Widmann@knime.com
@KNIME
What to do after training a machine learning algorithm

Introduction
There are many scoring metrics for a classification model.
Which of them fits depends on your classification problem.

Different Scoring Metrics
1. Confusion Matrix
• True positives
• False negatives
• False positives
• True negatives
2. Sensitivity and Specificity
3. Precision and Recall
4. F-measure
5. Overall Accuracy and Cohen’s kappa

Why different scoring metrics?
1. What is your objective?
2. What is the target class distribution?
3. Is the target binomial or multinomial?

Introduction
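Doctor’s diagnosis as an example of classification
[Figure: a sample of patients (disease carriers and healthy patients) receives a diagnosis; the classification results label disease carriers as the positive class and healthy patients as the negative class.]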

Confusion Matrix
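As a minimal sketch of how the four cells of a binary confusion matrix can be counted, the snippet below tallies true positives, false negatives, false positives, and true negatives from actual and predicted labels. The function name `confusion_counts` and the label lists are hypothetical; the lists are arranged only so that the counts match the values used in the metric calculations in these slides (TP = 3, FN = 1, FP = 4, TN = 20).

```python
from collections import Counter

def confusion_counts(actual, predicted, positive="disease carrier"):
    """Count TP, FN, FP, TN for a binary target (hypothetical helper)."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            # Actual positive: either found (TP) or missed (FN)
            counts["TP" if p == positive else "FN"] += 1
        else:
            # Actual negative: either wrongly flagged (FP) or correctly cleared (TN)
            counts["FP" if p == positive else "TN"] += 1
    return counts

# Hypothetical test set of 28 patients: 4 disease carriers, 24 healthy
actual = ["disease carrier"] * 4 + ["healthy"] * 24
predicted = (["disease carrier"] * 3 + ["healthy"] * 1      # carriers: 3 found, 1 missed
             + ["disease carrier"] * 4 + ["healthy"] * 20)  # healthy: 4 flagged, 20 cleared

counts = confusion_counts(actual, predicted)
print(dict(counts))
```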

Introduction
Disease detection using a machine learning algorithm
1. Sample of patients split into training (80 %) and test (20 %) set
2. Model training and prediction
3. Evaluation of classification results
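The 80 % / 20 % partitioning step can be sketched in plain Python as follows. This is only an illustration under assumptions: the helper `train_test_split`, the seed, and the 140 patient IDs are hypothetical, and the split is a simple random shuffle rather than a stratified one.

```python
import random

def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle the records and cut them into a training and a test part."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical sample of 140 patient IDs
patients = list(range(140))
train, test = train_test_split(patients)
print(len(train), len(test))
```

With 140 patients this leaves 28 patients in the test set, the same size as the test set in the worked calculations.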


Sensitivity
• $\text{Sensitivity} = \frac{TP}{TP + FN} = \frac{3}{3 + 1} = 0.75$
Are ALL positive class events found by the model?

Specificity
• $\text{Specificity} = \frac{TN}{TN + FP} = \frac{20}{20 + 4} \approx 0.83$
Are ALL negative class events found by the model?
Sensitivity: Is the model sensitive to detecting disease?
Specificity: Is the disease diagnosis specific?
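The two formulas can be checked with a short sketch that uses the confusion matrix counts from these slides (TP = 3, FN = 1, FP = 4, TN = 20). The function names are illustrative and not part of any KNIME API.

```python
def sensitivity(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of actual negatives that the model found."""
    return tn / (tn + fp)

# Counts taken from the worked example in the slides
TP, FN, FP, TN = 3, 1, 4, 20

sens = sensitivity(TP, FN)
spec = specificity(TN, FP)
print(round(sens, 2), round(spec, 2))
```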

Recall
• $\text{Recall} = \frac{TP}{TP + FN} = \frac{3}{3 + 1} = 0.75$
Are ALL positive class events found by the model?

Precision
• $\text{Precision} = \frac{TP}{TP + FP} = \frac{3}{3 + 4} \approx 0.43$
Are ONLY positive class events found by the model?
Recall: Detect the most disease carriers
Precision: Make precise disease predictions
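A small sketch of the same two calculations, again using the counts from the slides (TP = 3, FN = 1, FP = 4). The function names are illustrative only.

```python
def recall(tp, fn):
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Share of positive predictions that were actually positive."""
    return tp / (tp + fp)

# Counts taken from the worked example in the slides
TP, FN, FP = 3, 1, 4

rec = recall(TP, FN)
prec = precision(TP, FP)
print(round(rec, 2), round(prec, 2))
```

Recall uses the same formula as sensitivity; only precision brings in the false positives.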

Defining the Classification Threshold
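The machine learning model predicts a score, Score(Diagnosis = disease carrier), for each patient. The class assignment is based on the threshold set for this score.
[Figure: patients placed along the Score(Diagnosis = disease carrier) axis from 0 to 1 with a threshold at 0.5, grouped into true positives, false positives, false negatives, and true negatives. Moving the threshold raises one of recall and precision and lowers the other.]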
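To illustrate the effect of the threshold, the sketch below assigns classes at three different thresholds and recomputes precision and recall each time. The eight scores and labels are invented for illustration and do not come from the slides; `precision_recall` is a hypothetical helper.

```python
# Hypothetical (score, is_disease_carrier) pairs, not taken from the slides
patients = [
    (0.95, True), (0.80, True), (0.65, False), (0.55, True),
    (0.45, False), (0.35, True), (0.20, False), (0.10, False),
]

def precision_recall(scored, threshold):
    """Predict 'disease carrier' when score >= threshold, then score the result."""
    tp = sum(1 for s, y in scored if s >= threshold and y)
    fp = sum(1 for s, y in scored if s >= threshold and not y)
    fn = sum(1 for s, y in scored if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall

for threshold in (0.3, 0.5, 0.7):
    p, r = precision_recall(patients, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, lowering the threshold to 0.3 finds every carrier at the cost of more false positives, while raising it to 0.7 removes the false positives but misses half of the carriers.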

F-measure
• $\text{F-measure} = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = 2 \cdot \frac{0.43 \cdot 0.75}{0.43 + 0.75} \approx 0.55$
Harmonic mean of precision and recall.
Are ALL and ONLY positive class events found by the model?
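The harmonic mean can be sketched as below, using the unrounded precision (3/7) and recall (3/4) from the example counts. The function name `f_measure` is illustrative.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall from the example counts TP = 3, FP = 4, FN = 1
precision = 3 / (3 + 4)
recall = 3 / (3 + 1)

f1 = f_measure(precision, recall)
print(round(f1, 2))
```

Because the harmonic mean is pulled towards the smaller of the two values, the F-measure here stays closer to the precision of 0.43 than the arithmetic mean would.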

Overall Accuracy
• $\text{Overall Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} = \frac{3 + 20}{3 + 4 + 1 + 20} \approx 0.82$
Target class distribution must be balanced!
Probability of classifying a positive OR negative class event correctly.
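The warning about balanced classes can be made concrete with a short sketch. It compares the example model with a hypothetical classifier that labels every one of the 28 patients as healthy; that baseline is an assumption added for illustration and is not part of the slides.

```python
def overall_accuracy(tp, fp, fn, tn):
    """Share of all events that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

# Example model from the slides
model = overall_accuracy(tp=3, fp=4, fn=1, tn=20)

# Hypothetical baseline: predict "healthy" for everyone (4 carriers, 24 healthy)
always_healthy = overall_accuracy(tp=0, fp=0, fn=4, tn=24)

print(round(model, 2), round(always_healthy, 2))
```

On this imbalanced test set the baseline reaches a higher overall accuracy than the model although it detects no disease carrier at all.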

Cohen’s kappa (κ)
• $\kappa = \frac{p_0 - p_e}{1 - p_e}$, where
$p_0$ is the overall accuracy of the model
$p_e = p_{e1} + p_{e2}$ is the overall accuracy expected from a random classifier
$p_{e1} = p(\text{pred} = \text{"disease carrier"}) \times p(\text{act} = \text{"disease carrier"})$
$p_{e2} = p(\text{pred} = \text{"healthy"}) \times p(\text{act} = \text{"healthy"})$

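$p_{e1} = \frac{7}{28} \times \frac{4}{28}$
$p_{e2} = \frac{21}{28} \times \frac{24}{28}$
$p_e = p_{e1} + p_{e2} \approx 0.68$
$p_0 = \frac{23}{28} \approx 0.82$
$\kappa = \frac{p_0 - p_e}{1 - p_e} = \frac{0.14}{0.32} \approx 0.44$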
Cohen’s kappa (κ) vs. Overall accuracy
New model:
$p_{e1} = \frac{5}{28} \times \frac{4}{28}$
$p_{e2} = \frac{23}{28} \times \frac{24}{28}$
$p_e = p_{e1} + p_{e2} \approx 0.73$
$p_0 = \frac{21}{28} = 0.75$ (overall accuracy)
$\kappa = \frac{p_0 - p_e}{1 - p_e} = \frac{0.02}{0.27} \approx 0.07$
κ = 1: perfect model performance
κ = 0: performance of a random classifier
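The comparison can be reproduced with the sketch below, which computes overall accuracy and Cohen's kappa from confusion matrix counts. The first model uses the counts from the slides (TP = 3, FP = 4, FN = 1, TN = 20). The counts for the new model (TP = 1, FP = 4, FN = 3, TN = 20) are not stated in the slides; they are an assumption derived from its totals (5 predicted carriers, 4 actual carriers, 21 of 28 correct). The function name `cohens_kappa` is illustrative.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Return (overall accuracy p0, Cohen's kappa) for a binary confusion matrix."""
    n = tp + fp + fn + tn
    p0 = (tp + tn) / n
    # Chance agreement: P(predicted class) * P(actual class), summed over both classes
    pe1 = ((tp + fp) / n) * ((tp + fn) / n)  # disease carrier
    pe2 = ((fn + tn) / n) * ((fp + tn) / n)  # healthy
    pe = pe1 + pe2
    return p0, (p0 - pe) / (1 - pe)

# First model: counts from the slides
acc_1, kappa_1 = cohens_kappa(tp=3, fp=4, fn=1, tn=20)

# New model: counts derived from its totals (assumption, see lead-in)
acc_2, kappa_2 = cohens_kappa(tp=1, fp=4, fn=3, tn=20)

print(f"first model: accuracy={acc_1:.2f}, kappa={kappa_1:.3f}")
print(f"new model:   accuracy={acc_2:.2f}, kappa={kappa_2:.3f}")
```

Without intermediate rounding the kappa values come out near 0.444 and 0.075, which differs slightly from the slide values that are computed from rounded intermediate results. The overall accuracy drops only from 0.82 to 0.75, while kappa falls close to the level of a random classifier.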

Scoring Metrics for a Multinomial Classification Model
[Figure: a sample of patients with three possible diagnoses (disease carrier, recessive disease carrier, healthy); the classification results are labelled as positive class and negative class.]

Confusion Matrix
[Figure: confusion matrix with cells labelled true positives, false positives, false negatives, and true negatives.]
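One common way to obtain true and false positives and negatives for a target with more than two classes is to treat one class as positive and all other classes as negative. The sketch below assumes that this one-vs-rest reading is what the slide illustrates; the ten labels and the helper `one_vs_rest_counts` are hypothetical.

```python
def one_vs_rest_counts(actual, predicted, positive):
    """Count TP, FP, FN, TN with `positive` as the positive class and the rest as negative."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p == positive:
            fp += 1
        elif a == positive and p != positive:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Hypothetical diagnoses for ten patients, not taken from the slides
actual = ["carrier", "carrier", "recessive", "recessive", "recessive",
          "healthy", "healthy", "healthy", "healthy", "healthy"]
predicted = ["carrier", "recessive", "recessive", "recessive", "healthy",
             "healthy", "healthy", "healthy", "carrier", "healthy"]

for cls in ("carrier", "recessive", "healthy"):
    print(cls, one_vs_rest_counts(actual, predicted, cls))
```

Each class then gets its own set of counts, so class statistics such as sensitivity and precision can be reported per class.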

Classification Model Evaluation in KNIME

Scorer (JavaScript) node

Interactive View: Confusion Matrix

Workflow for Classification
• On KNIME Workflow Hub: Evaluating Classification Model Performance
• On EXAMPLES Server: EXAMPLES/04_Analytics/10_Scoring/01_Evaluating_Classification_Model_Performance

Summary
• After training a classification model, the model performance is reported using scoring metrics
• Scoring metrics describe and compare the model performance
• The confusion matrix shows the numbers of correct and incorrect predictions
• Class statistics and overall accuracy statistics are based on the values in the confusion matrix

KNIME Fall Summit 2019
November 5 – 8 at AT&T Executive Education and Conference Center, Austin, Texas
• Tuesday & Wednesday: One-day courses
• Thursday & Friday: Summit sessions
Register by October 1 for Early Bird Discount!
Register at knime.com/summits

KNIME Beginner’s Luck
Course Book downloadable from KNIME Press
https://www.knime.com/knimepress
with code: SCORING-METRICS-0519

The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME.com AG under license from KNIME GmbH,
and are registered in the United States. KNIME® is also registered in Germany.
Thank You!
