▪ The most important and widely used metrics for evaluating the performance of a diagnostic test
– Sensitivity : number of true positive decisions / number of positive cases
– Specificity : number of true negative decisions / number of negative cases
Performance Measures
Diagnostic Test
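The two definitions above can be sketched in a few lines of Python. The function name and the toy labels are hypothetical and chosen only for illustration; 1 denotes a positive case or a positive decision.

```python
def sensitivity_specificity(truth, decision):
    """Return (sensitivity, specificity) for paired 0/1 truth labels and decisions."""
    tp = sum(1 for t, d in zip(truth, decision) if t == 1 and d == 1)
    tn = sum(1 for t, d in zip(truth, decision) if t == 0 and d == 0)
    n_pos = sum(truth)               # number of positive cases
    n_neg = len(truth) - n_pos       # number of negative cases
    return tp / n_pos, tn / n_neg


# Hypothetical data: 4 positive and 6 negative cases.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
decision = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(truth, decision)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Here 3 of the 4 positive cases are called positive and 4 of the 6 negative cases are called negative.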
▪ Diagnostic decision making is inherently ambiguous
– There is no clear-cut boundary between ‘Normal’ and ‘Abnormal’
– It is therefore more natural to rate each case on a scale.
– Ex) Five-point scale for nodules in a chest radiograph
• 1(definitely benign), 2(probably benign), 3(possibly malignant), 4(probably malignant), 5(definitely malignant)
• There are four cut-off values : ≥2, ≥3, ≥4, ≥5
– Each cut-off yields a (sensitivity, specificity) pair, which can be plotted on a graph with sensitivity on the y-axis and (1-specificity) on the x-axis
– These discrete points are called ‘operating points’.
– We need a way to assess the performance of a diagnostic test independently of the decision threshold
Why Do We Need a Curve for Performance Measure?
Operating Points
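As a minimal sketch of how the four cut-offs turn five-point ratings into operating points, the snippet below counts, for each cut-off, the fraction of positive and negative cases rated at or above it. The ratings are made up for illustration and the function name is an assumption.

```python
def operating_points(ratings_pos, ratings_neg, cutoffs=(2, 3, 4, 5)):
    """Return one (1 - specificity, sensitivity) pair per cut-off."""
    points = []
    for c in cutoffs:
        tpf = sum(r >= c for r in ratings_pos) / len(ratings_pos)  # sensitivity
        fpf = sum(r >= c for r in ratings_neg) / len(ratings_neg)  # 1 - specificity
        points.append((fpf, tpf))
    return points


# Hypothetical five-point ratings for 10 positive and 10 negative cases.
ratings_pos = [5, 5, 4, 4, 3, 3, 2, 5, 4, 1]
ratings_neg = [1, 1, 2, 1, 3, 2, 1, 1, 4, 2]
points = operating_points(ratings_pos, ratings_neg)
print(points)  # [(0.5, 0.9), (0.2, 0.8), (0.1, 0.6), (0.0, 0.3)]
```

Raising the cut-off moves the operating point toward the lower left: fewer false positives, but also fewer true positives.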
▪ The ROC curve is an estimate of all possible (sensitivity, 1-specificity) pairs, obtained from these operating points(A).
– Fitted or Smoothed ROC Curve(B) : Parametric estimation
• Smooth curve estimated from the operating points under a binormal assumption on the test results of both positive and negative cases.
– Empirical ROC Curve(C) : Nonparametric estimation
• Connect all operating points with straight lines
▪ Why is it called ROC?
– The term ROC refers to the performance of a human or mechanical observer(the receiver) that has to discriminate between radio signals contaminated by noise and noise alone. It was developed in the 1950s.
Receiver Operating Characteristic
ROC Curve
▪ Even Googler …
Receiver Operating Characteristic
ROC Curve
▪ AUROC or AUC
– Average value of sensitivity over all possible values of specificity
– AUC takes a value between 0 and 1 and is independent of disease prevalence
– An AUC of 1 means a perfectly accurate test, while the practical lower bound is 0.5, which corresponds to random guessing.
– The rating scheme(discrete or continuous) matters for reducing bias in the estimate of AUC
– It can be interpreted as a figure of merit(FOM): the probability that a randomly chosen positive case is rated higher than a randomly chosen negative case.
▪ Frequentist Method
– Parametric AUC
• Obtained from the fitted ROC curve.
• Based on some assumptions(test results are well described by a binormal distribution, the number of sample cases is not extremely small)
– Nonparametric AUC
• Estimated by summing the trapezoids formed under the empirical ROC curve
• Underestimates AUC when discrete ratings are used.
▪ Bayesian Method
– Exploits a prior or latent variable to express the unknown disease status
– Especially useful when the ‘gold standard’ is absent or uncertain.
Measure of Overall Diagnostic Performance
Area Under ROC Curve
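The nonparametric AUC and its interpretation as a probability can be checked against each other. The sketch below, on hypothetical five-point ratings, computes the trapezoidal area under the empirical ROC curve and the fraction of (positive, negative) pairs in which the positive case is rated higher, counting ties as one half. Function names are illustrative.

```python
def empirical_auc(points):
    """Trapezoidal area under the empirical ROC curve through (FPF, TPF) points."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})  # add the trivial end points
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def pairwise_auc(pos, neg):
    """Probability that a positive case is rated higher than a negative one (ties = 1/2)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical five-point ratings.
ratings_pos = [5, 5, 4, 4, 3, 3, 2, 5, 4, 1]
ratings_neg = [1, 1, 2, 1, 3, 2, 1, 1, 4, 2]

# One operating point per cut-off >=2, >=3, >=4, >=5.
points = [
    (sum(r >= c for r in ratings_neg) / len(ratings_neg),
     sum(r >= c for r in ratings_pos) / len(ratings_pos))
    for c in (2, 3, 4, 5)
]

auc_trapezoid = empirical_auc(points)
auc_pairs = pairwise_auc(ratings_pos, ratings_neg)
print(round(auc_trapezoid, 3), round(auc_pairs, 3))
```

Both numbers agree, which is why the trapezoidal AUC is often described as equivalent to the Wilcoxon (Mann-Whitney) statistic.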
▪ BiNormal Assumption
▪ Proof(Caution! Proof by KH, thus not guaranteed)
Fitted ROC Curve
Parametric ROC Curve
http://www.navan.name/roc/
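As a hedged illustration of the binormal model, the sketch below uses the standard textbook parametrization, which may differ in notation from the slide's own derivation: with a = (μ_pos − μ_neg)/σ_pos and b = σ_neg/σ_pos, the fitted curve is TPF = Φ(a + b·Φ⁻¹(FPF)) and the area under it is Φ(a/√(1+b²)). The function names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: provides Phi (cdf) and its inverse


def binormal_roc(fpf, a, b):
    """Sensitivity on the binormal ROC curve at a false positive fraction in (0, 1)."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


def binormal_auc(a, b):
    """Closed-form area under the binormal ROC curve."""
    return _STD_NORMAL.cdf(a / sqrt(1 + b * b))


# Hypothetical parameters: equal variances (b = 1), means one standard deviation apart (a = 1).
for fpf in (0.05, 0.1, 0.2, 0.5):
    print(fpf, round(binormal_roc(fpf, a=1.0, b=1.0), 3))
print("AUC:", round(binormal_auc(1.0, 1.0), 3))
```

With a = 0 and b = 1 the two distributions coincide, the curve falls on the diagonal, and the AUC is 0.5.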
▪ AUC varies with the sample cases.
– Even with the same diagnostic test, the measured performance varies according to the test samples.
– We can therefore give a range of AUC in which the true value lies with a certain degree of confidence.
– A 95% confidence interval is often used.
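▪ Computation of confidence interval for AUC
– Confidence Interval : $\hat{A} \pm z_{1-\alpha/2}\,\mathrm{SE}(\hat{A})$ (with $z_{0.975} \approx 1.96$ for a 95% interval)
where
$\mathrm{SE}(\hat{A}) = \sqrt{\dfrac{\hat{A}(1-\hat{A}) + (n_A - 1)(Q_1 - \hat{A}^2) + (n_N - 1)(Q_2 - \hat{A}^2)}{n_A\, n_N}}$,
$Q_1 = \dfrac{\hat{A}}{2-\hat{A}}$, $Q_2 = \dfrac{2\hat{A}^2}{1+\hat{A}}$, and $n_A$, $n_N$ are the numbers of abnormal and normal cases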
Assessing Statistical Significance of AUC
Confidence Interval of AUC
J. A. Hanley and B. J. McNeil(1982)
https://pubs.rsna.org/doi/pdf/10.1148/radiology.143.1.7063747
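A minimal sketch of the confidence interval computation, assuming the standard-error approximation of Hanley and McNeil (1982) with Q1 = A/(2−A) and Q2 = 2A²/(1+A). The function name, the clipping of the interval to [0, 1], and the example numbers are assumptions made for illustration.

```python
from math import sqrt


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Return (standard error, lower bound, upper bound) for an AUC estimate."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = sqrt(variance)
    # Clip to the valid AUC range; this clipping is a convenience, not part of the formula.
    return se, max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical example: the same AUC estimated from a small and a larger sample.
se_small, lo_small, hi_small = hanley_mcneil_ci(0.845, n_pos=10, n_neg=10)
se_large, lo_large, hi_large = hanley_mcneil_ci(0.845, n_pos=100, n_neg=100)
print(round(se_small, 3), round(se_large, 3))
```

The interval narrows as the number of cases grows, which is the sense in which the measured AUC depends on the sample.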
▪ The overall performance of different diagnostic tests can be compared using AUCs
– However, equal AUCs do not mean that two tests are identical.
– The equality of two ROC curves can be tested statistically using the parameters ‘a’ and ‘b’, which completely specify the shape of the ROC curve.
▪ Partial AUC
– Depending on the diagnostic situation, the full AUC may not be clinically meaningful.
– For screening a serious disease in a high-risk group, high sensitivity is important.
– For a disease with low prevalence and a risky subsequent confirmatory test, high specificity is important.
– In these cases, we can set a specific FPR range (or sensitivity range) and calculate the mean sensitivity(or FPR) within that range
Comparison of Overall Diagnostic Performance
Comparing AUCs
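The partial AUC over a restricted FPR range can be sketched as follows. This version integrates the empirical ROC curve from FPR = 0 up to a chosen limit, clipping the last segment by linear interpolation, and divides by the width of the range to obtain the mean sensitivity. The operating points and the function name are hypothetical.

```python
def partial_auc(points, fpf_max):
    """Area under the empirical ROC curve for FPF in [0, fpf_max]."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= fpf_max:
            break
        if x1 > fpf_max:
            # Clip the segment at fpf_max using linear interpolation.
            y1 = y0 + (y1 - y0) * (fpf_max - x0) / (x1 - x0)
            x1 = fpf_max
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical operating points given as (FPF, TPF).
points = [(0.5, 0.9), (0.2, 0.8), (0.1, 0.6), (0.0, 0.3)]
pauc = partial_auc(points, fpf_max=0.2)
mean_sensitivity = pauc / 0.2
print(round(pauc, 3), round(mean_sensitivity, 3))
```

Setting fpf_max to 1.0 recovers the full trapezoidal AUC, so the partial AUC is a restriction of the same integral rather than a different statistic.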
▪ The Need for Extension to ROC
– ROC can only deal with a binary decision and does not take lesion locations into account.
– Location ROC(LROC) handles predefined regions in the image separately and computes the ROC based on the number of regions and their decisions(e.g., left or right lung, or lobe). The readers are informed that there can be at most one lesion per image.
– Both ROC and LROC have difficulty handling multiple lesions or suspicious locations in an image.
– The region-of-interest(ROI) method is similar to LROC but treats the regions independently.
– Neither LROC nor the ROI method can account for the correlations among regions in the same image.
– A free-response task means that the reader is given no prior information about the number of lesions in the image, and is therefore free to mark any number of lesions(or none).
▪ Free-response ROC(FROC)
– A plot of lesion localization performance in which the y-axis is the fraction of lesions detected and the x-axis is the number of false positives per image.
– The most widely used plot for assessing lesion detection tasks such as lung nodule detection or liver tumor detection.
– A mark counts as a true positive when the indicated location falls within a specified distance of a true lesion.
– Here, the x-axis has no upper bound.
The Free-Response Task
Free-response ROC
▪ Generation of FROC Curve
– Below, green circles denote true positives and red circles denote false positives.
– The circles are ordered with the confidence level(z) increasing to the right.
– Starting at the extreme right, at positive infinity, we move the cutoff to the left. Whenever we pass a green circle, we move the operating point up by 1/L, where L is the number of lesions.
– Whenever we pass a red circle, we move the operating point to the right by 1/N, where N is the number of images.
▪ Pros and Cons of FROC
– Pros
• It visualizes how the rating scale is used -> Ideally, the FROC curve should end in a plateau.
• It can deal with multiple lesion marks and their ratings.
– Cons
• It does not account for unmarked non-diseased cases(true negatives), which make up most of the cases in many diagnostic imaging tasks.
• The x-axis is unbounded, so the area under the curve cannot serve as a figure of merit.
Interpretation of FROC
Free-response ROC
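The FROC construction can be sketched directly: sort all marks by decreasing confidence and, as the cutoff sweeps down, step up by 1/L for each true positive and right by 1/N for each false positive. The mark list and function name are hypothetical, and marks with tied confidence are simply processed in sorted order here.

```python
def froc_points(marks, n_lesions, n_images):
    """marks: list of (confidence, is_true_positive). Returns (FP per image, lesion fraction) points."""
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, is_tp in sorted(marks, key=lambda m: m[0], reverse=True):
        if is_tp:
            tp += 1          # move up by 1/L
        else:
            fp += 1          # move right by 1/N
        points.append((fp / n_images, tp / n_lesions))
    return points


# Hypothetical study: 2 images containing 4 lesions in total, 6 reader marks.
marks = [(0.9, True), (0.8, False), (0.7, True), (0.6, True), (0.4, False), (0.3, False)]
curve = froc_points(marks, n_lesions=4, n_images=2)
print(curve)
# [(0.0, 0.0), (0.0, 0.25), (0.5, 0.25), (0.5, 0.5), (0.5, 0.75), (1.0, 0.75), (1.5, 0.75)]
```

The curve ends at 1.5 false positives per image, which shows that the x-axis is not bounded by 1.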
▪ AFROC Definition
– When the x-axis of FROC is changed to the false positive fraction, the plot is called alternative FROC or AFROC.
– The plot is constrained to lie within the unit square, so a figure of merit can be computed.
– However, AFROC ignores intra-image lesion correlations and is used only in limited situations.
Solving Problem of FROC by Bounding Characteristics
Alternative FROC
JAFROC
▪ Bootstrapping
– A method for evaluating the variance of an estimator
Bootstrapping and Jackknifing
JAFROC
▪ Jackknifing
– Instead of generating a set of random samples, we generate n
samples of size n-1 by leaving out one observation at a time.
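Both resampling schemes can be sketched for a generic estimator. The jackknife below recomputes the estimator on the n leave-one-out samples, and the bootstrap recomputes it on random samples drawn with replacement. The data, the function names, and the number of bootstrap replicates are assumptions for illustration.

```python
import random
from statistics import mean, stdev


def jackknife_se(data, estimator):
    """Standard error from n leave-one-out samples of size n-1."""
    n = len(data)
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = mean(loo)
    return ((n - 1) / n * sum((v - loo_mean) ** 2 for v in loo)) ** 0.5


def bootstrap_se(data, estimator, n_boot=2000, seed=0):
    """Standard error from n_boot samples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    stats = [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]
    return stdev(stats)


# Hypothetical per-case scores; the estimator here is simply the mean.
data = [0.6, 0.7, 0.8, 0.9, 1.0]
se_jack = jackknife_se(data, mean)
se_boot = bootstrap_se(data, mean)
print(round(se_jack, 4), round(se_boot, 4))
```

For the sample mean, the jackknife standard error coincides with the usual s/√n, which makes it a convenient sanity check before applying the same procedure to a figure of merit.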
▪ A method for analyzing free-response multiple-reader multiple-case (MRMC) studies.
Jackknife Analysis of Free-Response ROC Data
JAFROC
=> Figure of merit: the probability that a lesion rating exceeds a non-lesion rating
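A hedged sketch of this figure of merit: every lesion rating is compared with the highest false-positive rating on each normal case, and the fraction of comparisons won by the lesion is reported. Treating unmarked lesions and unmarked normal cases as rated minus infinity and counting ties as one half are assumptions of this sketch, not a description of the JAFROC software.

```python
NEG_INF = float("-inf")


def afroc_fom(lesion_ratings, normal_case_fp_ratings):
    """lesion_ratings: one rating per lesion (None if unmarked).
    normal_case_fp_ratings: one list of FP ratings per normal case (empty if unmarked)."""
    lesions = [NEG_INF if r is None else r for r in lesion_ratings]
    normals = [max(fps) if fps else NEG_INF for fps in normal_case_fp_ratings]
    score = sum((l > n) + 0.5 * (l == n) for l in lesions for n in normals)
    return score / (len(lesions) * len(normals))


# Hypothetical ratings on a 0-100 scale: 4 lesions (one missed), 4 normal cases (two unmarked).
lesion_ratings = [80, 60, None, 90]
normal_case_fp_ratings = [[], [70], [30, 50], []]
fom = afroc_fom(lesion_ratings, normal_case_fp_ratings)
print(fom)  # 0.75
```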
▪ Excel File Format
– The worksheets must be named Truth, TP and FP.
– The first row of each worksheet is reserved for data labels.
– Truth contains the ground truth information for each image.
– TP = the ratings of "true positives", i.e., lesions that are correctly localized.
– FP = the ratings of "false positives", i.e., marked normal regions.
Data Format in JAFROC Analysis Software
JAFROC
Result of JAFROC Analysis Software
JAFROC
JAFROC in Clinical Applications
▪ Inclusion Criteria
– 300 PA and lateral chest radiographs were retrospectively selected from 4 hospitals in the Netherlands
(Radboud University Medical Center, University Medical Center, Academic Medical Center, Meander Medical Center)
– Positive cases required the presence of a solid solitary nodule(< 30mm in diameter, mean 16.2mm) and the availability of PA and lateral chest radiographs and a chest CT scan obtained within 3 months. (189 negative, 111 positive cases).
– Radiographs showing signs of other disease(except COPD) were excluded.
– All subjects were older than 40. (44-88 years, mean 65 years; 177 male, 123 female).
– Absence of disease was ascertained by radiographs and CT scans(taken within 6 months) with negative findings.
– To cover a wide range of lesion conspicuity, two experienced radiologists rated the visibility in consensus.
• Category 1(Well visible), Category 2(Moderately subtle), Category 3(Subtle), Category 4(Very subtle)
– Nodule volume was assessed on the CT scan, and the diameter was calculated assuming each nodule to be a sphere.
▪ Image Acquisition
– Chest radiographs were obtained with digital x-ray devices from Agfa Healthcare, Philips Healthcare and Siemens.
▪ Image Processing
– A commercially available CAD(ClearRead +Detect 5.2, Riverain Technologies) was used.
– This CAD is optimized for the detection of nodules between 9 and 30mm in diameter, which it marks with circles.
– Bone suppression images(BSIs) were computed with software(ClearRead Bone Suppression 2.4, Riverain Technologies) that digitally removes ribs and clavicles.
– Both software packages are FDA approved.
Data
JAFROC in Clinical Applications
▪ Readers
– Five radiologists(5, 13, 3, 17, 17 years of experience) and three residents(one 2nd-year and two 4th-year).
– None had experience with CAD or BSIs
▪ Reading Setting
– Cases were evaluated in different randomized orders.
– Readers reviewed the cases first without and subsequently with the use of CAD.
– BSIs were always available.
– A training session was provided to familiarize the readers with the software(40 cases, 22 with and 18 without nodules)
▪ Reading Method
– Readers marked suspicious regions in the chest radiograph with a degree of suspiciousness(confidence) that a nodule was present(0: not suspicious, 100: definitely suspicious).
– Readers were allowed to mark multiple regions per image and did not have the ability to change their decisions.
– After the first scoring phase without CAD but with BSIs, CAD marks were automatically displayed and could be toggled on/off.
– The readers were then asked to score new regions, remove regions marked in the first phase, or change the score of a marked region.
– The readers were informed that each case contains at most one nodule and that there are more normal cases than nodule cases, but they did not know the exact numbers.
Reading Method
JAFROC in Clinical Applications
▪ Statistics
– Multireader multiple-case jackknife alternative free-response receiver operating characteristic(JAFROC) analysis was performed.
– A reader finding was considered a TP when the mark was within 1cm of the center of the ground-truth annotation.
– As input for the jackknife AFROC analysis, only one reader score per image is used.
– For cases with negative findings, the FP finding with the highest score was used.
– For cases with positive findings, marks at nonlesion locations are ignored and only TP marks are used.
– The AUC, which represents the probability that a lesion is rated higher than a nonlesion in a negative case, was calculated with the trapezoidal integration method(equivalent to the Wilcoxon rank-sum statistic).
– AUCs without and with the help of CAD were compared with the Dorfman-Berbaum-Metz method(DBMMRMC, ver. 2.33)
Statistical Analysis
JAFROC in Clinical Applications
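The scoring rules above reduce each case to a single score, which can be sketched as follows. Coordinates in millimetres, the inclusive 10 mm threshold, the function name, and returning None for a case without a usable mark are assumptions of this sketch rather than details taken from the study.

```python
from math import hypot


def score_case(marks, nodule_center=None, radius_mm=10.0):
    """marks: list of (x_mm, y_mm, confidence 0-100) for one case.
    Returns the single score used for the case, or None if no usable mark exists."""
    if nodule_center is None:
        # Negative case: keep the FP mark with the highest score.
        return max((conf for _, _, conf in marks), default=None)
    cx, cy = nodule_center
    # Positive case: ignore nonlesion marks, keep marks within 1 cm of the nodule center.
    tp_scores = [conf for x, y, conf in marks if hypot(x - cx, y - cy) <= radius_mm]
    return max(tp_scores, default=None)


# Hypothetical marks: one near (12, 12) and one far away.
marks = [(10.0, 10.0, 40), (100.0, 100.0, 80)]
print(score_case(marks, nodule_center=(12.0, 12.0)))  # 40
print(score_case(marks))                              # 80
```

The same two marks give different case scores depending on whether the case contains a nodule, because the distant mark only counts when the case is negative.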
OR-DBM MRMC Data Format
(http://perception.radiology.uiowa.edu/Software/ReceiverOperatingCharacteristicROC/MRMCAnalysis/tabid/116/Default.aspx)
▪ Stand-alone CAD
– CAD reached a sensitivity of 74%(82 of 111) at 1.0 FP mark per image. (0~5 marks per image)
• 91%(29 of 32) for well-visible nodules, 88%(28 of 32) for moderately subtle nodules.
• 62%(18 of 29) for subtle and 39%(7 of 18) for very subtle nodules.
• 91%(21 of 23) for nodules > 20mm, 62%(20 of 32) for nodules between 15 and 20mm, 77%(36 of 47) for nodules between 10 and 15mm, 56%(5 of 9) for nodules < 10mm
• CAD reached an area under the AFROC curve of 0.656. CAD generated 196 FP marks in the 189 cases with negative findings
▪ Observer Performance
– The area under the AFROC curve for human readers was 0.812 without CAD vs 0.841 with CAD(p=0.0001)
– CAD detected 53%(127 of 239) of the nodules that were missed by the readers.
– Readers dismissed 55%(70 of 127) of these TP CAD candidates.
– CAD helped the readers to place new correct lesion marks(57) and to increase the confidence score of lesions(220).
– CAD also had a negative effect, leading readers to place new wrong marks(92) or to increase the confidence score of nonlesions(66)
Results
JAFROC in Clinical Applications
▪ Positive and Negative Effect of CAD
Results
JAFROC in Clinical Applications
▪ Advantage of using CAD
Results
JAFROC in Clinical Applications
▪ Examples
Results
JAFROC in Clinical Applications
▪ Advantage of using CAD
Results
JAFROC in Clinical Applications
▪ Conclusion
– CAD improves observer performance for the detection of lung nodules on chest radiographs, beyond the application of bone suppression alone
– CAD detected 3/7 nodules missed by all radiologists and 12/25 nodules missed by most of the radiologists.
– The CAD was most helpful for moderately subtle and subtle lesions.
– For well-visible nodules, both CAD and radiologists had high sensitivity
– For very subtle nodules, the sensitivity of CAD was much better than that of the readers(39% vs 22%), but readers could not take advantage of CAD because of the difficulty of differentiating TP from FP marks.
– The beneficial effect of CAD is limited by the insufficient ability of the observers to differentiate true-positive from false-positive CAD candidates.
– The combination of CAD and bone suppression in chest radiography improves detection of potentially early lung cancer.
=> We need a principled and clinically realistic method to assess more complex use cases of CAD(multiple diseases with lesion markings).
Discussion and Conclusion
JAFROC in Clinical Applications
khwan.jung@vuno.co
hello@vuno.co
Putting the world’s medical data to work

Weitere ähnliche Inhalte

Was ist angesagt?

Genetic algorithms
Genetic algorithmsGenetic algorithms
Genetic algorithmsguest9938738
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slideswolf
 
Ant colony optimization
Ant colony optimizationAnt colony optimization
Ant colony optimizationJoy Dutta
 
Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015
Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015
Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015Chris Ohk
 
Aerial photogrammetry 04
Aerial photogrammetry  04Aerial photogrammetry  04
Aerial photogrammetry 04Rajesh Rajguru
 
Computer Vision: Feature matching with RANSAC Algorithm
Computer Vision: Feature matching with RANSAC AlgorithmComputer Vision: Feature matching with RANSAC Algorithm
Computer Vision: Feature matching with RANSAC Algorithmallyn joy calcaben
 
K means clustering
K means clusteringK means clustering
K means clusteringKuppusamy P
 
Neural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdfNeural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdfNavneetPaul2
 
K MEANS CLUSTERING.pptx
K MEANS CLUSTERING.pptxK MEANS CLUSTERING.pptx
K MEANS CLUSTERING.pptxkibriaswe
 
International Terrestrial Reference Frame
International Terrestrial Reference FrameInternational Terrestrial Reference Frame
International Terrestrial Reference FrameSurvey Department
 
종 분포 모형 실습 서울대학교
종 분포 모형 실습 서울대학교종 분포 모형 실습 서울대학교
종 분포 모형 실습 서울대학교cheongokjeon
 
Geo Sense - UAV service, unmanned remote sensing
Geo Sense - UAV service, unmanned remote sensingGeo Sense - UAV service, unmanned remote sensing
Geo Sense - UAV service, unmanned remote sensingIsmail Ibrahim
 
Lect 7 &amp; 8 types of vector data model-gis
Lect 7 &amp; 8 types of vector data model-gisLect 7 &amp; 8 types of vector data model-gis
Lect 7 &amp; 8 types of vector data model-gisRehana Jamal
 
Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...
Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...
Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...Aboul Ella Hassanien
 

Was ist angesagt? (20)

Genetic algorithms
Genetic algorithmsGenetic algorithms
Genetic algorithms
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
 
Ant colony optimization
Ant colony optimizationAnt colony optimization
Ant colony optimization
 
Steepest descent method
Steepest descent methodSteepest descent method
Steepest descent method
 
Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015
Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015
Continuous Control with Deep Reinforcement Learning, lillicrap et al, 2015
 
Chicken swarm optimization (CSO)
Chicken swarm optimization (CSO)Chicken swarm optimization (CSO)
Chicken swarm optimization (CSO)
 
Aerial photogrammetry 04
Aerial photogrammetry  04Aerial photogrammetry  04
Aerial photogrammetry 04
 
Computer Vision: Feature matching with RANSAC Algorithm
Computer Vision: Feature matching with RANSAC AlgorithmComputer Vision: Feature matching with RANSAC Algorithm
Computer Vision: Feature matching with RANSAC Algorithm
 
K means clustering
K means clusteringK means clustering
K means clustering
 
Ant colony algorithm
Ant colony algorithm Ant colony algorithm
Ant colony algorithm
 
Neural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdfNeural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdf
 
Gps
GpsGps
Gps
 
K MEANS CLUSTERING.pptx
K MEANS CLUSTERING.pptxK MEANS CLUSTERING.pptx
K MEANS CLUSTERING.pptx
 
International Terrestrial Reference Frame
International Terrestrial Reference FrameInternational Terrestrial Reference Frame
International Terrestrial Reference Frame
 
종 분포 모형 실습 서울대학교
종 분포 모형 실습 서울대학교종 분포 모형 실습 서울대학교
종 분포 모형 실습 서울대학교
 
Geo Sense - UAV service, unmanned remote sensing
Geo Sense - UAV service, unmanned remote sensingGeo Sense - UAV service, unmanned remote sensing
Geo Sense - UAV service, unmanned remote sensing
 
Lect 7 &amp; 8 types of vector data model-gis
Lect 7 &amp; 8 types of vector data model-gisLect 7 &amp; 8 types of vector data model-gis
Lect 7 &amp; 8 types of vector data model-gis
 
Lasso regression
Lasso regressionLasso regression
Lasso regression
 
Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...
Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...
Neutrosophic sets and fuzzy c means clustering for improving ct liver image s...
 
Firefly algorithm
Firefly algorithmFirefly algorithm
Firefly algorithm
 

Ähnlich wie (20180524) vuno seminar roc and extension

Diagnosing a diagnostic april 08 2015
Diagnosing a diagnostic april 08 2015Diagnosing a diagnostic april 08 2015
Diagnosing a diagnostic april 08 2015Athula Herath
 
ROC CURVE AND ANALYSIS.pptx
ROC CURVE AND ANALYSIS.pptxROC CURVE AND ANALYSIS.pptx
ROC CURVE AND ANALYSIS.pptxagniva pradhan
 
Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2
Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2
Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2Eyenirvaan
 
VALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptx
VALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptxVALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptx
VALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptxShaliniPattanayak
 
Introduction to ROC Curve Analysis with Application in Functional Genomics
Introduction to ROC Curve Analysis with Application in Functional GenomicsIntroduction to ROC Curve Analysis with Application in Functional Genomics
Introduction to ROC Curve Analysis with Application in Functional GenomicsShana White
 
Summer 2015 Internship
Summer 2015 InternshipSummer 2015 Internship
Summer 2015 InternshipTaylor Martell
 
Validity of a screening test
Validity of a screening testValidity of a screening test
Validity of a screening testdrkulrajat
 
Advance concept of screening_Nabaraj Paudel
Advance concept of screening_Nabaraj PaudelAdvance concept of screening_Nabaraj Paudel
Advance concept of screening_Nabaraj PaudelNabaraj Paudel
 
Electronic portal imaging by rose wekesa
Electronic portal imaging by rose wekesaElectronic portal imaging by rose wekesa
Electronic portal imaging by rose wekesaKesho Conference
 
Gamma Camera Image Quality
Gamma Camera Image QualityGamma Camera Image Quality
Gamma Camera Image QualityDavid Graff
 
Eric Delmelle: Disease Mapping
Eric Delmelle: Disease Mapping Eric Delmelle: Disease Mapping
Eric Delmelle: Disease Mapping THL
 
Automated perimetry
Automated perimetryAutomated perimetry
Automated perimetryarmaan ahmed
 
How to read a receiver operating characteritic (ROC) curve
How to read a receiver operating characteritic (ROC) curveHow to read a receiver operating characteritic (ROC) curve
How to read a receiver operating characteritic (ROC) curveSamir Haffar
 
Bio statistical analysis in clinical research
Bio statistical analysis  in clinical research  Bio statistical analysis  in clinical research
Bio statistical analysis in clinical research Helwan University
 

Ähnlich wie (20180524) vuno seminar roc and extension (20)

Roc curves
Roc curvesRoc curves
Roc curves
 
Diagnosing a diagnostic april 08 2015
Diagnosing a diagnostic april 08 2015Diagnosing a diagnostic april 08 2015
Diagnosing a diagnostic april 08 2015
 
ROC CURVE AND ANALYSIS.pptx
ROC CURVE AND ANALYSIS.pptxROC CURVE AND ANALYSIS.pptx
ROC CURVE AND ANALYSIS.pptx
 
Visual field Analysis .ppt
Visual field Analysis .pptVisual field Analysis .ppt
Visual field Analysis .ppt
 
Roc
RocRoc
Roc
 
Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2
Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2
Evaluating a diagnostic test presentation www.eyenirvaan.com - part 2
 
VALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptx
VALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptxVALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptx
VALIDITY AND RELIABLITY OF A SCREENING TEST seminar 2.pptx
 
Introduction to ROC Curve Analysis with Application in Functional Genomics
Introduction to ROC Curve Analysis with Application in Functional GenomicsIntroduction to ROC Curve Analysis with Application in Functional Genomics
Introduction to ROC Curve Analysis with Application in Functional Genomics
 
Hrt &amp; g dx
Hrt &amp; g dxHrt &amp; g dx
Hrt &amp; g dx
 
ch 18 roc.doc
ch 18  roc.docch 18  roc.doc
ch 18 roc.doc
 
Summer 2015 Internship
Summer 2015 InternshipSummer 2015 Internship
Summer 2015 Internship
 
Validity of a screening test
Validity of a screening testValidity of a screening test
Validity of a screening test
 
Advance concept of screening_Nabaraj Paudel
Advance concept of screening_Nabaraj PaudelAdvance concept of screening_Nabaraj Paudel
Advance concept of screening_Nabaraj Paudel
 
Electronic portal imaging by rose wekesa
Electronic portal imaging by rose wekesaElectronic portal imaging by rose wekesa
Electronic portal imaging by rose wekesa
 
Gamma Camera Image Quality
Gamma Camera Image QualityGamma Camera Image Quality
Gamma Camera Image Quality
 
Agreement analysis
Agreement analysisAgreement analysis
Agreement analysis
 
Eric Delmelle: Disease Mapping
Eric Delmelle: Disease Mapping Eric Delmelle: Disease Mapping
Eric Delmelle: Disease Mapping
 
Automated perimetry
Automated perimetryAutomated perimetry
Automated perimetry
 
How to read a receiver operating characteritic (ROC) curve
How to read a receiver operating characteritic (ROC) curveHow to read a receiver operating characteritic (ROC) curve
How to read a receiver operating characteritic (ROC) curve
 
Bio statistical analysis in clinical research
Bio statistical analysis  in clinical research  Bio statistical analysis  in clinical research
Bio statistical analysis in clinical research
 

Mehr von Kyuhwan Jung

(20180715) ksiim gan in medical imaging - vuno - kyuhwan jung
(20180715) ksiim   gan in medical imaging - vuno - kyuhwan jung(20180715) ksiim   gan in medical imaging - vuno - kyuhwan jung
(20180715) ksiim gan in medical imaging - vuno - kyuhwan jungKyuhwan Jung
 
(20180728) kosaim workshop vuno - kyuhwan jung
(20180728) kosaim workshop   vuno - kyuhwan jung(20180728) kosaim workshop   vuno - kyuhwan jung
(20180728) kosaim workshop vuno - kyuhwan jungKyuhwan Jung
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsKyuhwan Jung
 
Dynamic Routing Between Capsules
Dynamic Routing Between CapsulesDynamic Routing Between Capsules
Dynamic Routing Between CapsulesKyuhwan Jung
 
(2017/06)Practical points of deep learning for medical imaging
(2017/06)Practical points of deep learning for medical imaging(2017/06)Practical points of deep learning for medical imaging
(2017/06)Practical points of deep learning for medical imagingKyuhwan Jung
 
Hello, Recommender System
Hello, Recommender SystemHello, Recommender System
Hello, Recommender SystemKyuhwan Jung
 

Mehr von Kyuhwan Jung (6)

(20180715) ksiim gan in medical imaging - vuno - kyuhwan jung
(20180715) ksiim   gan in medical imaging - vuno - kyuhwan jung(20180715) ksiim   gan in medical imaging - vuno - kyuhwan jung
(20180715) ksiim gan in medical imaging - vuno - kyuhwan jung
 
(20180728) kosaim workshop vuno - kyuhwan jung
(20180728) kosaim workshop   vuno - kyuhwan jung(20180728) kosaim workshop   vuno - kyuhwan jung
(20180728) kosaim workshop vuno - kyuhwan jung
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging Applications
 
Dynamic Routing Between Capsules
Dynamic Routing Between CapsulesDynamic Routing Between Capsules
Dynamic Routing Between Capsules
 
(2017/06)Practical points of deep learning for medical imaging
(2017/06)Practical points of deep learning for medical imaging(2017/06)Practical points of deep learning for medical imaging
(2017/06)Practical points of deep learning for medical imaging
 
Hello, Recommender System
Hello, Recommender SystemHello, Recommender System
Hello, Recommender System
 

Kürzlich hochgeladen

DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
(20180524) vuno seminar roc and extension

  • 1.
  • 2.
  • 3. Diagnostic Test: Performance Measures
    ▪ Sensitivity and specificity are the most important and widely used metrics for evaluating the performance of a diagnostic test
      – Sensitivity: number of true positive decisions / number of positive cases
      – Specificity: number of true negative decisions / number of negative cases
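As a minimal illustration of the two definitions, the following Python sketch computes both measures from confusion-matrix counts. The function name and the example counts are made up for illustration and do not come from the slides.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positives / all positive (diseased) cases
    specificity = tn / (tn + fp)  # true negatives / all negative (non-diseased) cases
    return sensitivity, specificity


# Hypothetical counts: 100 positive cases and 100 negative cases.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```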
  • 4. Operating Points: Why Do We Need a Curve for Performance Measure?
    ▪ Diagnostic decision making is inherently ambiguous
      – There is no clear-cut boundary between ‘Normal’ and ‘Abnormal’
      – It is therefore more natural to rate each case on a scale.
      – Ex) Five-point scale for nodules in a chest radiograph
        • 1 (definitely benign), 2 (probably benign), 3 (possibly malignant), 4 (probably malignant), 5 (definitely malignant)
        • There are four cut-off values: ≥2, ≥3, ≥4, ≥5
      – Each cut-off yields a (sensitivity, specificity) pair, which can be plotted with sensitivity on the y-axis and (1 − specificity) on the x-axis
      – These discrete points are called ‘operating points’.
      – We need a way to assess the performance of a diagnostic test independently of the decision threshold
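A short sketch of how the four cut-offs turn five-point ratings into operating points. The ratings below are hypothetical, and a case is called positive when its rating is at or above the cut-off; each point is stored as (1 − specificity, sensitivity).

```python
def operating_points(pos_ratings, neg_ratings, cutoffs=(2, 3, 4, 5)):
    """Return one (fpr, tpr) operating point per cut-off.

    pos_ratings: ratings given to truly positive cases
    neg_ratings: ratings given to truly negative cases
    """
    points = []
    for c in cutoffs:
        tpr = sum(r >= c for r in pos_ratings) / len(pos_ratings)  # sensitivity
        fpr = sum(r >= c for r in neg_ratings) / len(neg_ratings)  # 1 - specificity
        points.append((fpr, tpr))
    return points


# Hypothetical ratings on the five-point scale.
pos = [5, 4, 4, 3, 2]
neg = [1, 1, 2, 3, 1]
points = operating_points(pos, neg)
for cutoff, (fpr, tpr) in zip((2, 3, 4, 5), points):
    print(f"cut-off >= {cutoff}: 1-specificity={fpr:.1f}, sensitivity={tpr:.1f}")
```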
  • 5. ROC Curve: Receiver Operating Characteristic
    ▪ The ROC curve is an estimate of all possible (1 − specificity, sensitivity) pairs, obtained from these operating points (A).
      – Fitted or smoothed ROC curve (B): parametric estimation
        • A smooth curve estimated from the operating points under a binormal distribution assumption on the test results of both positive and negative cases.
      – Empirical ROC curve (C): nonparametric estimation
        • Connects all operating points with straight lines
    ▪ Why is it called ROC?
      – The term ROC refers to the performance of a human or mechanical observer (the receiver) that has to discriminate between radio signals contaminated by noise and noise alone. It was developed in the 1950s.
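The empirical (nonparametric) curve can be sketched as a piecewise-linear function through the operating points. The example below is an illustration only: the operating points are hypothetical, the helper names are not from the slides, and the trivial end points (0, 0) and (1, 1) are added by assumption.

```python
def roc_vertices_from_points(points):
    """Sort (fpr, tpr) operating points and add the end points (0, 0) and (1, 1)."""
    return sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})


def tpr_at(vertices, fpr):
    """Read sensitivity off the straight-line segments at a given 1 - specificity."""
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        # Skip vertical segments; they contribute no width along the x-axis.
        if x1 > x0 and x0 <= fpr <= x1:
            return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
    raise ValueError("fpr must lie in [0, 1]")


# Hypothetical operating points as (1 - specificity, sensitivity).
vertices = roc_vertices_from_points([(0.4, 1.0), (0.2, 0.8), (0.0, 0.6), (0.0, 0.2)])
print(vertices)
print(tpr_at(vertices, 0.3))  # sensitivity halfway between two operating points
```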
  • 6. ROC Curve: Receiver Operating Characteristic
    ▪ Even Googler …
  • 7. Area Under ROC Curve: Measure of Overall Diagnostic Performance
    ▪ AUROC or AUC
      – The average value of sensitivity over all possible values of specificity
      – The AUC takes a value between 0 and 1 and is independent of disease prevalence
      – An AUC of 1 means a perfectly accurate test, while the practical lower bound is 0.5, corresponding to random guessing.
      – The rating scheme (discrete or continuous) matters for reducing bias in the estimation of AUC
      – It can be interpreted as a figure of merit (FOM): the probability that a positive case is rated higher than a negative case.
    ▪ Frequentist Method
      – Parametric AUC
        • Obtained from the fitted ROC curve.
        • Based on some assumptions (the test results follow a binormal distribution, and the sample size is not extremely small)
      – Nonparametric AUC
        • Estimated by summing the trapezoids formed under the empirical ROC curve
        • Underestimates AUC when discrete ratings are used.
    ▪ Bayesian Method
      – Exploits a prior or a latent variable to express the unknown disease status
      – Especially useful when the ‘gold standard’ is absent or uncertain.
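The nonparametric estimate and its probabilistic reading can be checked against each other on a small example. The sketch below uses hypothetical ratings; it sums trapezoids under the empirical curve and, separately, counts the fraction of (positive, negative) pairs in which the positive case is rated higher, with ties counted as one half (a common convention, assumed here).

```python
def empirical_roc(pos, neg):
    """Vertices of the empirical ROC curve, from (0, 0) up to (1, 1)."""
    vertices = [(0.0, 0.0)]
    for c in sorted(set(pos) | set(neg), reverse=True):
        fpr = sum(r >= c for r in neg) / len(neg)
        tpr = sum(r >= c for r in pos) / len(pos)
        vertices.append((fpr, tpr))
    return vertices


def trapezoid_auc(vertices):
    """Sum of the trapezoids under consecutive vertices."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]))


def pairwise_auc(pos, neg):
    """Probability that a positive case is rated higher than a negative one."""
    score = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in neg)
    return score / (len(pos) * len(neg))


# Hypothetical five-point ratings.
pos = [5, 4, 4, 3, 2]
neg = [1, 1, 2, 3, 1]
auc_trap = trapezoid_auc(empirical_roc(pos, neg))
auc_pair = pairwise_auc(pos, neg)
print(auc_trap, auc_pair)  # the two estimates agree up to rounding
```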
  • 8. Parametric ROC Curve: Fitted ROC Curve
    ▪ Binormal Assumption
    ▪ Proof (Caution! Proof by KH, thus not guaranteed)
    http://www.navan.name/roc/
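The formulas on this slide are not in the transcript. Under the usual binormal parametrization (an assumption here, not taken from the slide), test results are N(μ_neg, σ_neg²) for negative cases and N(μ_pos, σ_pos²) for positive cases, with a = (μ_pos − μ_neg)/σ_pos and b = σ_neg/σ_pos. The fitted curve is then sensitivity = Φ(a + b·Φ⁻¹(FPR)) and its area is Φ(a/√(1 + b²)). The sketch below evaluates both with the standard library and compares the closed-form area with a numerical integral; the parameter values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def binormal_tpr(fpr, a, b):
    """Sensitivity on the binormal ROC curve at a given FPR (0 < fpr < 1)."""
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(fpr))


def binormal_auc(a, b):
    """Closed-form area under the binormal ROC curve."""
    return STD_NORMAL.cdf(a / sqrt(1 + b * b))


def numeric_auc(a, b, steps=20000):
    """Midpoint-rule integral of the curve, used only as a sanity check."""
    h = 1.0 / steps
    return sum(binormal_tpr((i + 0.5) * h, a, b) for i in range(steps)) * h


a, b = 1.0, 1.0  # hypothetical parameters
print(binormal_auc(a, b), numeric_auc(a, b))
```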
  • 9. Confidence Interval of AUC: Assessing Statistical Significance of AUC
    ▪ AUC can vary according to the sample cases.
      – Even with the same diagnostic test, the measured performance varies with the test samples.
      – We can therefore choose a range of AUC in which the true value lies with a certain degree of confidence.
      – The 95% confidence interval is often used.
    ▪ Computation of the confidence interval for AUC
      – Confidence interval: [equation not preserved in this transcript]
    J. A. Hanley and B. J. McNeil (1982) https://pubs.rsna.org/doi/pdf/10.1148/radiology.143.1.7063747
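The interval formula itself did not survive in the transcript. A commonly used large-sample form from the cited Hanley and McNeil (1982) paper is AUC ± z·SE, where SE² = [A(1 − A) + (n_pos − 1)(Q1 − A²) + (n_neg − 1)(Q2 − A²)] / (n_pos·n_neg), Q1 = A/(2 − A) and Q2 = 2A²/(1 + A). Whether the slide showed exactly this form is an assumption. A minimal sketch with hypothetical inputs:

```python
from math import sqrt


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate confidence interval for an AUC (z=1.96 gives about 95%).

    auc:   estimated area under the ROC curve (A)
    n_pos: number of positive (diseased) cases
    n_neg: number of negative (non-diseased) cases
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (auc * (1 - auc)
                + (n_pos - 1) * (q1 - auc ** 2)
                + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = sqrt(variance)
    # Note: this symmetric interval is not clipped to [0, 1].
    return auc - z * se, auc + z * se


# Hypothetical study: AUC 0.80 measured on 50 positive and 50 negative cases.
low, high = hanley_mcneil_ci(0.80, n_pos=50, n_neg=50)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

With more cases the interval narrows, which matches the slide's point that the AUC estimate depends on the sample.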
  • 10. Comparing AUCs: Comparison of Overall Diagnostic Performance
    ▪ The overall performance of different diagnostic tests can be compared using their AUCs
      – However, equal AUCs do not mean that two tests are identical.
      – The equality of two ROC curves can be statistically tested using the parameters ‘a’ and ‘b’, which completely specify the shape of the ROC curve.
    ▪ Partial AUC
      – Depending on the diagnostic situation, the full AUC may not be clinically meaningful.
      – For screening a serious disease in a high-risk group, high sensitivity is important.
      – For a disease with low prevalence and a risky subsequent confirmatory test, high specificity is important.
      – In these cases, we can set a specific FPR range (or sensitivity range) and calculate the mean sensitivity (or FPR) within that range
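A rough sketch of the partial AUC over an FPR range, computed on a piecewise-linear (empirical) curve. The vertices are hypothetical and the function name is illustrative; dividing the partial area by the width of the range gives the mean sensitivity within it.

```python
def partial_auc(vertices, fpr_lo, fpr_hi):
    """Area under a piecewise-linear ROC curve restricted to [fpr_lo, fpr_hi]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        lo, hi = max(x0, fpr_lo), min(x1, fpr_hi)
        if x1 == x0 or hi <= lo:
            continue  # vertical segment, or no overlap with the requested range
        slope = (y1 - y0) / (x1 - x0)
        y_lo = y0 + slope * (lo - x0)
        y_hi = y0 + slope * (hi - x0)
        area += (hi - lo) * (y_lo + y_hi) / 2
    return area


# Hypothetical empirical ROC vertices as (FPR, sensitivity).
vertices = [(0.0, 0.0), (0.0, 0.6), (0.2, 0.8), (0.4, 1.0), (1.0, 1.0)]
pauc = partial_auc(vertices, 0.0, 0.2)   # high-specificity region only
mean_sens = pauc / (0.2 - 0.0)           # mean sensitivity over that range
full_auc = partial_auc(vertices, 0.0, 1.0)
print(pauc, mean_sens, full_auc)
```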
  • 11. Free-response ROC: The Free-Response Task
    ▪ The Need for Extension to ROC
      – ROC can only deal with binary decisions and does not take lesion locations into account.
      – Location ROC (LROC) handles predefined regions in the image separately and computes the ROC based on the number of regions and their decisions (e.g., left or right lung, or lobe). The readers are informed that there can be at most one lesion per image.
      – Both ROC and LROC have trouble handling multiple lesions or suspicious locations in an image.
      – The region-of-interest (ROI) method is similar to LROC but treats the regions independently.
      – Neither LROC nor the ROI method can account for the correlations among regions in the same image.
      – A free-response task means the reader is given no prior information about the number of lesions in the image and is therefore free to mark any number of lesions (or none).
    ▪ Free-response ROC (FROC)
      – A plot of lesion localization performance in which the y-axis is the fraction of lesions detected and the x-axis is the number of false positives per image.
      – The most widely used plot for assessing lesion detection tasks such as lung nodule detection or liver tumor detection.
      – A mark counts as a true positive when the indicated location falls within a specified distance of a true lesion.
      – Here, the x-axis has no upper bound.
  • 12. Free-response ROC: Interpretation of FROC
    ▪ Generation of FROC Curve
      – Below, green circles denote true positives and red circles denote false positives.
      – The circles are ordered with the confidence level (z) increasing to the right.
      – Starting at the extreme right, from positive infinity, we move the cutoff to the left. Whenever we pass a green circle, we move the operating point up by 1/L, where L is the number of lesions.
      – Whenever we pass a red circle, we move the operating point right by 1/N, where N is the number of images.
    ▪ Pros and Cons of FROC
      – Pros
        • It visualizes how the rating scale is used -> ideally, the FROC curve should end in a plateau.
        • We can deal with multiple lesion marks and their corresponding ratings.
      – Cons
        • It does not account for unmarked non-diseased cases (true negatives), which make up most of the cases in many diagnostic imaging settings.
        • The x-axis is unbounded, which makes it impossible to compute a figure of merit.
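The construction described on this slide can be sketched in a few lines. The marks are hypothetical; each one is a (confidence, is_true_positive) pair, and the cutoff is lowered from the most confident mark to the least confident one.

```python
def froc_points(marks, n_lesions, n_images):
    """Trace the FROC curve by lowering the confidence cutoff mark by mark.

    marks:     list of (confidence, is_true_positive) pairs over all images
    n_lesions: total number of true lesions (L)
    n_images:  total number of images (N)
    Returns a list of (false positives per image, fraction of lesions detected).
    """
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, is_tp in sorted(marks, key=lambda m: m[0], reverse=True):
        if is_tp:
            tp += 1  # passing a true-positive mark moves the point up by 1/L
        else:
            fp += 1  # passing a false-positive mark moves it right by 1/N
        points.append((fp / n_images, tp / n_lesions))
    return points


# Hypothetical reader marks on 2 images containing 4 lesions in total.
marks = [(0.9, True), (0.8, False), (0.7, True), (0.4, False), (0.3, False)]
curve = froc_points(marks, n_lesions=4, n_images=2)
print(curve)
```

In this toy example the last point has an x value above 1, which illustrates why the x-axis of FROC has no upper bound.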
  • 13. Alternative FROC: Solving Problem of FROC by Bounding Characteristics
    ▪ AFROC Definition
      – When the x-axis of FROC is changed to the false positive fraction, the plot is called the alternative FROC, or AFROC.
      – The plot is constrained to lie within the unit square, so a figure of merit is computable.
      – However, AFROC ignores intra-image lesion correlations and is used only in limited situations.
  • 15. JAFROC: Bootstrapping and Jackknifing
    ▪ Bootstrapping
      – A method for evaluating the variance of an estimator
    ▪ Jackknifing
      – Instead of generating a set of random samples, we generate n samples of size n−1 by leaving out one observation at a time.
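A minimal sketch of the leave-one-out idea, using the standard jackknife standard-error formula (the formula is textbook material, not taken from the slide). The data and the choice of the sample mean as estimator are hypothetical; any function of a list could be passed instead, for example a figure of merit.

```python
def jackknife_se(data, estimator):
    """Jackknife standard error of `estimator` evaluated on `data`.

    Builds n leave-one-out samples of size n - 1 and measures how much
    the estimate changes across them.
    """
    n = len(data)
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out)
    return variance ** 0.5


def mean(values):
    return sum(values) / len(values)


data = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical observations
se = jackknife_se(data, mean)
print(se)  # for the mean, this equals the sample standard deviation / sqrt(n)
```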
  • 16. JAFROC: Jackknife Analysis of Free-Response ROC Data
    ▪ A method for analyzing free-response multiple-reader multiple-case (MRMC) studies.
    => Probability that a lesion rating exceeds a non-lesion rating
  • 17. JAFROC: Data Format in JAFROC Analysis Software
    ▪ Excel File Format
      – The worksheets must be named Truth, TP and FP.
      – The first row of each worksheet is reserved for data labels.
      – Truth holds the ground-truth information for each image.
      – TP = the ratings of "true positives", i.e., lesions that are correctly localized.
      – FP = the ratings of "false positives", i.e., marked normal regions.
  • 18. JAFROC: Result of JAFROC Analysis Software
  • 19. JAFROC in Clinical Applications
  • 20. JAFROC in Clinical Applications: Data
    ▪ Inclusion Criteria
      – 300 PA and lateral chest radiographs were retrospectively selected from 4 hospitals in the Netherlands (Radboud University Medical Center, University Medical Center, Academic Medical Center, Meander Medical Center)
      – Presence of a solid solitary nodule (< 30 mm in diameter, mean 16.2 mm) and availability of a PA and lateral chest radiograph and a chest CT scan obtained within 3 months (189 negative, 111 positive cases).
      – Radiographs showing signs of other diseases (except COPD) were excluded.
      – All subjects were older than 40 (44-88 years, average 65 years; 177 male, 123 female).
      – Absence of disease was ascertained by radiographs and CT scans (taken within 6 months) with negative findings.
      – To cover a wide range of lesion conspicuities, two experienced radiologists rated the visibility in consensus.
        • Category 1 (well visible), Category 2 (moderately subtle), Category 3 (subtle), Category 4 (very subtle)
      – Nodule volume was assessed using the CT scan, and the diameter was calculated assuming each nodule to be a sphere.
    ▪ Image Acquisition
      – Chest radiographs were obtained with digital x-ray devices from Agfa Healthcare, Philips Healthcare and Siemens.
    ▪ Image Processing
      – A commercially available CAD (ClearRead +Detect 5.2, Riverain Technologies) was used.
      – This CAD is optimized for the detection of nodules between 9 and 30 mm in diameter, which it marks with circles.
      – Bone suppression images (BSIs) were computed with software (ClearRead Bone Suppression 2.4, Riverain Technologies) that digitally removes ribs and clavicles.
      – Both software packages are FDA approved.
  • 21. JAFROC in Clinical Applications: Reading Method
    ▪ Readers
      – Five radiologists (5, 13, 3, 17, 17 years of experience) and three residents (2nd year, 4th year and 4th year).
      – None had experience with CAD or BSIs
    ▪ Reading Setting
      – Evaluation was performed in different randomized orders.
      – Readers reviewed the cases first without and subsequently with the use of CAD.
      – BSIs were always available.
      – A training session was provided to familiarize the readers with the software (40 cases: 22 with, 18 without nodules)
    ▪ Reading Method
      – Readers marked suspicious regions in the chest radiograph with a degree of suspicion (confidence) that a nodule was present (0 = not suspicious, 100 = definitely suspicious).
      – Readers were allowed to mark multiple regions per image and were not able to change their decisions.
      – After the first scoring phase without CAD but with BSIs, CAD marks were automatically displayed and could be toggled on/off.
      – The readers were then asked to score new regions, remove regions marked in the first phase, or change the scores of marked regions.
      – The readers were informed that at most one nodule is present in each case and that there are more normal cases than nodule cases, but they did not know the exact numbers.
  • 22. JAFROC in Clinical Applications: Statistical Analysis
    ▪ Statistics
      – Multireader multiple-case jackknife alternative free-response receiver operating characteristic (AFROC) analysis was performed.
      – A reader's finding was considered a TP finding when the marking was within 1 cm of the center of the ground-truth annotation.
      – As input for the jackknife AFROC analysis, only one reader score per image is used.
      – For cases with negative findings, the FP finding with the highest score was used.
      – For cases with positive findings, markings of nonlesion locations are ignored and only TP markings are used.
      – The AUC, which represents the probability that a lesion is rated higher than a nonlesion in a negative case, was calculated with the trapezoidal integration method (equivalent to the Wilcoxon rank-sum statistic).
      – AUCs without and with the help of CAD were compared with the Dorfman-Berbaum-Metz method (DBM MRMC, ver. 2.33)
    OR-DBM MRMC Data Format (http://perception.radiology.uiowa.edu/Software/ReceiverOperatingCharacteristicROC/MRMCAnalysis/tabid/116/Default.aspx)
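The scoring rule described on this slide can be sketched as a pairwise comparison. This is an illustration under stated assumptions, not the JAFROC software's implementation: each positive case contributes the rating of its TP mark, each negative case contributes its highest FP rating, an unmarked lesion or unmarked negative case is assigned minus infinity, and ties count as one half. All ratings below are hypothetical.

```python
UNMARKED = float("-inf")  # assumed placeholder for "the reader made no mark"


def afroc_figure_of_merit(lesion_ratings, negative_case_ratings):
    """Probability that a lesion is rated higher than a negative case.

    lesion_ratings:        TP rating per lesion (UNMARKED if the lesion was missed)
    negative_case_ratings: highest FP rating per negative case
                           (UNMARKED if the case has no marks)
    """
    total = 0.0
    for lesion in lesion_ratings:
        for negative in negative_case_ratings:
            if lesion > negative:
                total += 1.0
            elif lesion == negative:
                total += 0.5  # ties, including unmarked vs unmarked
    return total / (len(lesion_ratings) * len(negative_case_ratings))


# Hypothetical confidence scores on a 0-100 scale.
lesions = [90, 60, UNMARKED]               # one lesion was missed
negatives = [UNMARKED, 70, UNMARKED, 20]   # two negative cases had no marks
fom = afroc_figure_of_merit(lesions, negatives)
print(round(fom, 4))
```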
  • 23. JAFROC in Clinical Applications: Results
    ▪ Stand-alone CAD
      – CAD reached a sensitivity of 74% (82 of 111) at 1.0 FP mark per image (0-5 marks per image).
        • 91% (29 of 32) for well-visible nodules, 88% (28 of 32) for moderately subtle nodules.
        • 62% (18 of 29) for subtle and 39% (7 of 18) for very subtle nodules.
        • 91% (21 of 23) for nodules > 20 mm, 62% (20 of 32) for nodules between 15 and 20 mm, 77% (36 of 47) for nodules between 10 and 15 mm, 56% (5 of 9) for nodules < 10 mm
        • CAD reached an area under the AFROC curve of 0.656. CAD generated 196 FPs in the 189 cases with negative findings
    ▪ Observer Performance
      – The area under the AFROC curve for human readers was 0.812 without CAD vs 0.841 with CAD (p = 0.0001)
      – CAD detected 53% (127 of 239) of the nodules that were missed by the readers.
      – Readers dismissed 55% (70 of 127) of these TP CAD candidates.
      – CAD helped the readers to place new correct lesion labels (57) and to increase the confidence score of lesions (220).
      – CAD was counterproductive when readers placed wrong new labels (92) or increased the confidence score of nonlesions (66)
  • 24. JAFROC in Clinical Applications: Results
    ▪ Positive and Negative Effect of CAD
  • 25. JAFROC in Clinical Applications: Results
    ▪ Advantage of using CAD
  • 26. JAFROC in Clinical Applications: Results
    ▪ Examples
  • 27. JAFROC in Clinical Applications: Results
    ▪ Advantage of using CAD
  • 28. JAFROC in Clinical Applications: Discussion and Conclusion
    ▪ Conclusion
      – CAD improves observer performance for the detection of lung nodules on chest radiographs beyond the application of bone suppression alone
      – CAD detected 3 of the 7 nodules missed by all radiologists and 12 of the 25 nodules missed by most of the radiologists.
      – CAD was most helpful for moderately subtle and subtle lesions.
      – For well-visible nodules, both CAD and radiologists had high sensitivity
      – For very subtle nodules, the sensitivity of CAD was much better than that of the readers (39% vs 22%), but readers could not take advantage of CAD because of the difficulty of differentiating TPs from FPs.
      – The beneficial effect of CAD is limited by the insufficient ability of the observers to differentiate true-positive from false-positive CAD candidates.
      – The combination of CAD and bone suppression in chest radiography improves the detection of potentially early lung cancer.
    => We need a principled and clinically realistic method to assess more complex use cases of CAD (multiple diseases with lesion markings).