2. • Introduction & Overview
• Multimedia Data in Medicine
• Characteristics of Endoscopic Video
• Different Fields and Communities
• Application 1: Post-Procedural Usage of Surgery Videos
• Domain-Specific Storage for long-term Archiving
• Video Content Analysis
• Visualization, Interaction & Annotation
• Application 2: Diagnostic Decision Support
• Knowledge transfer
• Analysis
• Feedback
• Conclusions & Outlook
Agenda
ACM Multimedia 2017 Tutorial Medical Multimedia Information Systems (MMIS) 2
4. Inspections and interventions produce many kinds of data
• Medical text
• OR reports, Patient records…
• Sensor signals
• ECG, EEG, vital signs
• Medical images (radiology)
• Ultrasound, x-ray
• CT, MRI, PET, …
• Medical video
• Open surgery
• Microscopic surgery
• Endoscopic inspections
• Endoscopic surgery
Multimedia Data in Medicine
Communities:
• Signal Processing
• Medical Imaging
• Computer-Assisted Surgery / Robotics
• Multimedia
„Human EEG without alpha-rhythm“ by Andrii Cherninskyi / CC BY-SA
„Pankreatitis“ by Hellerhoff / CC BY-SA
„Ultrasound“, Public Domain
5. • Traditional open surgery ?
• Minimally invasive interventions
• Reduced trauma for patient
• Inherently available video signal
• Useful for documentation
• Microscopic surgery
Video Data Sources in Medicine
„Laparoscopy“, Public Domain
9. Domain-specific Characteristics & Challenges
• Full HD or 4K (even stereo 3D)
• Single shot recordings
• Up to multiple hours
• Homogeneous color distribution
• Visually very similar content
• Circular content area
• Restricted motion
• Geometric distortion
• Specular reflections
• Occlusions
• Smoke
• Noise, motion blur, blood, flying particles
12. Pre-Processing
• Image Enhancement
• Contrast enhancement, color misalignment correction…
• Camera calibration and distortion correction
• Specular reflection removal
• Comb structure removal & super resolution
• …
• Information Filtering
• Frame Filtering
• Image Segmentation
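As an illustration of the specular reflection step listed above, the following is a minimal sketch of highlight detection, not the method of any cited paper. It assumes that specular highlights show up as very bright, nearly colorless pixels; the function name `specular_mask` and both thresholds are hypothetical choices for this example.

```python
import colorsys

def specular_mask(image, min_value=0.9, max_saturation=0.2):
    """Return a boolean mask marking likely specular highlights.

    image: list of rows, each row a list of (r, g, b) tuples in 0..255.
    A pixel is flagged when it is very bright and nearly colorless.
    Both thresholds are illustrative assumptions, not published values.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(v >= min_value and s <= max_saturation)
        mask.append(mask_row)
    return mask

# Tiny 2x2 test frame: two near-white pixels and two reddish tissue pixels.
frame = [
    [(250, 248, 252), (180, 40, 35)],
    [(120, 30, 30), (255, 255, 255)],
]
mask = specular_mask(frame)
```

A removal step would then fill the flagged pixels from their neighborhood (inpainting), which is left out here.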
T. Stehle. Removal of specular reflections in endoscopic images. Acta
Polytechnica: Journal of Advanced Engineering, 46(4):32–36, 2006.
J. Barreto, J. Roquette, P. Sturm, and F. Fonseca. Automatic
Camera Calibration Applied to Medical Endoscopy. In 20th
British Machine Vision Conference (BMVC ’09), 2009.
B. Münzer, K. Schoeffmann, and L. Böszörmenyi. Relevance Segmentation of Laparoscopic Videos. In 2013 IEEE International Symposium on Multimedia (ISM), pages 84–91, Dec. 2013.
A. Chhatkuli, A. Bartoli, A. Malti, and T. Collins. Live image parsing in uterine laparoscopy. In IEEE International Symposium on Biomedical Imaging (ISBI), 2014.
13. Real-time Support at Intervention Time
Applications
• Diagnosis support
• Robot-assisted surgery
• Context Awareness
• Augmented reality
“Robotic surgical system”, Public Domain
T. Collins, D. Pizarro, A. Bartoli, M. Canis, and N. Bourdel. Computer-Assisted Laparoscopic myomectomy by augmenting the uterus with pre-operative MRI data. In 2014 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pages 243–248, Sept. 2014.
„Da Vinci Surgical System“ by Cmglee / CC BY-SA
Slightly modified from: M. P. Tjoa, S. M. Krishnan, et al. Feature extraction for the analysis of colon status from the endoscopic images. BioMedical Engineering OnLine, 2(9):1–17, 2003.
14. • 3D reconstruction
• Deforming tissue tracking
• Image Registration
• Instrument detection and tracking
• Surgical workflow understanding
Enabling Techniques
L. Maier-Hein, P. Mountney, A. Bartoli, H. Elhawary, D. Elson, A. Groch, A. Kolb, M. Rodrigues, J. Sorger, S. Speidel, and D. Stoyanov. Optical
techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery. Medical Image Analysis, 17(8):974–996, Dec. 2013.
S. Giannarou, M. Visentini-Scarzanella, and G. Z. Yang. Affine-invariant anisotropic detector for soft tissue tracking in minimally invasive
surgery. In Biomedical Imaging: From Nano to Macro, 2009. ISBI’09. IEEE International Symposium on, pages 1059–1062, 2009.
15. Post-Procedural Applications
Management and Retrieval
• Compression and storage
• Content-based retrieval
• Temporal video segmentation
• Video summarization
• Visualization & Interaction
Quality Assessment
• Skills assessment
• Education & Training
• Error Rating
• Assessment of intervention quality
M. Lux, O. Marques, K. Schöffmann, L. Böszörmenyi, and G. Lajtai. A novel tool for summarization of arthroscopic videos. Multimedia Tools and Applications, 46(2-3):521–544, Sept. 2009.
D. Liu, Y. Cao, W. Tavanapong, J. Wong, J. H. Oh, and P. C. de Groen. Quadrant coverage histogram: a new method
for measuring quality of colonoscopic procedures. In Engineering in Medicine and Biology Society, 2007. EMBS
2007. 29th Annual International Conference of the IEEE, pages 3470–3473, 2007.
J. Muthukudage, J. Oh, W. Tavanapong, J. Wong, and P. C. de Groen. Color Based
Stool Region Detection in Colonoscopy Videos for Quality Measurements. In Y.-S. Ho,
editor, Advances in Image and Video Technology, number 7087 in Lecture Notes in
Computer Science, pages 61–72. Springer Berlin Heidelberg, Jan. 2012.
16. • Vision
• Archive together all relevant text, image, and video data
• Use data for information retrieval
• Support surgeons at diagnosis, surgery planning, teaching, …
• Combine different kinds of data (e.g., radiology-supported surgery)
• Challenges
• Isolated systems / separation of data
• Very Big Data
• A lot of irrelevant content
• Very specific domain characteristics
• Need for domain expert knowledge
• Different communities and views
Medical Multimedia Information Systems
19. Full Storage of Endoscopic Videos
• Exemplary hospital
• 5 departments (Lap, Gyn, Arthro, GI, ENT)
• 2 operating rooms each, 4 ops/day per room, each op approx. 1-2 h
• → i.e., 40 interventions per day, each ~90 min
• 60 hours video per day!
• Assumption: HD 1920x1080, H.264/AVC
• 270 GB / day (1h=4.5 GB)
• 1.9 TB / week
• 100 TB / year (200 TB MPEG-2)
4K about twice as much!
(unless encoded with H.265/HEVC)
Great challenge for a hospital’s IT department!
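The storage estimate above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: 90 minutes per intervention (the midpoint of 1-2 h), 365 recording days per year, and decimal units (1 TB = 1000 GB). The variable names are chosen for this example only.

```python
departments = 5
rooms_per_department = 2
ops_per_room_per_day = 4
minutes_per_op = 90      # assumed midpoint of the 1-2 h range
gb_per_hour = 4.5        # HD 1920x1080, H.264/AVC, as stated on the slide

interventions_per_day = departments * rooms_per_department * ops_per_room_per_day
hours_per_day = interventions_per_day * minutes_per_op / 60
gb_per_day = hours_per_day * gb_per_hour
tb_per_week = gb_per_day * 7 / 1000      # decimal TB
tb_per_year = gb_per_day * 365 / 1000    # assumes recording every day

print(interventions_per_day, hours_per_day, gb_per_day)
print(round(tb_per_week, 1), round(tb_per_year, 1))
```

The yearly figure comes out slightly below 100 TB, which the slide rounds up.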
20. How to Reduce Storage Requirements?
Transcoding
1. Spatial compression optimization: up to 30%
2. Temporal compression optimization: up to 40%
3. Perceptual quality based optimization: up to 93%
21. Study on Video Quality
• Subjective quality assessment
• Catharina Hospital Eindhoven, NL
• 37 participants
• 19 experienced surgeons and 18 trainees
• 7 women, 30 men, average age: 40 years
• Subjective tests regarding maximum compression
1) Perceivable quality loss (Session 1)
• Double-Stimulus (ITU-R BT.500-11)
• Switch between reference and test video
2) Perceivable semantic information loss (Session 2)
• Single Stimulus (ITU-T P.910)
• Assessing random videos (incl. reference)
Münzer, B., Schoeffmann, K., Böszörmenyi, L., Smulders, J. F., & Jakimowicz, J. J. (2014, May). Investigation of the impact of compression on the perceptional quality of laparoscopic videos. In 2014 IEEE 27th International Symposium on Computer-Based Medical Systems (pp. 153-158). IEEE.
22. Assessment of Video Quality (Session 1)
[Figure: average bitrate (Kb/s) and rating difference (Difference Mean Opinion Score, DMOS) per test condition. Test conditions are crf (constant rate factor) values between 18 and 28 at resolutions 1920x1080, 1280x720, 960x540, and 640x360. Reference video: MPEG-2, HD, 20 (35) Mbit/s. Plot annotations: "subjectively better than reference" and "lossless".]
23. Assessment of Video Quality (Session 2)
1. Visually lossless with 8 Mbit/s (Q1), in comparison to 20 Mbit/s
Reduction: 60% data vs. 0% MOS
2. Good quality with 2.5 Mbit/s and reduced resolution (1280x720) (Q2)
Reduction: 88% data vs. 7% MOS
3. Acceptable quality with 1.4 Mbit/s and lower resolution (640x360) (Q3)
Reduction: 93% data vs. 31% MOS
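The data reductions quoted for the three quality levels follow directly from the bitrates. A small sketch, assuming the 20 Mbit/s reference bitrate stated on the slide; the helper name `data_reduction` is introduced only for this example.

```python
def data_reduction(reference_mbit, encoded_mbit):
    """Fraction of data saved relative to the reference bitrate."""
    return 1 - encoded_mbit / reference_mbit

reference = 20.0  # Mbit/s, reference video bitrate from the slide
levels = {"Q1": 8.0, "Q2": 2.5, "Q3": 1.4}  # Mbit/s per quality level
reductions = {label: data_reduction(reference, rate) for label, rate in levels.items()}

for label, value in reductions.items():
    print(f"{label}: {value:.1%}")
```

The Q2 value is 87.5%, which the slide rounds to 88%.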
27. Content Relevance Filtering / Instrument Recognition
Münzer, B., Schoeffmann, K., & Böszörmenyi, L. (2013, December). Relevance segmentation of laparoscopic videos. In Multimedia (ISM), 2013 IEEE International Symposium on (pp. 84-91). IEEE.
Primus, M. J., Schoeffmann, K., & Böszörmenyi, L. (2015, June). Instrument classification in laparoscopic videos. In Content-Based Multimedia Indexing (CBMI), 2015 13th International Workshop on (pp. 1-6). IEEE.
Instrument detection for content understanding
(e.g., op phase segmentation, following
instruments in robot-assisted surgery)
Out-of-patient Scenes Blurry Scenes Border Area
28. Phase Segmentation (Cholecystectomy)
Manfred J. Primus, Klaus Schoeffmann and Laszlo Böszörmenyi. “Temporal Segmentation of Laparoscopic Videos into Surgical Phases“, in
Proceedings of the 14th International Workshop on Content-Based Multimedia Indexing (CBMI 2016), Bucharest, Romania, 2016
→ Phase segmentation through instrument recognition
(color analysis, image moments, rules/heuristics)
30. Classification of OP Scene (Cataract Surgeries)
Manfred J. Primus, Doris Putzgruber-Adamitsch, Mario Taschwer, Bernd Münzer, Yosuf El-Shabrawi, Laszlo Böszörmenyi, and Klaus Schoeffmann. “Frame-Based Classification of Operation
Phases in Cataract Surgery Videos“. Proceedings of the 24th International Conference on Multimedia Modeling 2018 (MMM 2018), Bangkok, Thailand, 2018, pp. 1-12, to appear
31. Learning Medical Semantics (e.g., Surgical Actions)
1,105 segments / 823,000 frames / 9 h of annotated video (out of 111 interventions)
• Dissection – 58 segments / 35,517 frames
• Coagulation – 212 segments / 84,786 frames
• Cutting cold – 271 segments / 26,388 frames
• Cutting – 106 segments / 92,653 frames
• Hysterectomy – 25 segments / 68,466 frames
• Injection – 52 segments / 52,355 frames
• Suturing – 92 segments / 321,851 frames
• Suction & Irrigation – 173 segments / 73,977 frames
Petscharnig, S., & Schöffmann, K. (2017). Learning laparoscopic video shot classification for gynecological surgery. Multimedia Tools and Applications, 1-19.
WHY?
• structure video content,
• automatic indexing for retrieval,
• automatic supervision of surgeries
32. Deep Learning Surgical Actions
[Figure: results across confidence thresholds, from low to high]
Petscharnig, S., & Schöffmann, K. (2017). Learning laparoscopic video shot classification for gynecological surgery. Multimedia Tools and Applications, 1-19.
33. Deep Learning Surgical Actions
R: Recall, P: Precision
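To make the interplay of confidence thresholds with recall and precision concrete, here is a minimal sketch with made-up predictions. It is not the evaluation code of the cited paper; the function `precision_recall` and the sample data are hypothetical. A prediction only counts for a class when its confidence reaches the threshold.

```python
def precision_recall(predictions, target_class, threshold):
    """predictions: list of (true_label, predicted_label, confidence)."""
    tp = fp = fn = 0
    for true_label, predicted_label, confidence in predictions:
        accepted = predicted_label == target_class and confidence >= threshold
        if accepted and true_label == target_class:
            tp += 1  # correctly accepted
        elif accepted:
            fp += 1  # accepted, but the true class differs
        elif true_label == target_class:
            fn += 1  # target class missed or rejected by the threshold
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up classifier outputs for illustration only.
samples = [
    ("suturing", "suturing", 0.95),
    ("suturing", "suturing", 0.60),
    ("cutting", "suturing", 0.70),
    ("suturing", "cutting", 0.80),
    ("cutting", "cutting", 0.90),
]
low = precision_recall(samples, "suturing", 0.5)
high = precision_recall(samples, "suturing", 0.9)
```

In this toy data, raising the threshold increases precision and lowers recall, which is the usual trade-off.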
35. Automatic Smoke Detection
Achievable Performance with Saturation Peak Analysis (SPA)
Andreas Leibetseder, Manfred J. Primus, Stefan Petscharnig, and Klaus Schoeffmann. “Image-based Smoke Detection in Laparoscopic Videos“. Proceedings of Computer Assisted and Robotic Endoscopy and Clinical Image-Based
Procedures: 4th International Workshop, CARE 2017, and 6th International Workshop, CLIP 2017, held in Conjunction with MICCAI 2017, Quebec City, QC, Canada, September 14, 2017, pp. 70-87
36. Automatic Smoke Detection - Performance
20K images (DS A)
10K images (DS A)
4.5K images (DS B)
SPA: Saturation Peak Analysis
GLN RGB: GoogLeNet using RGB images
GLN SAT: GoogLeNet using saturation only images
Deep Learning
Andreas Leibetseder, Manfred J. Primus, Stefan Petscharnig, and Klaus Schoeffmann. “Image-based Smoke Detection in Laparoscopic Videos“. Proceedings of Computer Assisted and Robotic Endoscopy and Clinical Image-Based
Procedures: 4th International Workshop, CARE 2017, and 6th International Workshop, CLIP 2017, held in Conjunction with MICCAI 2017, Quebec City, QC, Canada, September 14, 2017, pp. 70-87
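The idea behind a saturation-based smoke check can be sketched as follows. This is a simplified stand-in and not the Saturation Peak Analysis of the cited paper: it only assumes that smoke washes out colors, so the saturation histogram of a smoky frame peaks in the low bins. The function names and the `peak_bin_limit` threshold are hypothetical.

```python
import colorsys

def saturation_histogram(pixels, bins=10):
    """Histogram of HSV saturation for a list of (r, g, b) pixels in 0..255."""
    hist = [0] * bins
    for r, g, b in pixels:
        _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hist[min(int(s * bins), bins - 1)] += 1
    return hist

def looks_smoky(pixels, bins=10, peak_bin_limit=3):
    """Flag a frame when the histogram peak sits in the low-saturation bins.

    peak_bin_limit is a hypothetical threshold, not a value from the paper.
    """
    hist = saturation_histogram(pixels, bins)
    peak_bin = hist.index(max(hist))
    return peak_bin < peak_bin_limit

# Synthetic frames: mostly saturated red tissue vs. mostly grayish haze.
clear_frame = [(180, 40, 35)] * 90 + [(200, 190, 190)] * 10
smoky_frame = [(170, 150, 150)] * 80 + [(180, 40, 35)] * 20
```

Working on saturation alone is also what the "GLN SAT" variant feeds to the network, according to the legend above.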
37. Real-Time Smoke Detection Prototype
Andreas Leibetseder, Manfred J. Primus, Stefan Petscharnig, and Klaus Schoeffmann. “Image-based Smoke Detection in Laparoscopic Videos“. Proceedings of Computer Assisted and Robotic Endoscopy and Clinical Image-Based
Procedures: 4th International Workshop, CARE 2017, and 6th International Workshop, CLIP 2017, held in Conjunction with MICCAI 2017, Quebec City, QC, Canada, September 14, 2017, pp. 70-87
41. Special Interaction Tools
Marco A. Hudelist, Sabrina Kletz, and Klaus Schoeffmann. 2016. A Multi-Video Browser for Endoscopic Videos on Tablets. In Proceedings of the 2016 ACM on Multimedia Conference (MM '16). ACM, New York, NY, USA, 722-724.
Marco A. Hudelist, Sabrina Kletz, and Klaus Schoeffmann. 2016. A Tablet Annotation Tool for Endoscopic Videos. In Proceedings of the 2016 ACM on Multimedia Conference (MM '16). ACM, New York, NY, USA, 725-727.
42. Surgical Quality Assessment (SQA) Software
• Integrating rating features
• More efficient video navigation/browsing
Marco A. Hudelist, Heinrich Husslein, Bernd Muenzer, Sabrina Kletz and Klaus Schoeffmann. “A Tool to Support Surgical Quality Assessment“,
in Proceedings of the Third IEEE International Conference on Multimedia Big Data (BigMM), Laguna Hills, CA, USA, 2017, pp. 238-239.
46. • Medical knowledge transfer: needs data with ground truth
• High detection accuracy
• Fast and efficient: real-time feedback and large scale
• Fit the normal examination procedures
• Adhere to ethical, legal, privacy challenges & regulations
Key Challenges & Requirements
48. • Many types of diseases can potentially affect the human gastrointestinal (GI) tract – the digestive system
• About 2.8 million new luminal GI cancers (esophagus, stomach, colorectal) are detected yearly
• the mortality is about 65%
• Screening of the GI tract using different types of endoscopy…
• is costly (colonoscopy, according to the NY Times: $1,100 per patient, $10 billion)
• consumes valuable medical personnel time (1-2 hours)
• does not scale to large populations
• is intrusive to the patient
• …
• Current technology may potentially enable automatic algorithmic screening and assisted examinations
→ a true interdisciplinary activity with high chances of societal impact
GI Tract Challenges and Potential
49. WHO: Colorectal Cancer Mortality 2012
[Figure panels: Women, Men]
Colorectal cancer is the third most common cause of cancer mortality for both women and men. It is a condition where early detection is important for survival: the 5-year survival probability goes from a low 10-30% if detected in later stages to a high 90% if detected in early stages.
Colonoscopy is not the ideal screening test. On average, 20% of polyps (possible predecessors of cancer) are missed or incompletely removed. The risk of getting cancer largely depends on the endoscopist's ability to detect and remove polyps. A 1% increase in detection can decrease the risk of cancer by 3%.
50. Live Automatic Detection
• System to assist doctors during
live endoscopy procedures
• detection accuracy depends on experience and skills
• have a “second eye”, “better” detection
• automatic tagging, annotation of lesions
• Better procedure for documentation,
automatic report generation
56. • Simple and efficient
• Web-based
• Assisted object tracking
Video Annotation Subsystem
"Expert Driven Semi-Supervised Elucidation Tool for Medical Endoscopic Videos"
Zeno Albisser, et al.
Proceedings of MMSys, Portland, OR, USA, March 2015
57. • For large collection of images
• VV / Kvasir dataset
• Fully cleaned
• Feature extraction
mechanisms
• Different unsupervised
clustering algorithms
• Hierarchical image collection
visualization
• Open source: ClusterTag
https://bitbucket.org/mpg_projects/clustertag
ClusterTag: Image Clustering and Tagging Tool
"ClusterTag: Interactive Visualization, Clustering and Tagging Tool for Big Image Collections"
Konstantin Pogorelov, et al.
Proceedings of ICMR, Bucharest, Romania, June 2017
58. • Multi-Class Image Dataset for Computer Aided GI Disease Detection
• GI endoscopy images
• Some images contain the position and configuration of the endoscope (scope guide)
• 8 different anomalies and anatomical landmarks
• v1: 500 images per class, 6 pre-extracted global features
• v2: 1000 images per class
• New information added in the future: http://datasets.simula.no/kvasir/
The Kvasir Dataset
"Kvasir: A Multi-Class Image-Dataset for Computer Aided Gastrointestinal Disease Detection"
Konstantin Pogorelov, et al.
Proceedings of MMSYS, Taiwan, June 2017
59. • Bowel Preparation Quality Video
• 21 GI endoscopy videos of colon
• Some frames contain the position and configuration of the endoscope (scope guide)
• 4 classes showing four-score BBPS-defined bowel-preparation quality
• 0 - very dirty
• …
• 3 - very clean
• http://datasets.simula.no/nerthus/
The Nerthus Dataset
"Nerthus: A Bowel Preparation Quality Video Dataset"
Konstantin Pogorelov, et al.
Proceedings of MMSYS, Taiwan, June 2017
62. State-of-The-Art: Some Example Detection Systems
Polyp-Alert
• detects polyps using edges and texture
• near real-time feedback during colonoscopy (10 fps)
• detected 97.7% (42 of 43) of polyp shots in 53 randomly selected videos (not per-frame detection)
• only 4.3% of a full-length colonoscopy procedure wrongly marked
• one of the few end-to-end systems
• Wallapak Tavanapong – from MM community
63. • Feature extraction using open-source LIRE (Lucene Image Retrieval)
• Indexer:
• Indexing images by LIRE features for “training”
• Classifier:
• Built-in benchmarking functionality
• Output to console & JSON / HTML
• Verified with different datasets and use cases, e.g.,
life-logging, recommender systems, network analysis, etc.
• Open source project – OpenSea
Global Features (GF)-Based Detection
"EIR - Efficient Computer Aided Diagnosis Framework for Gastrointestinal Endoscopies"
Michael Riegler, et al.
Proceedings of CBMI, Bucharest, Romania, June 2016
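One way to read the indexer/classifier split above is as a search-based classifier: "training" stores labeled feature vectors, and classification looks up the nearest stored vectors. The sketch below shows that reading with a plain k-nearest-neighbor vote in Python. It is an assumption-laden illustration, not the EIR implementation, and it does not use LIRE (a Java library); the three-dimensional feature vectors are made up.

```python
from collections import Counter
import math

def build_index(labeled_features):
    """'Training' is just storing (feature_vector, label) pairs."""
    return list(labeled_features)

def classify(index, query, k=3):
    """Label a query by majority vote among its k nearest indexed vectors."""
    ranked = sorted(index, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Made-up global feature vectors; real descriptors are much longer.
index = build_index([
    ([0.9, 0.1, 0.2], "polyp"),
    ([0.8, 0.2, 0.1], "polyp"),
    ([0.7, 0.3, 0.2], "polyp"),
    ([0.1, 0.9, 0.8], "normal"),
    ([0.2, 0.8, 0.9], "normal"),
    ([0.3, 0.7, 0.7], "normal"),
])
label = classify(index, [0.85, 0.15, 0.15])
```

Because there is no model fitting, adding new labeled images only means extending the index.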
67. • Tensorflow as backend
• Based on Inception v3
• Last layers removed
• Model retrained on medical data
• Applying simple transformations to increase
size of training set
• Very long training time
• Applying model is fast
Basic CNN-Based Detection
"Efficient disease detection in gastrointestinal videos - global features versus neural networks"
Konstantin Pogorelov, et al.
Multimedia Tools and Applications, 2017
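The "simple transformations" used to enlarge the training set are not specified on the slide. A common choice is flips and 90-degree rotations, sketched here on a tiny image stored as a list of rows. The function names are hypothetical, and which transformations were actually used is an assumption.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]

def flip_vertical(image):
    """Reverse the order of the rows."""
    return [list(row) for row in reversed(image)]

def rotate_90(image):
    """Rotate clockwise by 90 degrees."""
    return [list(row) for row in zip(*reversed(image))]

def augment(image):
    """Return the original plus three transformed copies."""
    return [
        [list(row) for row in image],
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90(image),
    ]

tiny = [[1, 2],
        [3, 4]]
variants = augment(tiny)
```

Each labeled image yields four training samples here, all sharing the original label.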
71. • Process only frames containing polyps
• Performs image enhancement
• Detects curve-shaped objects and local maxima
• Builds energy map and selects
4 possible locations
• Localization performance:
• recall 31.83 %,
• precision 32.07%
• ~30 fps
• Later, with a better GPU: ~75 fps (detection: 300 fps; localization: 100 fps)
ASU Mayo Dataset: First Try for Polyp Localization
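The step of selecting 4 possible locations from an energy map can be illustrated with a greedy pick of the highest-energy cells that keeps candidates apart. This is a sketch of the general idea only; how the original system builds the map and separates candidates is not stated, so `min_distance` and the function name are hypothetical.

```python
def top_locations(energy, count=4, min_distance=2):
    """Pick up to `count` highest-energy cells that are not too close together.

    energy: 2D list of numbers. min_distance (Chebyshev distance, in cells)
    is a hypothetical suppression radius.
    """
    cells = sorted(
        ((value, y, x) for y, row in enumerate(energy) for x, value in enumerate(row)),
        reverse=True,
    )
    chosen = []
    for value, y, x in cells:
        # Skip cells that fall within the suppression radius of a chosen one.
        if all(max(abs(y - cy), abs(x - cx)) >= min_distance for cy, cx in chosen):
            chosen.append((y, x))
        if len(chosen) == count:
            break
    return chosen

# Made-up 5x5 energy map with four distinct peaks.
energy_map = [
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.2, 0.9, 0.3, 0.1, 0.8],
    [0.1, 0.3, 0.2, 0.1, 0.2],
    [0.0, 0.1, 0.1, 0.7, 0.1],
    [0.6, 0.1, 0.0, 0.1, 0.1],
]
candidates = top_locations(energy_map)
```

The result is a list of (row, column) positions ordered by decreasing energy.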
72. • Vestre Viken (VV) multi-disease dataset (250 images per class)
• GF:
• recall 90.60 %
• precision 91.40%
• fps ~30
• CNN:
• recall: 87.20%
• precision: 87.90%
• fps: ~30
VV Dataset: Multi-Disease Detection
"Efficient disease detection in gastrointestinal videos - global features versus neural networks"
Konstantin Pogorelov, et al.
Multimedia Tools and Applications, 2017
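For a single-number comparison of the two approaches, recall and precision can be combined into an F1 score (their harmonic mean). The sketch below applies the standard formula to the values listed above. Note the assumption: if the reported figures are averages over classes, an F1 computed from them is only an approximation of the averaged per-class F1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (inputs as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gf_f1 = f1_score(0.9140, 0.9060)   # global features
cnn_f1 = f1_score(0.8790, 0.8720)  # CNN

print(f"GF: {gf_f1:.3f}, CNN: {cnn_f1:.3f}")
```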
73. • GF
• CNN
VV Dataset: Multi-Disease Detection
"Efficient disease detection in gastrointestinal videos - global features versus neural networks"
Konstantin Pogorelov, et al.
Multimedia Tools and Applications, 2017
74. • 7 different algorithms
• Convolutional neural networks (CNN) (2) – trained from scratch
• 3-layers
• 6-layers
• Transfer learning (1) – retrained Inception v3
• Global features (4)
• 2 global features (JCD, Tamura)
• 6 global features (JCD, Tamura, Color Layout, Edge Histogram, Auto Color Correlogram and PHOG)
• 2 different algorithms (Random forest and logistic model tree)
• 2 baselines
• Random Forest with one global feature
• Majority class
• 2-fold cross-validation
Kvasir Dataset v1: Multi-Disease Detection
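The evaluation protocol above (2-fold cross-validation with a majority-class baseline) can be sketched in a few lines. This is an illustrative toy, not the evaluation code used for the reported results; the interleaved split and the label list are assumptions made for the example.

```python
from collections import Counter

def majority_class(labels):
    """Most frequent label in the training fold."""
    return Counter(labels).most_common(1)[0][0]

def two_fold_accuracy(labels):
    """Split labels into two folds; train on one, test on the other, then swap."""
    fold_a, fold_b = labels[0::2], labels[1::2]
    accuracies = []
    for train, test in ((fold_a, fold_b), (fold_b, fold_a)):
        guess = majority_class(train)
        accuracies.append(sum(label == guess for label in test) / len(test))
    return sum(accuracies) / len(accuracies)

# Made-up, imbalanced two-class label list.
labels = ["polyp"] * 6 + ["normal"] * 4
score = two_fold_accuracy(labels)
```

On a balanced set with 8 classes, such a constant prediction is correct for 1 in 8 images, which is why it serves as a lower bound for the other methods.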
78. • Using the same GF and some new deep features, i.e., from
• Inception v3 and ResNet50 models pre-trained on the ImageNet dataset
• Used different ML classifiers:
• random tree (RT)
• random forest (RF)
• logistic model tree (LMT) – performed best
• Uses weights of 1000 pre-defined concepts as features
• Top-layer input as feature vector (16384 for Inception v3 and 2048 for ResNet50)
Kvasir Dataset v1 → v2: Multi-Disease Detection
[Figure: pretrained model → output or top-layer input weights → WEKA for classification]
Team | Approach | F1 | FPS
SCL-UMD | Global-feature and deep-feature extraction, Inception-V3 and VGGNet CNN models, followed by machine-learning-based classification using RT, RF, SVM and LMT classifiers | 0.848 | 1.3
FAST-NU-DS | Global and local features combined, followed by data size reduction by applying K-means clustering and then using a logistic regression model for the classification | 0.767 | 2.3
ITEC-AAU | Two different custom Inception-like CNN models | 0.755 | 1.4
HKBU | A manifold learning method (bidirectional marginal Fisher analysis) learning a compact representation of the data, then a machine-learning-based multi-class support vector machine is used for the classification | 0.703 | 2.2
SIMULA | GF extraction, ResNet50 and Inception-V3 CNN models, followed by machine-learning-based classification using RT, RF and LMT classifiers | 0.826 | 46.0
79. • 7 different algorithms
• Convolutional neural networks (CNN) (2) – trained from scratch
• 3-layers
• 6-layers
• Transfer learning (1) – retrained Inception v3
• Global features (4)
• 2 global features (JCD, Tamura)
• 6 global features
(JCD, Tamura, Color Layout, Edge Histogram, Auto Color Correlogram and PHOG)
• 2 different algorithms (Random forest and logistic model tree)
• 2 baselines
• Random Forest with one global feature
• Majority class
• 2-fold cross-validation
Nerthus Dataset: Bowel Cleanness Level
82. • Too little data
• Blurry images due to camera motion
• Objects too close to camera
• Under- or over-lit scenes
• Flares
• Artificial objects and natural “contaminations”
• Low resolution of capsule endoscopes
• …
Data Challenges: Preprocessing
87. • Polyps
• Input:
Camera or Video files
• Output:
Live stream and
Performance reports
• Full HD
• Real-time: 30 FPS
Real-time Detection Feedback
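Whether a pipeline meets the 30 FPS real-time target is a matter of per-frame time budget. The sketch below assumes that the stages run one after another on each frame, so their per-frame times add up. As an example it uses the stage rates quoted earlier for polyp localization (detection at 300 fps, localization at 100 fps); the function names are introduced for this example only.

```python
def frame_budget_ms(target_fps):
    """Time available per frame at the target rate, in milliseconds."""
    return 1000.0 / target_fps

def pipeline_fps(stage_rates_fps):
    """Throughput when stages run one after another on each frame."""
    seconds_per_frame = sum(1.0 / rate for rate in stage_rates_fps)
    return 1.0 / seconds_per_frame

budget = frame_budget_ms(30)          # about 33.3 ms per frame
combined = pipeline_fps([300, 100])   # detection, then localization
meets_real_time = combined >= 30

print(f"budget: {budget:.1f} ms, combined: {combined:.1f} fps, ok: {meets_real_time}")
```

Under this sequential assumption the two stages combine to about 75 fps, which is consistent with the ~75 fps quoted for the faster GPU and leaves headroom above the 30 FPS target.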