SlideShare ist ein Scribd-Unternehmen logo
1 von 53
On Integrating Information Visualization
Techniques into Data Mining: A Review
by
Keqian Li (2015)
Sushant Gautam
MSIISE
Department of Electronics and Computer Engineering
Institute of Engineering, Thapathali Campus
6 April 2022
Background
● exploding growth of digital data in the information era
○ immensurable potential value
● Need for different types of data-driven techniques
○ exploit available data
● Datamining and Information Visualization
○ different communities/approaches of problem solving
● Combining:
○ DM: sophisticated algorithmic techniques
○ IV: intuitively and interactivity
2
Stages of typical process of data mining
● Preliminary data analysis
● Model construction
● Model evaluation
3
VISUALIZATION FOR PRELIMINARY DATA
ANALYSIS
4
PRELIMINARY DATA ANALYSIS
● Light-weight before main data mining algorithm
● show the basic characteristic of data
● quick grasp of intuitive
● formulate further strategy of heavy weighted analysis
○ facilitate further data mining approach
● visualization of preliminary analysis of data
○ visual representation, arrangement and simple manipulation
5
Scalar value data visualization
● Scalar value distribution
○ Box and whiskers plot🖼, Variable width box plots, histograms
○ Bagplot and Functional boxplot (envelope) 🖼
● visualizing scalar value along other dimensions
○ study the relation between the scalar data & other dimensions
■ time series analysis for stock price
○ Scatter plots🖼, spatial index
6
7
same distribution with boxplot and
probability density function
2d scatterplot with contour and a
calculated regression line
8
Multi-dimensional data visualization
● Glyph 🖼 based vector visualization
○ Vector field or even a tensor field
○ Chernoff face 🖼
● Similarly encoding each dimension
○ Scatterplot matrix
○ Parallel Scatterplot Matrix
○ Parallel Coordinates techniques
● 3D Visual Data Mining
○ 3D Parallel Coordinates Trees🖼
○ Equalized Density Surfaces 🖼 9
3D Scatterplots
10
example of glyphs
(a) Stick figures form textural patterns
(b) Dense glyph packing for diffusion tensor data
(c) Helix glyphs on maps for analyzing cyclic
temporal patterns for two diseases
(d) The local flow probe can simultaneously depict a
multitude of different variables
3D-Parallel-CoordinateTrees
11
12
Density-equalization on a hexagon
Density-equalization on a 3D cube
Bipartite flow visualization
Hierarchical heavy hitters
NFlowVis: 🖼 network flow between local hosts and external
IPs
14
Computed Lattice for a
bipartite relationship
3D visualization for
binary heavy hitters
15
Nflowvis for network traffic analysis
Network Data
- online social media, physical/biological interaction network
- UCINet, JUNG, and GUESS, GraphViz, Pajek, Visone
● Node-link diagrams
○ Vizster 🖼
● Adjacency matrix view
○ MatLink 🖼, NodeTrix 🖼
16
17
Vizster for
attribute search
enhanced
node–link
diagrams
18
MatLink for links and highlight enhanced matrix view
NodeTrix using matrix view for local communities'
analysis
Text Data
WWW: unstructured data beyond the multidimensional
framework described above
● bag of words
● Analysis on high dimensional space
○ ThemeRiver 🖼, The BETA system 🖼, The TextArc 🖼
● Analysis with dimension reduction
○ Navigating Wikipedia with a topic model 🖼
○ Starlight’s text visualization system 🖼
19
20
StreamGraph*: kind of used in ThmeRiver
system using the word frequency vector to
analyze the text
BETA system using entity-based analysis to help the
search query visualization
21
TextArc system for
concordance visualization
22
one topics from a dynamic topic model fit to Science
from 1880 to 2002 WEBSOM text visualization system with
SOM dimension reduction
23
Architecture of WebSOM method
24
Navigating Wikipedia: Navigating flow based on the topic model fit to Wikipedia
25
Starlight’s text
visualization system
using TRUST for
dimension reduction
Boeing Text
Representation
Using Subspace
Transformation
VISUALIZATION FOR MODEL
CONSTRUCTION
26
VISUALIZATION FOR MODEL CONSTRUCTION
● Data mining: highly automated model
● marrying the classical techniques with the idea of info vis
○ stimulated the research on Interactive model construction
○ visualization is provided for partial results of the current model
○ user can provide feedback to change the behavior of the process.
○ incorporate automatic analysis with the human wisdom
○ greatly increase interpretability of the model
27
Progressive construction
step-by-step construction
● Weka: Visual decision tree construction 🖼
● PBC: Perception Based Classification
● HDEye system for visual clustering 🖼
28
30
Weka: Visual decision tree construction
31
the HDEye system for visual clustering
Iterative prototyping
give feedback to a visual prototype of the trained model
● Seifert et.al.
● the Ensemble Matrix interface
32
33
The system by Seifert
et.al. The classification
probability view allows
users to select a falsely
classified item to the
right class and retrain
the classifier.
34
Primary view in EnsembleMatrix. Confusion matrices of component classifiers are shown in
thumbnails on the right. The matrix on the left shows the confusion matrix of the current ensemble
classifier built by the user.
Interactive pipeline construction
● Each submodule as black box
● visualization system for users to assemble submodules
● Weka
● RapidMiner, KNIME, STATISTICA, Orange
35
36
The Overview Panel in Rapidminer
system shows the entire data mining
pipeline.
Weka Panel
VISUALIZATION FOR MODEL EVALUATION
37
VISUALIZATION FOR MODEL EVALUATION
● visually convey the results of a mining task
○ such as classification, clustering
● answer the question
○ what the conclusions are
○ why a model makes such conclusions
○ how good they perform
● critical for human makers to analyze model behavior
38
Classification
● Test oriented visualization
○ ROC curves
○ Contingency Wheel++
● Instance space-based visualization
○ SPLOM based visualization tool 🖼
○ Projection techniques based on self-organizing maps (SOM)
● Factor analysis
○ Nomograms
○ CViz for multivariate data visualization
39
40
Byklov et. al’s tool showing the scatterplot matrix for an iteration
41
Contingency Wheel++
Alsallakh et.al’s tool, showing
(a) the confusion wheel
(b) the feature analysis view
(c, d) histograms and scatterplots for the separability of the selected features
42
Rheingans et.al’s tool, showing SOM
probability map with test instances
43
Nomograph showing the nonlinearities in the Pressure-
Temperature relation. Demo tool Scary Nomograph
Clustering
● H-BLOB clustering visualization
● OPSM biclustering method
● Hierarchical Clustering Explorer
44
45
Bicluster visualization, showing the overlapping,
similarity and separation for clusters and entities
The H-BLOB visualizes the clusters by
computing a hierarchy of implicit surfaces
46
HCE(Hierarchical
Clustering
Explorer)
Association Rules
● Double-Decker plots
● Parallel Coordinates
● Network
● TwoKey plot
47
48
(a) Doubledecker plot for the
OvaryCancer data showing the
conditional distribution of X-ray, given
operation, given stage, and with
survival highlighted.
49
(b) Parallel Coordinates
(c) Network (d) TwoKey plot
Neuron Network
● SOM
○ U-matrix
○ Distance Matrix
○ Similarity coloring
● Visualization of neurons
○ semantic conceptual features: stimuli
50
51
Different techniques for showing cluster structure among SOM cells.
52
Visualization of neurons shows evolution of a randomly chosen neurons in different layers through the training
process. Each layer’s features are displayed in a different horizontal block
Deep learning for Building
High-level Features
Graphical Probabilistic Models
• capturing dependencies among random variables
• building large-scale multivariate statistical models
• nodes encodes a variable, link encodes correlation.
Fluxflow 🖼
53
54
The FluxFlow system visualizes the hidden states inside the probabilistic model
Conclusions
● Surveyed integrating the wisdom in two fields towards
more effective and efficient data analytic
● Info Viz compliments Data Mining in all of the three key
stages.
55

Weitere ähnliche Inhalte

Was ist angesagt?

Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImageryRAHUL BHOJWANI
 
Weakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudWeakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudArithmer Inc.
 
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...lauratoni4
 
Improving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computingImproving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computingRAHUL BHOJWANI
 
3 fmewt19 - automating quality control
3   fmewt19 - automating quality control3   fmewt19 - automating quality control
3 fmewt19 - automating quality controlConsortech
 
Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...
Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...
Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...Tiziano De Matteis
 
Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms cscpconf
 
A New Key Stream Generator Based on 3D Henon map and 3D Cat map
A New Key Stream Generator Based on 3D Henon map and 3D Cat mapA New Key Stream Generator Based on 3D Henon map and 3D Cat map
A New Key Stream Generator Based on 3D Henon map and 3D Cat maptayseer Karam alshekly
 
Benchmarking Tool for Graph Algorithms
Benchmarking Tool for Graph AlgorithmsBenchmarking Tool for Graph Algorithms
Benchmarking Tool for Graph AlgorithmsYash Khandelwal
 
Heterogeneous data fusion with multiple kernel growing self organizing maps
Heterogeneous data fusion with multiple kernel growing self organizing mapsHeterogeneous data fusion with multiple kernel growing self organizing maps
Heterogeneous data fusion with multiple kernel growing self organizing mapsPruthuvi Maheshakya Wijewardena
 
markov chain map generation
markov chain map generationmarkov chain map generation
markov chain map generationQasim Raheem
 
Nas net where model learn to generate models
Nas net where model learn to generate modelsNas net where model learn to generate models
Nas net where model learn to generate modelsKhang Pham
 

Was ist angesagt? (15)

Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
 
Weakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudWeakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloud
 
Gnn overview
Gnn overviewGnn overview
Gnn overview
 
YOLACT
YOLACTYOLACT
YOLACT
 
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
 
Improving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computingImproving access to satellite imagery with Cloud computing
Improving access to satellite imagery with Cloud computing
 
3 fmewt19 - automating quality control
3   fmewt19 - automating quality control3   fmewt19 - automating quality control
3 fmewt19 - automating quality control
 
Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...
Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...
Parallel Patterns for Window-based Stateful Operators on Data Streams: an Alg...
 
Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms
 
A New Key Stream Generator Based on 3D Henon map and 3D Cat map
A New Key Stream Generator Based on 3D Henon map and 3D Cat mapA New Key Stream Generator Based on 3D Henon map and 3D Cat map
A New Key Stream Generator Based on 3D Henon map and 3D Cat map
 
Benchmarking Tool for Graph Algorithms
Benchmarking Tool for Graph AlgorithmsBenchmarking Tool for Graph Algorithms
Benchmarking Tool for Graph Algorithms
 
Heterogeneous data fusion with multiple kernel growing self organizing maps
Heterogeneous data fusion with multiple kernel growing self organizing mapsHeterogeneous data fusion with multiple kernel growing self organizing maps
Heterogeneous data fusion with multiple kernel growing self organizing maps
 
markov chain map generation
markov chain map generationmarkov chain map generation
markov chain map generation
 
Nas net where model learn to generate models
Nas net where model learn to generate modelsNas net where model learn to generate models
Nas net where model learn to generate models
 
BarnieMAT
BarnieMATBarnieMAT
BarnieMAT
 

Ähnlich wie On Integrating Information Visualization Techniques into Data Mining: A Review by Keqian Li (2015)

Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...miyurud
 
Visual Search and Question Answering II
Visual Search and Question Answering IIVisual Search and Question Answering II
Visual Search and Question Answering IIWanjin Yu
 
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...thanhdowork
 
Memory Efficient Graph Convolutional Network based Distributed Link Prediction
Memory Efficient Graph Convolutional Network based Distributed Link PredictionMemory Efficient Graph Convolutional Network based Distributed Link Prediction
Memory Efficient Graph Convolutional Network based Distributed Link Predictionmiyurud
 
Attentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsAttentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsSangmin Woo
 
How Graph Technology is Changing AI
How Graph Technology is Changing AIHow Graph Technology is Changing AI
How Graph Technology is Changing AIDatabricks
 
Implementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataImplementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataeSAT Publishing House
 
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...lauratoni4
 
Exploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfExploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfRushikeshKulkarni71
 
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptxthanhdowork
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworkstm1966
 
Обучение нейросетей компьютерного зрения в видеоиграх
Обучение нейросетей компьютерного зрения в видеоиграхОбучение нейросетей компьютерного зрения в видеоиграх
Обучение нейросетей компьютерного зрения в видеоиграхAnatol Alizar
 
Representation Learning on Complex Graphs
Representation Learning on Complex GraphsRepresentation Learning on Complex Graphs
Representation Learning on Complex GraphseXascale Infolab
 

Ähnlich wie On Integrating Information Visualization Techniques into Data Mining: A Review by Keqian Li (2015) (20)

Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
Scalable Graph Convolutional Network Based Link Prediction on a Distributed G...
 
Vivarana literature survey
Vivarana literature surveyVivarana literature survey
Vivarana literature survey
 
Visual Search and Question Answering II
Visual Search and Question Answering IIVisual Search and Question Answering II
Visual Search and Question Answering II
 
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
 
Memory Efficient Graph Convolutional Network based Distributed Link Prediction
Memory Efficient Graph Convolutional Network based Distributed Link PredictionMemory Efficient Graph Convolutional Network based Distributed Link Prediction
Memory Efficient Graph Convolutional Network based Distributed Link Prediction
 
Attentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsAttentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene Graphs
 
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
 
How Graph Technology is Changing AI
How Graph Technology is Changing AIHow Graph Technology is Changing AI
How Graph Technology is Changing AI
 
Implementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big dataImplementation of p pic algorithm in map reduce to handle big data
Implementation of p pic algorithm in map reduce to handle big data
 
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
Graph Signal Processing for Machine Learning A Review and New Perspectives - ...
 
The Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention ModelsThe Importance of Time in Visual Attention Models
The Importance of Time in Visual Attention Models
 
algorithms
algorithmsalgorithms
algorithms
 
Exploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfExploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdf
 
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
[20240408_LabSeminar_Huy]PivotalSTGNN.pptx
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
 
Data visualization
Data visualizationData visualization
Data visualization
 
Graph db
Graph dbGraph db
Graph db
 
2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
 
Обучение нейросетей компьютерного зрения в видеоиграх
Обучение нейросетей компьютерного зрения в видеоиграхОбучение нейросетей компьютерного зрения в видеоиграх
Обучение нейросетей компьютерного зрения в видеоиграх
 
Representation Learning on Complex Graphs
Representation Learning on Complex GraphsRepresentation Learning on Complex Graphs
Representation Learning on Complex Graphs
 

Kürzlich hochgeladen

Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 

Kürzlich hochgeladen (20)

Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 

On Integrating Information Visualization Techniques into Data Mining: A Review by Keqian Li (2015)

  • 1. On Integrating Information Visualization Techniques into Data Mining: A Review by Keqian Li (2015) Sushant Gautam MSIISE Department of Electronics and Computer Engineering Institute of Engineering, Thapathali Campus 6 April 2022
  • 2. Background ● exploding growth of digital data in the information era ○ immensurable potential value ● Need for different types of data-driven techniques ○ exploit available data ● Datamining and Information Visualization ○ different communities/approaches of problem solving ● Combining: ○ DM: sophisticated algorithmic techniques ○ IV: intuitively and interactivity 2
  • 3. Stages of typical process of data mining ● Preliminary data analysis ● Model construction ● Model evaluation 3
  • 5. PRELIMINARY DATA ANALYSIS ● Light-weight before main data mining algorithm ● show the basic characteristic of data ● quick grasp of intuitive ● formulate further strategy of heavy weighted analysis ○ facilitate further data mining approach ● visualization of preliminary analysis of data ○ visual representation, arrangement and simple manipulation 5
  • 6. Scalar value data visualization ● Scalar value distribution ○ Box and whiskers plot🖼, Variable width box plots, histograms ○ Bagplot and Functional boxplot (envelope) 🖼 ● visualizing scalar value along other dimensions ○ study the relation between the scalar data & other dimensions ■ time series analysis for stock price ○ Scatter plots🖼, spatial index 6
  • 7. 7 same distribution with boxplot and probability density function 2d scatterplot with contour and a calculated regression line
  • 8. 8
  • 9. Multi-dimensional data visualization ● Glyph 🖼 based vector visualization ○ Vector field or even a tensor field ○ Chernoff face 🖼 ● Similarly encoding each dimension ○ Scatterplot matrix ○ Parallel Scatterplot Matrix ○ Parallel Coordinates techniques ● 3D Visual Data Mining ○ 3D Parallel Coordinates Trees🖼 ○ Equalized Density Surfaces 🖼 9 3D Scatterplots
  • 10. 10 example of glyphs (a) Stick figures form textural patterns (b) Dense glyph packing for diffusion tensor data (c) Helix glyphs on maps for analyzing cyclic temporal patterns for two diseases (d) The local flow probe can simultaneously depict a multitude of different variables 3D-Parallel-CoordinateTrees
  • 11. 11
  • 12. 12 Density-equalization on a hexagon Density-equalization on a 3D cube
  • 13. Bipartite flow visualization Hierarchical heavy hitters NFlowVis: 🖼 network flow between local hosts and external IPs 14 Computed Lattice for a bipartite relationship 3D visualization for binary heavy hitters
  • 14. 15 Nflowvis for network traffic analysis
  • 15. Network Data - online social media, physical/biological interaction network - UCINet, JUNG, and GUESS, GraphViz, Pajek, Visone ● Node-link diagrams ○ Vizster 🖼 ● Adjacency matrix view ○ MatLink 🖼, NodeTrix 🖼 16
  • 17. 18 MatLink for links and highlight enhanced matrix view NodeTrix using matrix view for local communities' analysis
  • 18. Text Data WWW: unstructured data beyond the multidimensional framework described above ● bag of words ● Analysis on high dimensional space ○ ThemeRiver 🖼, The BETA system 🖼, The TextArc 🖼 ● Analysis with dimension reduction ○ Navigating Wikipedia with a topic model 🖼 ○ Starlight’s text visualization system 🖼 19
  • 19. 20 StreamGraph*: kind of used in ThmeRiver system using the word frequency vector to analyze the text BETA system using entity-based analysis to help the search query visualization
  • 21. 22 one topics from a dynamic topic model fit to Science from 1880 to 2002 WEBSOM text visualization system with SOM dimension reduction
  • 23. 24 Navigating Wikipedia: Navigating flow based on the topic model fit to Wikipedia
  • 24. 25 Starlight’s text visualization system using TRUST for dimension reduction Boeing Text Representation Using Subspace Transformation
  • 26. VISUALIZATION FOR MODEL CONSTRUCTION ● Data mining: highly automated model ● marrying the classical techniques with the idea of info vis ○ stimulated the research on Interactive model construction ○ visualization is provided for partial results of the current model ○ user can provide feedback to change the behavior of the process. ○ incorporate automatic analysis with the human wisdom ○ greatly increase interpretability of the model 27
  • 27. Progressive construction step-by-step construction ● Weka: Visual decision tree construction 🖼 ● PBC: Perception Based Classification ● HDEye system for visual clustering 🖼 28
  • 28. 30 Weka: Visual decision tree construction
  • 29. 31 the HDEye system for visual clustering
  • 30. Iterative prototyping give feedback to a visual prototype of the trained model ● Seifert et.al. ● the Ensemble Matrix interface 32
  • 31. 33 The system by Seifert et.al. The classification probability view allows users to select a falsely classified item to the right class and retrain the classifier.
  • 32. 34 Primary view in EnsembleMatrix. Confusion matrices of component classifiers are shown in thumbnails on the right. The matrix on the left shows the confusion matrix of the current ensemble classifier built by the user.
  • 33. Interactive pipeline construction ● Each submodule as black box ● visualization system for users to assemble submodules ● Weka ● RapidMiner, KNIME, STATISTICA, Orange 35
  • 34. 36 The Overview Panel in Rapidminer system shows the entire data mining pipeline. Weka Panel
  • 35. VISUALIZATION FOR MODEL EVALUATION 37
  • 36. VISUALIZATION FOR MODEL EVALUATION ● visually convey the results of a mining task ○ such as classification, clustering ● answer the question ○ what the conclusions are ○ why a model makes such conclusions ○ how good they perform ● critical for human makers to analyze model behavior 38
  • 37. Classification ● Test oriented visualization ○ ROC curves ○ Contingency Wheel++ ● Instance space-based visualization ○ SPLOM based visualization tool 🖼 ○ Projection techniques based on self-organizing maps (SOM) ● Factor analysis ○ Nomograms ○ CViz for multivariate data visualization 39
  • 38. 40 Byklov et. al’s tool showing the scatterplot matrix for an iteration
  • 39. 41 Contingency Wheel++ Alsallakh et.al’s tool, showing (a) the confusion wheel (b) the feature analysis view (c, d) histograms and scatterplots for the separability of the selected features
  • 40. 42 Rheingans et.al’s tool, showing SOM probability map with test instances
  • 41. 43 Nomograph showing the nonlinearities in the Pressure- Temperature relation. Demo tool Scary Nomograph
  • 42. Clustering ● H-BLOB clustering visualization ● OPSM biclustering method ● Hierarchical Clustering Explorer 44
  • 43. 45 Bicluster visualization, showing the overlapping, similarity and separation for clusters and entities The H-BLOB visualizes the clusters by computing a hierarchy of implicit surfaces
  • 45. Association Rules ● Double-Decker plots ● Parallel Coordinates ● Network ● TwoKey plot 47
  • 46. 48 (a) Doubledecker plot for the OvaryCancer data showing the conditional distribution of X-ray, given operation, given stage, and with survival highlighted.
  • 47. 49 (b) Parallel Coordinates (c) Network (d) TwoKey plot
  • 48. Neuron Network ● SOM ○ U-matrix ○ Distance Matrix ○ Similarity coloring ● Visualization of neurons ○ semantic conceptual features: stimuli 50
  • 49. 51 Different techniques for showing cluster structure among SOM cells.
  • 50. 52 Visualization of neurons shows evolution of a randomly chosen neurons in different layers through the training process. Each layer’s features are displayed in a different horizontal block Deep learning for Building High-level Features
  • 51. Graphical Probabilistic Models • capturing dependencies among random variables • building large-scale multivariate statistical models • nodes encodes a variable, link encodes correlation. Fluxflow 🖼 53
  • 52. 54 The FluxFlow system visualizes the hidden states inside the probabilistic model
  • 53. Conclusions ● Surveyed integrating the wisdom in two fields towards more effective and efficient data analytic ● Info Viz compliments Data Mining in all of the three key stages. 55

Hinweis der Redaktion

  1. explore how information vis can be used to complement and improve different stages of data mining through established theories for optimized visual presentation as well as practical toolsets for rapid development
  2. off-the-shelf visualization techniques both from recent data mining and visual data analytics research or classical statistical plot technique, grouped by different type of data to be visualized
  3. a large group of values, one may be interested in the overall trend and aggregation of these values instead of observing individual one More specifically, here we are interested in visualizing the distribution of scalar value
  4. In statistical graphics, the functional boxplot is an informative exploratory tool that has been proposed for visualizing functional data Again, analogous to the classical boxplot, it uses of the envelope of 50
  5. In archaeology, a glyph is a carved or inscribed symbol
  6. Hide this
  7. visualization technique for a specific type of relation between datas, the flow intensity for two bipartite dimensions, to show how visual idiom can be developed based on the nature of specific type of relations
  8. picture shows attacks from the Internet to computers located at the MIT. The background represents the university’s network structure with computer systems as rectangles. External hosts are shown as colored circles on the outside. The splines represent the connections between attackers and computers within the network. This reveals a network scan (from top) and a distributed attack (bottom) originating from hundreds of hosts working together in attempt to break into specific computer systems.
  9. Network Data arises in many important applications, such as from online social media, and physical or biological interaction network, and more generally a set of entities with interdependencies among others , underscored by the surge of social media over the recent years, such as Twitter and Facebook [25]
  10. NodeTrix: InfoVis 2004 coauthorship dataset to identity three types of different inner community structure
  11. Concordance: alphabetical list of the words (especially the important ones) present in a text, usually with citations of the passages in which they are found.
  12. The scatterplot matrix, known acronymically as SPLOM
  13. per-iteration performance information for the algorithm
  14. places these histograms in a ring chart whose sectors represent different classes. In contrast to the matrix, this layout emphasizes the classes as primary visual objects with all information related to a class grouped in one place. For each class, the samples S are divided into four sets according to their classification results:TP, FP, TN, and FN, further divided by confidence. The chords between the sectors encode the confusion wheel depicts class confusions . A chord between two classes is depicted with a varying thickness denoting the confusion rate for each corresponding pair. The feature view on the right emphasizes the features of these samples and ranks them by their separation power between selected true and false classifications where they use techniques such as Boxplots, stacked histogram, separation measures, recall-precision graph for analyzing how the data features are distributed among the affected samples. The system also allow interactively defining and evaluating post-classification rules.
  15. representation of the model where the data space has been projected to two dimensions using a SOM. The decision boundary, i.e. the boundary between positive and negative predictions projected on the 2D plane is shown by a white line. Glyph size indicates the number of instances at a given point. A continuous color map is used to show the proportion of class labels in the set of collocated instances. Yellow shows predominantly positive instances, red shows predominantly negative instances. Test instances is encoded as a small spheres shaped glyph, with size indicating the density of instances and a continuous color indicating the proportion of class labels in the set of collocated, yellow positive for and red for negative. Notice that in the plot many of the misclassified instances are located in the vicinity of decision boundary, showing they are not too far from ”right
  16. The process of generating conclusions for classifer usually involve different factors from the feature of data instance. Visualizing the relationship between the different factors and the final conclusions for different data instance will shed light on the question of why a model makes such conclusions
  17. Double-Decker plots [60] provide a visualization for single association rules but also for all its the related rules. They were introduced to visualize each element of a multivariate contingency table as a tile in the plot and they have been adapted to visualize all the attributes involved in a rule by drawing a bar chart for the consequence item and using linking highlighting for the antecedent items
  18. Parallel coordinates have also been used for AR visualize to deal with the high dimensionality of the rules [15, 72, 121]. For example, the approach proposed by [121] starts from arranging items by groups on a number of parallel axes. A rule is represented as a polyline joining the items in the antecedent followed by an arrow connecting another polyline for the items in the consequenc network representation of AR is provided where item and rules are encoded as nodes and each rule node is linked to corresponding items in antecedent and consequence by edges. . The support values for the antecedent and consequence of each association rule are indicated by the sizes and colors of each circle. The thickness of each line indicates the confidence value while the sizes and colours of the circles in the center, above the Implies label, indicate the support of each rule. The TwoKey plot is a 2D plot approach for AR visualization that represents the rules according to their confidence and support values. In such a plot, each rule is a point in a 2-D space where the xaxis and the y-axis ranges respectively from the minimum to the maximum values of the supports and of the confidences and different colors are used to highlight the order of the rules
  19. Figure (a) shows the U-matrix visualziation, where white dots represents the neurons and hexagon represents the values of U-matrix. The gray scale encodes the distance to the neighboring unit. Darker color means bigger distance. Clusters in the plot can be observed as light areas enclosed by darker borders. b) use size of map unit to encode the average distance to its neighbors. (c) takes yet another approach by encoding the closeness in input space by similarity in the hue of colors for each hexagon area
  20. Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a dataset.