A Methodic Approach to
Good Data Visualization
Luca Candela - @luckymethod
Luca Candela
DataPad Inc. // UX Eye // @luckymethod
Men of great rank, or active business, can only
pay attention to particulars of use […] it is hoped
that with the assistance of these Charts,
information will be got, without the fatigue and
trouble of studying the particulars [...]
William Playfair - Commercial and Political Atlas, 1786
Data visualization is the art of
*reducing information in a data set while
preserving the knowledge contained in it.
*we can talk about what “reducing information” means in this case...
Conceptual data analysis workflow
(diagram) Data Preparation, Data Visualization, Discovery of knowledge
Hadley Wickham popularized a concept called
split-apply-combine
as a way of thinking about data querying.
http://www.jstatsoft.org/v40/i01/paper
For the four countries that generate the most revenue,
what are the three categories that generate the most
revenue in each?
Country        Venue Type  Sum Revenue
United States  Fast Food   $16
               Street      $10
               Restaurant  $9
France         Cafe        $18
               Pub         $12
               Restaurant  $2
Canada         Cafe        $10
               Fast Food   $4
               Street      $3
Japan          Street      $5
               Fast Food   $4
               Pub         $1
The basics of split-apply-combine
split: data, by country (Canada, United States, Germany, France, Japan)
apply: Sum Revenue
Canada         Sum Revenue = $ 36
United States  Sum Revenue = $ 83
Germany        Sum Revenue = $ 8
France         Sum Revenue = $ 42
Japan          Sum Revenue = $ 18
combine: sort descending by Sum Revenue, limit 4
Country        Sum Revenue
United States  $ 83
France         $ 42
Canada         $ 36
Japan          $ 18
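The three steps on this slide can be sketched in a few lines of plain Python. This is only an illustration: the raw rows are invented so that the per-country totals match the slide ($83, $42, $36, $18 and $8), and the helper names `split`, `apply_sum` and `combine` are hypothetical, not taken from the talk or from any library.

```python
from collections import defaultdict

# Hypothetical raw rows (country, venue type, revenue). The individual rows are
# invented; only the per-country totals are taken from the slide.
rows = [
    ("United States", "fastfood", 16), ("United States", "street", 10),
    ("United States", "restaurant", 9), ("United States", "other", 48),
    ("France", "cafe", 18), ("France", "pub", 12), ("France", "other", 12),
    ("Canada", "cafe", 10), ("Canada", "other", 26),
    ("Japan", "street", 5), ("Japan", "other", 13),
    ("Germany", "pub", 8),
]

def split(rows, key):
    """Split: group rows by key(row)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return groups

def apply_sum(groups, value):
    """Apply: reduce every group to the sum of value(row)."""
    return {name: sum(value(r) for r in members) for name, members in groups.items()}

def combine(totals, limit):
    """Combine: sort descending by total and keep the first `limit` groups."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:limit]

by_country = split(rows, key=lambda r: r[0])
sum_revenue = apply_sum(by_country, value=lambda r: r[2])
top4 = combine(sum_revenue, limit=4)

for country, total in top4:
    print(f"{country:<14} $ {total}")
```

Germany is still summed in the apply step; it only drops out in the combine step, because of the limit.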
The basics of split-apply-combine
split: data, by country (Canada, United States, Germany, France, Japan)
split again: each country by venue type (excerpts shown in the diagram)
bus stop, fastfood, park, ...
restaurant, hair salon, pub, ...
restaurant, street, cafe, ...
park, pub, street
Country        Venue type  Sum Revenue
United States  fastfood    $ 16
               street      $ 10
               restaurant  $ 9
France         cafe        $ 18
               pub         $ 12
               restaurant  $ 2
Canada         cafe        $ 10
               fastfood    $ 4
               park        $ 3
Japan          street      $ 5
               fastfood    $ 4
               pub         $ 1
...
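The nested version answers the earlier question (top three venue types within the top four countries) by running split-apply-combine twice, once per level. A minimal sketch, assuming invented rows chosen so that the per-venue sums reproduce the table above; the country totals here are therefore smaller than the $83 / $42 / $36 / $18 shown earlier, and the helpers `sum_by` and `top` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical rows (country, venue type, revenue), invented for illustration.
rows = [
    ("United States", "fastfood", 16), ("United States", "street", 10),
    ("United States", "restaurant", 9), ("United States", "park", 1),
    ("France", "cafe", 18), ("France", "pub", 12),
    ("France", "restaurant", 2), ("France", "street", 1),
    ("Canada", "cafe", 10), ("Canada", "fastfood", 4), ("Canada", "park", 3),
    ("Japan", "street", 5), ("Japan", "fastfood", 4), ("Japan", "pub", 1),
    ("Germany", "pub", 2), ("Germany", "cafe", 1),
]

def sum_by(rows, key):
    """Split rows by key(row) and apply a sum of revenue to each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[2]
    return totals

def top(totals, limit):
    """Combine: sort descending by total and keep the first `limit` entries."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:limit]

# Outer pass: the four countries with the highest total revenue.
top_countries = [c for c, _ in top(sum_by(rows, key=lambda r: r[0]), limit=4)]

# Inner pass: within each of those countries, the three best venue types.
result = {}
for country in top_countries:
    country_rows = [r for r in rows if r[0] == country]
    result[country] = top(sum_by(country_rows, key=lambda r: r[1]), limit=3)

for country, venues in result.items():
    for venue, total in venues:
        print(f"{country:<14} {venue:<11} $ {total}")
```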
Split-apply-combine thinking translates to visualizations
(figure: horizontal bar chart of Sum Revenue by Country: United States, France, Canada, Japan)
split: by country
apply: sum revenue, call it Sum Revenue; plot rectangles and map length to the horizontal axis using a linear scale; color with #45808E
combine: sort desc. on Sum Revenue; map to the vertical axis using an ordinal scale; add labels
Use `Country` as label
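To make the two scales concrete, here is a small sketch that turns the four totals into bar geometry. The totals and the color come from the slides; the pixel sizes and the helper names `linear_scale` and `ordinal_scale` are assumptions made for this example, and the output is a list of plain dictionaries rather than a real drawing.

```python
# Totals taken from the slide's result table.
totals = [("United States", 83), ("France", 42), ("Canada", 36), ("Japan", 18)]

CHART_WIDTH = 400   # assumed pixel width available for the longest bar
BAR_HEIGHT = 20     # assumed height of one bar
BAR_GAP = 10        # assumed vertical gap between bars
COLOR = "#45808E"   # the color named on the slide

def linear_scale(domain_max, range_max):
    """Map a number in [0, domain_max] proportionally onto [0, range_max]."""
    return lambda value: value / domain_max * range_max

def ordinal_scale(categories, step):
    """Map each category to a slot position, in the order given."""
    positions = {name: index * step for index, name in enumerate(categories)}
    return lambda name: positions[name]

# combine: sort descending on Sum Revenue
ordered = sorted(totals, key=lambda kv: kv[1], reverse=True)

x = linear_scale(max(value for _, value in ordered), CHART_WIDTH)
y = ordinal_scale([name for name, _ in ordered], BAR_HEIGHT + BAR_GAP)

# One rectangle per country: length on the horizontal axis, slot on the vertical axis.
bars = [
    {"label": name, "x": 0, "y": y(name), "width": x(value),
     "height": BAR_HEIGHT, "fill": COLOR}
    for name, value in ordered
]

for bar in bars:
    print(bar)
```

The linear scale preserves ratios between revenues, while the ordinal scale only preserves order, which is why sorting belongs to the combine step.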
Nested split-apply-combine underpins more complex visualizations
1. split on state
apply: sum population
combine: sort desc. by population; limit 6
2. split on age (bin by 5 years)
apply: sum population
combine: sort by age
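The two passes on this slide could look like the following sketch. The records are invented (three states, a handful of ages), and the names `age_bin` and `by_state_and_age` are hypothetical; only the steps themselves (sum population per state, keep at most six states, then bin ages by five years inside each state) come from the slide.

```python
from collections import defaultdict

# Hypothetical records (state, age, population); values are invented.
records = [
    ("CA", 3, 500), ("CA", 7, 450), ("CA", 12, 400), ("CA", 4, 100),
    ("TX", 2, 300), ("TX", 8, 350), ("TX", 9, 50),
    ("NY", 1, 200), ("NY", 13, 250),
]

BIN_WIDTH = 5     # "bin by 5 years"
STATE_LIMIT = 6   # "limit 6"

def age_bin(age):
    """Return the lower edge of the 5-year bin that contains `age`."""
    return age // BIN_WIDTH * BIN_WIDTH

# 1. split on state, apply sum population, combine: sort desc., limit 6
state_totals = defaultdict(int)
for state, _, population in records:
    state_totals[state] += population
top_states = sorted(state_totals, key=state_totals.get, reverse=True)[:STATE_LIMIT]

# 2. inside each state: split on age bin, apply sum population, combine: sort by age
by_state_and_age = {}
for state in top_states:
    bins = defaultdict(int)
    for record_state, age, population in records:
        if record_state == state:
            bins[age_bin(age)] += population
    by_state_and_age[state] = sorted(bins.items())

for state, bins in by_state_and_age.items():
    print(state, bins)
```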
Data visualization can be thought of as a
visual mapping function applied
during the *Apply and Combine steps.
*although it can also be thought of as applied exclusively during the Combine step…
Reduce information, preserve knowledge...
Name   Operation  Lines
Vadim  Added      100
Luca   Removed    34
Vadim  Added      65
Vadim  Removed    5
Luca   Added      24
Vadim  Removed    71
Luca   Removed    45
Vadim  Added      7
...    ...        ...
“plot”
(figure: bar chart of Additions and Deletions for Luca and Vadim; axis ticks -2k, -1k, 0, 1k, 2k; bar values -960, 1531, -321, 739)
Question: Mapping of what, to what?
Types of data
ID       Timestamp            Location       Name   Operation  Lines  Pass Test?
0000001  11-05-2013 10.45 am  San Francisco  Vadim  Added      100    Yes
0000002  11-05-2013 11.12 am  San Bruno      Luca   Removed    34     Yes
0000003  11-05-2013 11.30 am  San Francisco  Vadim  Added      65     Yes
0000004  11-05-2013 11.34 am  San Francisco  Vadim  Removed    5      Yes
0000005  11-05-2013 11.43 am  San Bruno      Luca   Added      24     No
0000006  11-05-2013 11.45 am  San Francisco  Vadim  Removed    71     Yes
0000007  11-05-2013 12.51 pm  San Francisco  Luca   Removed    45     Yes
0000008  11-05-2013 12.55 pm  San Francisco  Vadim  Added      7      No
...      ...                  ...            ...    ...        ...    ...
Type labels attached to the columns: Categorical, # Discrete, # Continuous, Boolean
There are other ways to classify data,
but this one will get you very far.
pick up a good statistics book and just start reading...
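As a rough illustration of this four-way classification, the sketch below guesses a type for each column from the Python type of its values. This is a heuristic written for this example, not the speaker's method: which label the slide attaches to which column is not recoverable from the transcript, storing the ID as a string (and so calling it Categorical) is an assumption, and so is reading the timestamps as month-day-year.

```python
from datetime import datetime

# Two rows from the slide's table, typed by hand. Reading "11-05-2013" as
# month-day-year is an assumption; the slide does not say.
rows = [
    {"ID": "0000001", "Timestamp": datetime(2013, 11, 5, 10, 45),
     "Location": "San Francisco", "Name": "Vadim", "Operation": "Added",
     "Lines": 100, "Pass Test?": True},
    {"ID": "0000005", "Timestamp": datetime(2013, 11, 5, 11, 43),
     "Location": "San Bruno", "Name": "Luca", "Operation": "Added",
     "Lines": 24, "Pass Test?": False},
]

def classify(values):
    """Guess a data type from the Python types of a column's values (heuristic)."""
    # bool must be tested before int, because bool is a subclass of int.
    if all(isinstance(v, bool) for v in values):
        return "Boolean"
    if all(isinstance(v, int) for v in values):
        return "# Discrete"
    if all(isinstance(v, (float, datetime)) for v in values):
        return "# Continuous"
    return "Categorical"

column_types = {
    column: classify([row[column] for row in rows]) for column in rows[0]
}

for column, kind in column_types.items():
    print(f"{column:<11} {kind}")
```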
Types of variables
1. Independent
a. A variable that isn't changed by the other
variables you are trying to measure. It
usually goes on the x axis.
2. Dependent
a. A variable that changes depending on
other variable(s). It usually goes on the y
axis.
(figure: the Additions and Deletions bar chart for Luca and Vadim, axis ticks -2k to 2k, bar values -960, 1531, -321, 739, annotated with "Dependent Variable" and "Independent Variable")
Variables of a visualization
1. Position (x,y)
2. Size (big, small…)
3. Value (bright, dark…)
4. Texture (hatched, dotted…)
5. Color (blue, red…)
6. Orientation (degree)
7. Shape (triangle, circle…)
Optimal mappings by type
(figure: four x-y panels, one each for # Discrete, # Continuous, Categorical and Boolean)
(figure: bar chart of Added and Removed lines for Luca and Vadim; axis ticks -2k, -1k, 0, 1k, 2k; bar values -960, 1531, -321, 739)
Name   Operation  Lines
Vadim  Added      100
Luca   Removed    34
Vadim  Added      65
Vadim  Removed    5
Luca   Added      24
Vadim  Removed    71
Luca   Removed    45
Vadim  Added      7
...    ...        ...
Split on Name
Split on Operation
Apply Sum(Added)
Apply Sum(Removed)
Combine: -Removed, map to Red, value to size
Combine: Added, map to Green, value to size
Combine: Name, map to x axis
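These steps can be read as a small program: split and apply produce one number per (Name, Operation) pair, and combine attaches visual variables to it. A minimal sketch using only the eight rows visible on the slide; because the slide's table continues past them, the sums below are not the values plotted in the deck. The `marks` structure and its keys are hypothetical.

```python
from collections import defaultdict

# The eight rows visible on the slide; the table on the slide continues ("..."),
# so these totals differ from the ones plotted in the deck.
rows = [
    ("Vadim", "Added", 100), ("Luca", "Removed", 34), ("Vadim", "Added", 65),
    ("Vadim", "Removed", 5), ("Luca", "Added", 24), ("Vadim", "Removed", 71),
    ("Luca", "Removed", 45), ("Vadim", "Added", 7),
]

# split on Name and on Operation, apply Sum(Lines) to each group
totals = defaultdict(int)
for name, operation, lines in rows:
    totals[(name, operation)] += lines

# combine: Name goes to the x axis, the sum goes to size (negated for Removed),
# and Operation picks the color.
COLORS = {"Added": "green", "Removed": "red"}
marks = []
for (name, operation), lines in sorted(totals.items()):
    signed = lines if operation == "Added" else -lines
    marks.append({"x": name, "size": signed, "color": COLORS[operation]})

for mark in marks:
    print(mark)
```

Three mappings (position, size, color) are enough to show who added and who removed how much.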
Apply the minimum number of mappings
that illustrates the underlying question
you are trying to answer.
Choosing the right viz...
Golden rules - Part 1
1. Label your axes
2. Include measurement units
3. Explain your encodings (add a legend)
4. Remove redundant information
5. Don’t distort the axis, especially with time series
Golden rules - Part 2
1. If you are trying to visualize rate of change, then plot the rate of change itself
2. Remove outliers, but know they are there
3. Tools have their own biases and quirks, know them.
4. Bar charts and histograms are the solution to 80% of
your problems
5. Data Tables are visualizations too
...there are thousands of good rules, but the best one is still “keep it simple”
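Since the rules lean on histograms and on not distorting the axis, here is a small text histogram that keeps empty buckets in place instead of skipping them. It is a sketch under assumptions: the measurements are invented, the bucket range is chosen so that one bucket (4.8 to 4.9) stays empty, and `Decimal` is used only to avoid floating-point trouble at bucket edges.

```python
from decimal import Decimal

# Hypothetical measurements, invented for illustration; none falls in [4.8, 4.9).
values = [Decimal(s) for s in
          ("4.52", "4.55", "4.61", "4.63", "4.68", "4.71", "4.75", "4.92", "4.95")]

LOW, HIGH, WIDTH = Decimal("4.5"), Decimal("5.0"), Decimal("0.1")

def histogram(values, low, high, width):
    """Count values per bucket [edge, edge + width), keeping empty buckets."""
    bucket_count = int((high - low) / width)
    counts = [0] * bucket_count
    for v in values:
        if low <= v < high:
            counts[int((v - low) // width)] += 1
    return [(low + i * width, n) for i, n in enumerate(counts)]

buckets = histogram(values, LOW, HIGH, WIDTH)

# Every bucket gets a row, so the axis stays evenly spaced even where nothing was observed.
for edge, n in buckets:
    print(f"{edge} - {edge + WIDTH}  {'#' * n}")
```

Dropping the empty row would silently compress the axis, which is one of the distortions rule 5 of Part 1 warns about.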
Some examples
this is going to be fun...
Example 1
(figure, two panels: "Simple bar chart" and "Linear scale", each annotated "Missing bucket (4.8 - 4.9)")
Example 2
Example 2 - better
No - Human
Yes - Robot
Example 3
Example 4
Example 5
OK, this is comically bad, I was just going for a good collective giggle...
Books you should read
everybody knows about Tufte, so please don’t bring it up
The Semiology of Graphics, 1967
Jacques Bertin
The Elements of Graphing Data, 1985
&
Visualizing Data, 1993
William S. Cleveland
www.datapad.io
Thank you!
for questions, tweet me at @luckymethod

Visualize data using the split-apply-combine approach
