A Tinkerer’s Toolbox:
Data Driven Journalism




Tony Hirst
Dept of Communication and Systems
The Open University
Visiting Senior Research Fellow, University of Lincoln
@psychemedia

blog.ouseful.info

      #???
Where I situate myself…
Visualising data helps me make
sense of the world around me
Do you know what’s possible?
Lincoln ddj
#ddj
Google
Spreadsheets
Data Distributions




Outliers
Trends and (anti)correlations...
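
The three things a quick chart can surface (distributions, outliers, and trends or anti-correlations) can also be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the figures are made up for illustration, and the 1.5 × IQR outlier rule is one common convention rather than anything the slides prescribe.

```python
from statistics import mean, quantiles

def iqr_outliers(values):
    """Return values lying more than 1.5 IQRs outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]

def pearson(xs, ys):
    """Pearson correlation: near +1 for a shared trend, near -1 for anti-correlation."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up figures, for illustration only
spend = [10, 12, 11, 13, 12, 11, 95]
year = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(iqr_outliers(spend))     # the value far from the rest of the distribution
print(pearson(year, rising))   # trend
print(pearson(year, falling))  # anti-correlation
```

A spreadsheet chart would show the same things at a glance; the code is only useful once the check needs repeating over many columns.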
Explanatory visualization
Data visualizations that are used to
transmit information or a point of
view from the designer to the
reader. Explanatory visualizations
typically have a specific “story” or
information that they are intended
to transmit.

Exploratory visualization
Data visualizations that are used by
the designer for self-informative
purposes to discover patterns,
trends, or sub-problems in a
dataset. Exploratory visualizations
typically don’t have an already-
known story.
Exploiting
Structure
Hierarchical data and treemaps - medals




Pivot tables
Templated data views
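
As a rough illustration of what a pivot table does with structured records, here is a small sketch in plain Python. The medal records, field names, and the `pivot` helper are all hypothetical; a spreadsheet pivot table does the same grouping interactively.

```python
from collections import defaultdict

def pivot(rows, row_key, col_key, value_key):
    """Sum value_key into a nested dict: table[row][column] -> total."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_key]][r[col_key]] += r[value_key]
    return {k: dict(v) for k, v in table.items()}

# Hypothetical medal records: one row per (country, sport, medal) tally
medals = [
    {"country": "A", "sport": "rowing", "medal": "gold", "count": 2},
    {"country": "A", "sport": "cycling", "medal": "gold", "count": 1},
    {"country": "A", "sport": "cycling", "medal": "silver", "count": 1},
    {"country": "B", "sport": "rowing", "medal": "gold", "count": 1},
]

by_country = pivot(medals, "country", "medal", "count")
by_sport = pivot(medals, "sport", "medal", "count")
print(by_country)
print(by_sport)
```

The same records pivoted by country and then by sport give two different views without touching the underlying data, which is the point of exploiting the structure.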
Macroscopes
Look for
Differences
Data Can Tell a
    Story
http://www.musik-therapie.at/PederHill/Structure&Plot.htm
Visual Data
Summaries
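library(ggplot2)
# d1 (car, ymin, ymax) and d2 (car, value, variable) are data frames prepared beforehand
ggplot() +
  # one vertical line per car, drawn from ymin to ymax
  geom_linerange(data = d1, aes(x = car, ymin = ymin, ymax = ymax)) +
  # one point per row of d2, with the point shape keyed by variable
  geom_point(data = d2, aes(x = car, y = value, shape = variable), size = 2) +
  labs(title = "F1 2011 Korea\nRace Summary Chart", x = NULL, y = "Position", shape = "") +
  # opts() and theme_text() are no longer in ggplot2; theme() and element_text() replace them
  theme(axis.text.x = element_text(angle = -90, hjust = 0))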
Data
Clean(s)ing
Google Refine
(Inner) Joins &
 Reconciliation
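
To make the idea of an inner join with a light reconciliation step concrete, here is a minimal sketch in plain Python. It is not how Google Refine does it: the `normalise` rule, the table contents, and the column names are assumptions chosen for illustration only.

```python
def normalise(name):
    """Crude reconciliation key: lower-case, drop punctuation, collapse spaces."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def inner_join(left, right, key):
    """Keep only rows whose normalised key appears in both tables."""
    index = {normalise(r[key]): r for r in right}
    joined = []
    for row in left:
        match = index.get(normalise(row[key]))
        if match is not None:
            # Values from the left table win where column names clash
            joined.append({**match, **row})
    return joined

# Hypothetical tables whose key column is spelled slightly differently
budgets = [
    {"council": "Isle of Wight", "budget": 100},
    {"council": "Lincoln", "budget": 50},
]
contacts = [
    {"council": "isle of  wight.", "contact": "clerk-1"},
    {"council": "York", "contact": "clerk-2"},
]

print(inner_join(budgets, contacts, "council"))
```

Only the row present in both tables survives, which is what makes an inner join useful for spotting names that fail to reconcile.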
Google Fusion
Tables
Google Refine
OpenHeatMap
“Data Flow”
“Analog Synth Meeting”, Todd Huffman
[Diagram: Wikipedia → HTML → Google Spreadsheet (=importHTML) → CSV → Yahoo! Pipe (Import CSV) → KML → Google Map → <embed> → Embedded object]
Find the data…

Get the data as data…

Transform the data…

Enrich the data and transform again…

Display the data…

Publish the displayed data…

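
The flow from an HTML table, to CSV, to enriched data, to KML can be sketched end to end in a few lines. This is not the Google Spreadsheet and Yahoo! Pipes route from the slides: it swaps in the Python standard library for each step, and the page fragment, place names, column names, and coordinate lookup are all hypothetical.

```python
import csv
import io
from html.parser import HTMLParser
from xml.etree import ElementTree as ET

class TableParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, row by row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

def html_to_csv(html):
    """Get the data as data: an HTML table becomes CSV text."""
    parser = TableParser()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()

def csv_to_kml(csv_text, coords):
    """Enrich each row with coordinates, then transform it into a KML placemark."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    doc = ET.SubElement(kml, "Document")
    for row in csv.DictReader(io.StringIO(csv_text)):
        lat, lon = coords[row["Place"]]
        mark = ET.SubElement(doc, "Placemark")
        ET.SubElement(mark, "name").text = row["Place"]
        ET.SubElement(mark, "description").text = row["Value"]
        point = ET.SubElement(mark, "Point")
        # KML orders coordinates as longitude,latitude
        ET.SubElement(point, "coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")

# Hypothetical page fragment and coordinate lookup (the enrichment step)
PAGE = ("<table><tr><th>Place</th><th>Value</th></tr>"
        "<tr><td>Alpha</td><td>10</td></tr>"
        "<tr><td>Beta</td><td>20</td></tr></table>")
COORDS = {"Alpha": (53.0, -0.5), "Beta": (52.0, -1.0)}

csv_text = html_to_csv(PAGE)
kml_text = csv_to_kml(csv_text, COORDS)
print(csv_text)
print(kml_text)
```

The resulting KML text is the kind of file a mapping tool can display; publishing it is then a matter of hosting the file and embedding the map.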
The online CSV file
      becomes a spreadsheet
          becomes A DATABASE
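
One way to see the step from CSV to database is to load CSV text into a queryable store. The sketch below uses Python's built-in SQLite module rather than a Google Spreadsheet, so it illustrates the idea only; the sample data, table name, and column names are invented.

```python
import csv
import io
import sqlite3

def csv_to_db(csv_text, table="data"):
    """Load CSV text into an in-memory SQLite table; values are stored as text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    db = sqlite3.connect(":memory:")
    db.execute(f'CREATE TABLE "{table}" ({cols})')
    db.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
    return db

# Invented spending data standing in for a CSV file published on the web
SAMPLE = "dept,amount\nroads,120\nparks,30\nroads,80\n"

db = csv_to_db(SAMPLE)
totals = db.execute(
    "SELECT dept, SUM(CAST(amount AS INTEGER)) FROM data "
    "GROUP BY dept ORDER BY dept"
).fetchall()
print(totals)
```

Once the rows are in a table, questions such as "total per department" become one-line queries instead of manual spreadsheet work.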
Finding data…
site:.gov.uk
filetype:xls
underspend
inurl:http://phx.corporate-ir.net/phoenix.zhtml?
intitle:press
site:phx.corporate-ir.net
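
Search limits like the ones above are just text, so a repeatable search can be composed in code. This is a small sketch; the helper names are hypothetical, and the base search URL is an assumption that may need changing for a different search engine.

```python
from urllib.parse import urlencode

def build_query(terms=(), **operators):
    """Join operator:value pairs and free-text terms into one search string."""
    parts = [f"{op}:{value}" for op, value in operators.items()]
    parts.extend(terms)
    return " ".join(parts)

def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query as the q parameter of a search URL (base is an assumption)."""
    return f"{base}?{urlencode({'q': query})}"

query = build_query(["underspend"], site=".gov.uk", filetype="xls")
print(query)
print(search_url(query))
```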
Tapping the
Data Burden
[Diagram: a data tap on the flow between a reporting body and a receiving body]




Data Burdens and FOI
Opening Data
 Up via FOI
“Public Data” &
 Social Media
   Mapping
Emergent views
 of structural
  properties
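
One structural view of this kind is a projection from a set of followers onto the accounts they commonly follow. The following is a minimal sketch on a made-up follow graph; the account names, the `commonly_followed` helper, and the `min_count` threshold are assumptions for illustration, and no real social network API is called.

```python
from collections import Counter

def commonly_followed(followers, friends_of, min_count=2):
    """Count how many of the given followers follow each account; keep the common ones."""
    counts = Counter()
    for follower in followers:
        # set() so that a duplicated entry is only counted once per follower
        counts.update(set(friends_of.get(follower, ())))
    return {acct: n for acct, n in counts.items() if n >= min_count}

# Hypothetical follow graph: account -> accounts it follows
FRIENDS_OF = {
    "ann": ["target", "news_a", "news_b"],
    "bob": ["target", "news_a"],
    "cat": ["target", "news_b", "blog_c"],
}

common = commonly_followed(["ann", "bob", "cat"], FRIENDS_OF)
print(common)
```

Accounts that many followers share hint at where the followed account sits socially, which is the sense in which the structure is emergent rather than declared.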
My “journalism” is tracking down
tools and working out recipes that
     help datasets tell stories
http://delicious.com/stacks/view/CROBXt
Build lazy…
Electrical Safety 101

We get a lot of stuff from
Asia, so it all comes with
funny plugs; travelling just
adds to the fun.

Left to right, top to bottom, we
have:

Singapore wall socket: UK
Adapter: UK -> NZ/AU
Double adapter: NZ/AU
My cell charger: NZ/AU
Adapter: NZ/AU -> everything
Andreas cell charger: Euro
Camera charger: US




                    tolomea
“Hands Passing Baton at Sporting Event”, tableatny
@psychemedia

blog.ouseful.info


Editor's notes

  1. Do we have a hashtag for the workshop?
  2. Collaborative commentary
  3. Through the provision of an API on top of the aggregated local council data, OpenlyLocal can also be treated as a database in its own right. In the example shown here, committee membership is displayed via a treemap showing party affiliations of committee members. (Hovering over a particular grouping displays a list of names of council members on that committee from that party political grouping.) Whilst it would be a major task to take data from every council website in a variety of formats in order to generate similar views for other councils, the work done by OpenlyLocal in aggregating this data and then republishing it via a single API in a single format means that the treemap view can be applied to each council whose data is stored in OpenlyLocal. In passing, it is also worth mentioning how the use of visualisations can be helpful in cleaning data or identifying possible errors in it. In the above example, we see that party affiliations for councillors on the Isle of Wight Council are declared as both Liberal Democrat and Liberal Democrat Group.
  4. The top, blue strip shows the gear (1 to 7); the green strip shows the throttle pedal depression (0-100%), and the red strip shows the brake (0-100%). The light blue strip is a composite of the previous three strips. The whiter the pixel, the closer it is to 100% throttle in 7th gear with no braking. The bottom two traces show the longitudinal and lateral g-force respectively. For the longitudinal trace, red shows braking (being forced into the steering wheel); green shows acceleration (being forced back into your seat). You’ll see the greatest g-force under braking occurs when the brakes are slapped full on… (the red bits in the third and fifth traces line up). For the lateral g-force, red shows the driver being flung to the left (i.e. a right-hand corner); green shows them being pushed out to the right.
  5. Analog synth: pretty much ultimate freedom to link audio processing effects modules together, simplified by having a common plug.
  6. Some scene setting about what I mean by “flow”…
  7. Suppose we have a table of numerical data associated with placenames on something like Wikipedia. How do we knock up a quick map view of the data?
  8. UK city population search on Wikipedia
  9. This can all be a bit flaky, a bit like balancing stones… but it can also be surprisingly stable (for a time at least!)
  10. Here we see the result of pulling data into a Google Spreadsheet from a CSV file published at a particular web address. We now have the ability to run the full range of spreadsheet tools over the data – data which is being pulled in from the datastore, remember. (A similar functionality presumably exists in Microsoft Excel?)
  11. Emergent Social Positioning, origins: 1.5 degree egonet (how followers follow each other, how hashtaggers follow each other); projection maps from followers to folk they commonly follow; projection maps from hashtaggers to folk they commonly follow; projection maps from friends to folk who commonly follow them.
  12. Lots of the time, things don’t quite fit: the import format for one tool does not match up with the export formats of another… so sometimes we need an adapter. (Cf. also the notion of impedance mismatch.)
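
The data-cleaning point in note 3 (one party appearing under two labels) can be sketched as a small normalisation step before counting members for a treemap. The suffix rule, the member list, and the function names here are hypothetical; real party labels would need a reviewed mapping rather than a single rule.

```python
from collections import Counter

def canonical_party(label):
    """Collapse variants such as 'X Group' onto 'X' (a hypothetical cleaning rule)."""
    label = " ".join(label.split())
    suffix = " Group"
    return label[: -len(suffix)] if label.endswith(suffix) else label

def party_sizes(members):
    """Count committee members per cleaned party label, e.g. to size treemap cells."""
    return Counter(canonical_party(party) for _, party in members)

# Hypothetical committee membership: (member, declared party label)
MEMBERS = [
    ("Cllr A", "Liberal Democrat"),
    ("Cllr B", "Liberal Democrat Group"),
    ("Cllr C", "Independent"),
]

sizes = party_sizes(MEMBERS)
print(sizes)
```

Without the cleaning step the treemap would draw two separate cells for what is presumably one grouping, which is exactly the kind of error the visualisation makes visible.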