Data Lingo (v. ITA 2021)
1. Frieda Brioschi - frieda.brioschi@gmail.com
Emma Tracanella - emma.tracanella@gmail.com
DATA LINGO
LESSON 4 - 2020/21
2. LESSON 4
THE COURSE
1. Introduction. What are data and information, why they matter
2. How to collect and organize data
3. Information classification
4. Data Lingo
5. Data Science
6. Computer & Humans: how we perceive information
7. Visual communication of numerical data
8. Visual communication of non numerical data
9. Content type and effectiveness
10. Storytelling with data
11. Tools for analysis and data visualization
12. Artificial Intelligence demythologized
8. LESSON 4
A MATTER OF NO NEUTRALITY
No information organization is neutral.
Organizing an information space in a certain way means giving shape to that
space, giving it an identity and a meaning.
It means modeling the experience of those who live in or pass through that
space. In this sense, every information architecture creates a vision of the world,
influences our perception of reality, and inevitably shapes our experience.
▸ Luca Rosati, DigitalUpdate
9. LESSON 4
A MATTER OF NO NEUTRALITY
How we organize information is just as important as the information itself: bad
organization does not only affect the findability of the contents, but also their
comprehensibility.
Organizing means first of all establishing relations between the elements of a
system, relations of similarity or difference, proximity or distance.
▸ Luca Rosati, DigitalUpdate
10. LESSON 4
ISSUE
▸ Different people interpret the subject of a book differently
▸ The Library of Congress classifies books in two ways:
▸ call numbers
▸ subject headings
11. LESSON 4
CALL NUMBER
▸ For each book they define a unique string of letters and numbers: the “call
number”
▸ It is a sort of address that helps you locate the book in the library
12. LESSON 4
AN EXAMPLE
▸ The call number for A Short History of the Spanish Civil War
▸ D = world history & history of Europe, Asia, Africa, Australia, …
▸ P = about Spain-Portugal
▸ 269 to 269.9 = about the Spanish Civil War
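The layers of meaning in this call number can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup tables hold just the entries from the slide, the sample string `DP269` is a hypothetical class prefix (a complete call number carries further parts), and `describe_call_number` is a made-up helper name.

```python
import re

# Only the entries mentioned on the slide; the real schedules are far larger.
CLASSES = {"D": "World history and history of Europe, Asia, Africa, Australia, ..."}
SUBCLASSES = {"DP": "Spain-Portugal"}
# (low, high, topic) number ranges inside subclass DP
DP_RANGES = [(269.0, 269.9, "Spanish Civil War")]


def describe_call_number(call_number: str) -> list[str]:
    """Return the topics encoded in the class part of a call number,
    from the most general to the most specific."""
    match = re.match(r"^([A-Z]{1,3})\s*(\d+(?:\.\d+)?)", call_number)
    if match is None:
        raise ValueError(f"not a recognizable call number: {call_number!r}")
    letters, number = match.group(1), float(match.group(2))

    layers = []
    if letters[0] in CLASSES:        # first letter: main class
        layers.append(CLASSES[letters[0]])
    if letters in SUBCLASSES:        # letter pair: subclass
        layers.append(SUBCLASSES[letters])
    if letters == "DP":              # number: topic within the subclass
        for low, high, topic in DP_RANGES:
            if low <= number <= high:
                layers.append(topic)
    return layers


print(describe_call_number("DP269"))
```

Because every layer narrows the previous one, books whose call numbers share a prefix end up next to each other on the shelf.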
13. LESSON 4
CALL NUMBER
▸ Call numbers are assigned based on what the book is about.
▸ Books are placed in the library according to their call numbers
▸ Books with similar topics are located near each other
15. LESSON 4
SUBJECT HEADING
▸ There is a set of terms that librarians can pick from to describe a book.
▸ The official term can be different from the term commonly used
▸ The official term for “death penalty” is “capital punishment”
▸ Keyword search
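A controlled vocabulary of this kind can be pictured as a lookup from everyday terms to official headings. The sketch below is an assumption-laden toy: the dictionary holds only the pair from the slide, and `to_subject_heading` is a hypothetical helper, not a real library catalogue API.

```python
# Toy "authority file": common term -> official subject heading.
# The single entry is the example from the slide.
OFFICIAL_HEADING = {
    "death penalty": "capital punishment",
}


def to_subject_heading(term: str) -> str:
    """Map a user's search term to the official heading, if one is known.

    Unknown terms are returned normalized but unchanged, which is roughly
    what falling back to a plain keyword search would do.
    """
    key = term.strip().lower()
    return OFFICIAL_HEADING.get(key, key)


print(to_subject_heading("Death Penalty"))  # capital punishment
```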
16. LESSON 4
OUR INNATE BRAIN STRUCTURE REFLECTS HOW WE CLASSIFY THE WORLD AROUND US
▸ A study by Harvard University
19. LESSON 4
HISTORY OF ONTOLOGIES
Ontologies arise out of the branch of philosophy known as metaphysics, which
deals with questions like "what exists?" and "what is the nature of reality?". One
of five traditional branches of philosophy, metaphysics is concerned with
exploring existence through properties, entities and relations such as those
between particulars and universals, intrinsic and extrinsic properties,
or essence and existence. Metaphysics has been an ongoing topic of discussion
since recorded history.
▸ https://en.wikipedia.org/wiki/Ontology_(information_science)
20. LESSON 4
ONTOLOGIES & ARTIFICIAL INTELLIGENCE
AI systems are based on knowledge engineering. AI researchers argued that they
could create new ontologies as computational models that enable certain kinds
of automated reasoning, which was only marginally successful. In the 1980s, the
AI community began to use the term ontology to refer to both a theory of a
modeled world and a component of knowledge-based systems.
21. LESSON 4
A DEFINITION OF ONTOLOGY
An ontology defines a common vocabulary for persons who need to share information in a domain. It
includes machine-interpretable definitions of basic concepts in the domain and relations among them.
Why would someone want to develop an ontology? Some of the reasons are:
▸ To share common understanding of the structure of information among people or software agents
▸ To enable reuse of domain knowledge
▸ To make domain assumptions explicit
▸ To separate domain knowledge from the operational knowledge
▸ To analyze domain knowledge
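As a rough illustration of "machine-interpretable definitions of basic concepts and relations", an ontology can be written down as a set of (subject, relation, object) triples. The library domain, the concept names and the `ancestors` helper below are all assumptions made for this sketch; real ontologies are usually expressed in dedicated languages rather than plain Python.

```python
# Concepts and relations as (subject, relation, object) triples.
# Every name here is illustrative, not taken from an existing ontology.
TRIPLES = {
    ("Book", "is_a", "Work"),
    ("Novel", "is_a", "Book"),
    ("Book", "has_author", "Person"),
    ("Book", "has_subject", "SubjectHeading"),
}


def ancestors(concept: str) -> set[str]:
    """Follow is_a links upward: a very small form of automated reasoning."""
    found: set[str] = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in TRIPLES:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


print(sorted(ancestors("Novel")))  # ['Book', 'Work']
```

Because the assumptions are explicit data rather than something buried in program logic, people and software agents can share them, reuse them and analyze them.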
24. [Figure: schema diagram of a sample knowledge base (KB); labels translated from Italian.
Entities shown: [1] Country, [2] Project, [3] Company, [4] Activity type, [6] Contract,
[8] Plant, [9] Key fact, [10] Mini website, [13] Fixed answer, [15] Business line,
[16] Corporate body, [17] Event.
Recurring attributes: ID, synonym, name, editorial name, description, image, URL,
start date, end date, note (with note type), data table, links to other KB entities.
Value lists: [B] project size (small, medium, large, giant); [C] company type
(joint-stock company, foundation); [D] note type (brand, history, HSE, safety, strategy).]
25. [Figure: second part of the knowledge base schema diagram; labels translated from Italian.
Entities shown: [4] Activity type, [5] Geographic area, [6] Contract, [7] Technology,
[8] Plant, [11] Role, [12] Person, [13] Fixed answer, [14] Sponsored SERP, [17] Event,
[18] Activity object, [19] Financial figure, [20] Other.
Recurring attributes: ID, synonym, name, editorial name, description, image, URL,
start date, end date, reference map, location, Enipedia entry, note (with note type).
Value lists: [A] plant type (oil field, platform, refinery, liquefaction plant,
regasification plant); [E] event type (financial, communication, initiative);
[F] fixed answer type (Editorial Result, Landing Result).]
31. LESSON 4
AUTHOR’S RIGHT
In Italy, Law no. 633 of 22 April 1941, “Protezione del diritto d'autore e di altri diritti
connessi al suo esercizio” (Protection of copyright and of other rights related to its exercise).
Two distinct components:
1. economic rights in the work
2. the moral rights of the author
32. LESSON 4
MORAL RIGHT
1. Right of attribution
2. The right to have a work published anonymously or pseudonymously
3. Right to the integrity of the work (bars the work from alteration, distortion, or
mutilation)
Anything else that may detract from the artist's relationship with the work even after
it leaves the artist's possession or ownership may bring these moral rights into play.
Moral rights are inalienable.
33. LESSON 4
ECONOMIC RIGHTS
Economic rights are a property right that is limited in time (70 years after the
author’s death in Italy) and that the author may transfer to other people.
They are intended to let the author or the rights holder profit financially from the
creation, and include the right to authorize the reproduction of the work in any form.
The authors of dramatic works (plays, etc.) also have the right to authorize the
public performance of their works.
34. LESSON 4
COPYLEFT
It allows for rights to distribute copies and modified versions of a work, and requires
that the same rights are preserved in modified versions of the work.
Copyleft is a general method for making a work free (libre), and requiring all
modified and extended versions of the work to be free as well.
Here “free” does not necessarily mean free of cost, but free as in freely available to be
used, distributed or modified.
35. LESSON 4
COPYLEFT VS COPYRIGHT
Copyright law is usually used to prohibit others from reproducing, adapting, or
distributing copies of the author's work.
Under copyleft an author may give every person who receives a copy of a work
permission to reproduce, adapt or distribute it and require that any resulting copies
or adaptations are also bound by the same licensing agreement.
Creative Commons licenses are the best-known family of open licenses; their ShareAlike (SA)
variants are copyleft licenses.
36. LESSON 4
CREATIVE COMMONS
Creative Commons is a US foundation, created in 2001, which aims to develop,
support and steward legal and technical infrastructure that maximizes digital
creativity, sharing and innovation.
47. LESSON 4
DEFINITION
‘Open knowledge’ is any content, information or data that people are free to use,
re-use and redistribute — without any legal, technological or social restriction.
This is the summary of the full Open Definition which the Open Knowledge
Foundation (https://okfn.org) created in 2005 to provide both a succinct
explanation and a detailed definition of open data and open knowledge.
48. LESSON 4
KEY FEATURE OF OPENNESS
▸ Availability and access: the data must be available as a whole and at no more than a
reasonable reproduction cost, preferably by downloading over the internet. The data
must also be available in a convenient and modifiable form.
▸ Reuse and redistribution: the data must be provided under terms that permit reuse
and redistribution including the intermixing with other datasets. The data must be
machine-readable.
▸ Universal participation: everyone must be able to use, reuse and redistribute — there
should be no discrimination against fields of endeavour or against persons or groups.
For example, ‘non-commercial’ restrictions that would prevent ‘commercial’ use, or
restrictions of use for certain purposes (e.g. only in education), are not allowed.
https://opendefinition.org/
49. LESSON 4
OPEN DATA
Open data is the idea that some data should be freely available to everyone to
use and republish as they wish, without restrictions from copyright, patents or
other mechanisms of control.
One of the most important forms of open data is open government data (OGD),
which is created by government institutions. Its importance stems from its being
part of citizens' everyday lives, down to the most routine and mundane tasks that
seem far removed from government.
50. LESSON 4
WHY OPEN DATA?
https://www.europeandataportal.eu/sites/default/files/the-economic-impact-of-open-data.pdf
53. LESSON 4
OPEN DATA IMPACT MAP
The Map was developed to provide governments, international organizations, and
researchers with a more comprehensive understanding of the demand for open data.
By mapping the organizations that use open data, we can better identify, get
feedback on, and improve the most valuable government datasets.
https://opendataimpactmap.org/
54. LESSON 4
STATS NZ TATAURANGA AOTEAROA
Stats NZ Tatauranga Aotearoa is New Zealand's official data agency.
They collect information from people and organisations through censuses and
surveys. They use this information to publish insights and data about New Zealand,
and support others to use the data.
https://www.stats.govt.nz/
58. LESSON 4
WHAT IS A DB?
According to Wikipedia “a database is an organized collection of data, generally
stored and accessed electronically from a computer system”.
Ideally it is organised in such a way that it can be easily accessed, managed, and
updated.
59. LESSON 4
DB JARGON: QUERY
When you want to perform an operation on data stored in a db, you run a
query. This is typically one of SELECT, INSERT, UPDATE, or DELETE.
SELECT wakeUpTime FROM dCDCourse
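To see all four query types at work, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names come from the query above; the `student` column and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE dCDCourse (student TEXT PRIMARY KEY, wakeUpTime TEXT)")

# INSERT: add rows (invented sample data)
conn.executemany(
    "INSERT INTO dCDCourse (student, wakeUpTime) VALUES (?, ?)",
    [("Ada", "07:00"), ("Bruno", "08:30")],
)

# UPDATE: change a value in an existing row
conn.execute("UPDATE dCDCourse SET wakeUpTime = ? WHERE student = ?", ("06:45", "Ada"))

# DELETE: remove a row
conn.execute("DELETE FROM dCDCourse WHERE student = ?", ("Bruno",))

# SELECT: read data back (the query from the slide)
wake_up_times = conn.execute("SELECT wakeUpTime FROM dCDCourse").fetchall()
print(wake_up_times)  # [('06:45',)]

conn.close()
```

The `?` placeholders let the database driver insert the values safely instead of pasting them into the SQL string.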
60. LESSON 4
DB JARGON: TRANSACTION
When you need to perform a sequence of operations as a single unit of work,
that’s a transaction.
If one of you decides to withdraw from this course, I need to update both the
list of students enrolled in this course and the total count of students. If I didn’t
operate inside a transaction, there would be a moment when one piece of information
(the list of students or the total count) is wrong.
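The withdrawal scenario can be sketched with Python's built-in `sqlite3` module. The table names `enrolled` and `course_stats`, the `withdraw` helper and the student names are assumptions made up for this example. Used as a context manager, a `sqlite3` connection commits the pending transaction if the block succeeds and rolls it back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolled (student TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE course_stats (total INTEGER NOT NULL)")
conn.executemany("INSERT INTO enrolled VALUES (?)", [("Ada",), ("Bruno",), ("Chiara",)])
conn.execute("INSERT INTO course_stats VALUES (3)")
conn.commit()


def withdraw(conn: sqlite3.Connection, student: str) -> None:
    """Remove a student and decrement the counter as a single unit of work."""
    with conn:  # commit if both statements succeed, roll back otherwise
        conn.execute("DELETE FROM enrolled WHERE student = ?", (student,))
        conn.execute("UPDATE course_stats SET total = total - 1")


withdraw(conn, "Bruno")
remaining = conn.execute("SELECT COUNT(*) FROM enrolled").fetchone()[0]
total = conn.execute("SELECT total FROM course_stats").fetchone()[0]
print(remaining, total)  # 2 2
```

Other connections to the same database would see either the state before the withdrawal or the state after it, never the list and the counter out of step.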
61. LESSON 4
DB JARGON: ACID
Wikipedia: ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database
transactions intended to guarantee validity even in the event of errors, power failures, etc.
▸ Atomicity means that either all of the transaction succeeds or none of it does.
▸ Consistency means that a transaction takes the database from one valid state to another,
respecting all the rules and constraints defined on the data.
▸ Isolation means that concurrent transactions do not interfere with each other: the result
is the same as if they had run one after another.
▸ Durability means that, once a transaction is committed, it will remain permanently in the
system.
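Atomicity and consistency can be observed directly in a small experiment with Python's built-in `sqlite3` module. The schema, the `CHECK` rule and the deliberately wrong starting counter are assumptions chosen to force a failure halfway through the transaction; isolation and durability are not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolled (student TEXT PRIMARY KEY)")
# Consistency rule: the student counter may never be negative.
conn.execute("CREATE TABLE course_stats (total INTEGER NOT NULL CHECK (total >= 0))")
conn.execute("INSERT INTO enrolled VALUES ('Ada')")
conn.execute("INSERT INTO course_stats VALUES (0)")  # deliberately wrong counter
conn.commit()

try:
    with conn:  # one transaction: both statements or neither
        conn.execute("DELETE FROM enrolled WHERE student = 'Ada'")
        # This would make the counter -1, so the CHECK constraint rejects it.
        conn.execute("UPDATE course_stats SET total = total - 1")
except sqlite3.IntegrityError:
    pass  # the whole transaction was rolled back, including the DELETE

still_enrolled = conn.execute("SELECT COUNT(*) FROM enrolled").fetchone()[0]
print(still_enrolled)  # 1
```

The `DELETE` had already run when the `UPDATE` failed, yet Ada is still enrolled afterwards: the transaction left no partial result behind.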