Miroslav Líška (Marek Šurek): Toward Government Linked Data: A Slovak Case
1. ★★★★★
Miroslav Líška, Marek Šurek
Datalan (Bratislava, Slovakia)
Five Star Open Data in SR, 16.9.2015
Toward Government Linked Data: A Slovak Case
data.gov.sk-semanticweb
2. I. Introduction
1. Five Star Open Data
2. History of Semantic Web in Slovakia/Datalan
II. Method
3. Main Principles
4. data.gov.sk Resource General* URI Pattern
5. Supported ontologies (ODP Ontology + SEMIC Recommendations)
III. Process
6. URI Registration – process
7. URI Registration – use case model
IV. Searching for Business Cases
8. Slovpedia (Tripleskop)
9. Slovpedia (PharmaGuard)
*Annex A: data.gov.sk Resource URI Patterns – detailed specification
Agenda
4. 1) Five Star Open Data
● Slovak government data? A sad story.
But this can change! Semantic Web!
5. 2) History of Semantic Web in SK, Datalan
Datalan
Slovakia
● 2006 … – 1st Workshop on Intelligent and Knowledge oriented Technologies (SAV, FIIT STU, FEI TUKE)
● 2009 … – start of Sestate, Susan, Tripleskop, Slovpedia, PharmaGuard, SemTelcoSearch
● 2013 … – DTLN became a member of the Data Standardization Process in SK (Ministry of Finance SR), as ITAS Deputy
● 1st formal proposal of SK semantic standards [too soon]
● 2015
● 2nd formal proposal of data.gov.sk-semanticweb_1.0
(we expect approval by the end of 2015)
202X
Miroslav Líška
We fought for
the Semantic Web
7. 3) Method Overview
[Diagram] data.gov.sk Semantic Standards: Simple / Extendable Government URI System
method
● URI Pattern Rules
● data.gov.sk general URI pattern
● Catalog URI, Dataset URI, Dataset Item URI
● Ontology URI, Class URI, Object Property URI, DataType Property URI
● Individual URI, Individuals Template URI
● URI versioning rules
● Supplementary URI rules
● + Supported Ontologies
process
● + URI Registration Process
8. 4) data.gov.sk Resource URI Patterns
TYPE
● id = concrete individual („Lukas Liska“, „Datalan“, „Bratislava“ ...)
● def = ontology entity definition
● doc = document, file, electronic form ...
● set = catalog, dataset (codelist), distribution
CLASS – resource classification
IDENTITY – standard relational-database-like ID (0000001, 0000002 …)
VERSION – resource version/distribution (2015-09-17, 1.0, A, B …)
General URI Pattern for data.gov.sk Resource
http://data.gov.sk/[TYPE]/[CLASS]/[IDENTITY]/{VERSION}
§1
9. General URI Pattern for data.gov.sk Resource
http://data.gov.sk/[TYPE]/[CLASS]/[IDENTITY]/{VERSION}
§1
Example – Legal Form Class (ODP Ontology)
http://data.gov.sk/def/ontology/odp/LegalForm
Example – Legal Form 121 represents a joint stock form of company
http://data.gov.sk/def/legalform/121
Example – Legal Forms Codelist
http://data.gov.sk/set/codelist/legalform
Example – Distribution of the Legal Forms codelist
http://data.gov.sk/set/codelist/legalform/2015-09-16
See Annex A for the full specification
4) data.gov.sk Resource General URI Patterns
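The general pattern and the examples above can be sketched as a small URI builder. This is only an illustrative sketch, not part of the proposed standard: the function name `build_uri` and the constants `BASE` and `RESOURCE_TYPES` are assumptions introduced here, and the only validation performed is the TYPE check against the four values listed on the slide.

```python
BASE = "http://data.gov.sk"
RESOURCE_TYPES = {"id", "def", "doc", "set"}  # TYPE values listed on the slide


def build_uri(rtype, rclass, identity, version=None):
    """Compose a URI following http://data.gov.sk/[TYPE]/[CLASS]/[IDENTITY]/{VERSION}."""
    if rtype not in RESOURCE_TYPES:
        raise ValueError(f"unknown resource type: {rtype!r}")
    parts = [BASE, rtype, rclass, str(identity)]
    if version is not None:  # VERSION is the optional last segment
        parts.append(str(version))
    return "/".join(parts)


# Legal Form 121 and a dated distribution of the Legal Forms codelist
print(build_uri("def", "legalform", 121))
print(build_uri("set", "codelist", "legalform", "2015-09-16"))
```

Keeping VERSION as an optional trailing argument mirrors the curly braces in the pattern: the same call without it yields the URI of the unversioned resource.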
10. 5) Supported Ontologies (1/2)
A. ODP Ontology: Knowledge Kernel
= OntologizationOf(ElementsOf(KDP + MetaIS)) + mapping to SEMIC ontologies
[Figure: ontology excerpt with callouts: mapping to the actual KDP element; mapping to the actual codelist; mapping to the SEMIC recommended ontology]
11. 5) Supported Ontologies (2/2)
B. SEMIC Recommended ontologies
DCAT     Data Catalog Vocabulary
ADMS     Asset Description Metadata Schema
ADMS.SW  ADMS for Software
CPSV     Core Public Service Vocabulary
ROV      Registered Organization Vocabulary
LOCN     Location Core Vocabulary
PERSON   Person Core Vocabulary
C. Semantic Core Ontologies
RDF      Resource Description Framework
RDFS     Resource Description Framework Schema
OWL      Web Ontology Language
SKOS     Simple Knowledge Organization System
22. 9) Slovpedia/PharmaGuard (1/2)
Líška, M., Šurek, M.: An Approach to NLP based Drug
Interactions with Inferencing. Not yet published.
PharmaGuard.EU (1.0)
A Drug & Medication Mobile Application based on
government drug data (sk data + drugbank.ca)
uses
25. TYPE
● id = concrete individual („Lukas Liska“, „Datalan“, „Bratislava“ ...)
● def = ontology entity definition
● doc = document, file, electronic form ...
● set = catalog, dataset (codelist), distribution
CLASS – resource classification
IDENTITY – standard relational-database-like ID (0000001, 0000002 …)
VERSION – resource version/distribution (2015-09-17, 1.0, A, B …)
General URI Pattern for data.gov.sk Resource
http://data.gov.sk/[TYPE]/[CLASS]/[IDENTITY]/{VERSION}
§1
Annex A: data.gov.sk Resource URI Patterns
26. Individual URI
http://data.gov.sk/id/[class]/[code]
Example – Bratislava Self-Governing Region
http://data.gov.sk/id/nuts3/SK01
Example – Datalan
http://data.gov.sk/id/corporatebody/35810734
Example – Drug Concor 30x5mg
http://data.gov.sk/id/drug/94164
Example – Andrej Kiska (Slovak President)
http://data.gov.sk/id/president/andrej-kiska
Example – The Public Procurement Information System
http://data.gov.sk/id/isvs/5854
§1.3
Document URI
http://data.gov.sk/doc/[docType/filename]/[version]
Example – this document
http://data.gov.sk/doc/pdf/method/uri-for-slovak-public-data/2015-09-16
§1.2
27. Dataset (codelist)
http://data.gov.sk/set/[datasetType]/[dataset]
[datasetType]
● codelist = a set that contains codelist elements
● data = a set that contains data „records“
[dataset] = English name of the actual dataset
Example – Legal Forms codelist
http://data.gov.sk/set/codelist/legalform
Example – Approved and Categorized Drugs Dataset
http://data.gov.sk/set/data/categorizeddrug
§1.4
28. Dataset item
http://data.gov.sk/[type]/[class]/[identity]
[type]
● def = the item represents an ontology entity definition (§1.1)
● id = the item represents an individual (§1.4.3)
[class] = type of the item
[identity] = code of the item
Example – Joint Stock Company as the item of the Legal Forms codelist
http://data.gov.sk/def/legalform/121
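Example – Bratislava Region as the item of the NUTS3 codelist
http://data.gov.sk/id/nuts3/SK01
Extended example
@prefix legalform: <http://data.gov.sk/def/legalform/> .
@prefix odp: <http://data.gov.sk/def/ontology/odp/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
legalform:121 rdf:type odp:LegalForm .
legalform:121 rdfs:label "Joint Stock Company"@en .
legalform:121 rdfs:label "Akciová spoločnosť"@sk .
legalform:121 rdfs:label "Aktiengesellschaft"@de .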
§1.4.1
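The extended example can also be generated programmatically. The following is a minimal sketch using only string formatting (no RDF library); the function name `codelist_item_turtle` is hypothetical, and the `legalform:` and `odp:` namespace URIs are assumptions derived from the example URIs shown in this deck.

```python
# Namespace URIs: legalform/odp are inferred from the deck's example URIs (assumption);
# rdf/rdfs are the standard W3C namespaces.
PREFIXES = {
    "legalform": "http://data.gov.sk/def/legalform/",
    "odp": "http://data.gov.sk/def/ontology/odp/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def codelist_item_turtle(code, labels):
    """Return Turtle text that types legalform:<code> and labels it per language."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append(f"legalform:{code} rdf:type odp:LegalForm .")
    for lang, text in labels.items():
        # Escape backslashes and double quotes so the literal stays valid Turtle
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'legalform:{code} rdfs:label "{escaped}"@{lang} .')
    return "\n".join(lines)


print(codelist_item_turtle("121", {
    "en": "Joint Stock Company",
    "sk": "Akciová spoločnosť",
    "de": "Aktiengesellschaft",
}))
```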
29. Catalog (set of datasets)
http://data.gov.sk/set/cat/[catalog]
Example – drug related datasets group
http://data.gov.sk/set/cat/registered-drugs
§1.4.2
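To illustrate how the dataset and catalog patterns (§1.4, §1.4.2) differ, here is a hedged sketch that splits a `set` URI into its named parts. The function name `parse_set_uri` and the shape of the returned dictionary are assumptions made for this example; the optional fourth segment is treated as a distribution version, as in the Legal Forms examples.

```python
from urllib.parse import urlparse


def parse_set_uri(uri):
    """Split a data.gov.sk 'set' URI into catalog or dataset parts."""
    segments = urlparse(uri).path.strip("/").split("/")
    if len(segments) < 3 or segments[0] != "set":
        raise ValueError(f"not a set URI: {uri}")
    kind, name = segments[1], segments[2]
    if kind == "cat":  # http://data.gov.sk/set/cat/[catalog]
        return {"kind": "catalog", "catalog": name}
    if kind in ("codelist", "data"):  # http://data.gov.sk/set/[datasetType]/[dataset]
        version = segments[3] if len(segments) > 3 else None
        return {"kind": "dataset", "datasetType": kind,
                "dataset": name, "version": version}
    raise ValueError(f"unknown dataset type: {kind}")


print(parse_set_uri("http://data.gov.sk/set/cat/registered-drugs"))
print(parse_set_uri("http://data.gov.sk/set/codelist/legalform/2015-09-16"))
```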
30. ● A versionable resource is a resource whose
versions can exist in parallel, such as
● an information system, a service ...
● an ontology
● a dataset distribution …
● Otherwise a resource is unversionable, such as
● a person
● a geo entity
● ...
Example – The Public Procurement Information System, version 1.0
http://data.gov.sk/id/isvs/5854/1.0
Example – A second version of
http://data.gov.sk/set/codelist/legalform/2015-09-04
Example – The Legal Forms Dataset published 2015-09-04
http://data.gov.sk/set/codelist/legalform/2015-09-04
§1.5 Resource versioning
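Because versions of a versionable resource exist in parallel, a consumer may want to pick the most recent dated distribution. The sketch below assumes that distribution versions are ISO dates appended to the dataset URI, as in the Legal Forms examples; the function name `latest_distribution` is hypothetical, and non-date versions such as "1.0" are simply skipped.

```python
from datetime import date


def latest_distribution(dataset_uri, distribution_uris):
    """Return the distribution of dataset_uri with the most recent ISO-date version."""
    prefix = dataset_uri.rstrip("/") + "/"
    dated = []
    for uri in distribution_uris:
        if not uri.startswith(prefix):
            continue  # belongs to a different resource
        version = uri[len(prefix):]
        try:
            dated.append((date.fromisoformat(version), uri))
        except ValueError:
            continue  # version is not a date (e.g. "1.0"); ignored in this sketch
    return max(dated)[1] if dated else None


uris = [
    "http://data.gov.sk/set/codelist/legalform/2015-09-04",
    "http://data.gov.sk/set/codelist/legalform/2015-09-16",
    "http://data.gov.sk/id/isvs/5854/1.0",
]
print(latest_distribution("http://data.gov.sk/set/codelist/legalform", uris))
```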
31. URI – identifies content
URL – navigates to content
Example – an eform
<http://data.gov.sk/doc/eform/DCOM_eDemokracia_StaznostFO_sk/1.0>
Example – the eform's XSD
<http://data.gov.sk/doc/xsdschema/DCOM_eDemokracia_StaznostFO_sk/1.0>
Example – NOT this
<http://data.gov.sk/doc/eform/DCOM_eDemokracia_StaznostFO_sk/1.0/share/files/schema.xsd>
§1.6 URI is not URL
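One way to catch the "NOT this" case during URI review is a simple heuristic that flags identifiers ending in a file name. This is only a sketch under stated assumptions: the function name `looks_like_location` and the list of file extensions are hypothetical and not part of the proposal, and a real registration process would likely need stricter rules.

```python
from urllib.parse import urlparse

# Hypothetical list of extensions that suggest a file location rather than an identifier
FILE_EXTENSIONS = (".xsd", ".xml", ".pdf", ".csv", ".json", ".html")


def looks_like_location(uri):
    """Heuristic: True if the last path segment ends in a known file extension."""
    last_segment = urlparse(uri).path.rstrip("/").rsplit("/", 1)[-1]
    return last_segment.lower().endswith(FILE_EXTENSIONS)


good = "http://data.gov.sk/doc/xsdschema/DCOM_eDemokracia_StaznostFO_sk/1.0"
bad = ("http://data.gov.sk/doc/eform/DCOM_eDemokracia_StaznostFO_sk/1.0"
       "/share/files/schema.xsd")
print(looks_like_location(good), looks_like_location(bad))
```

A version segment such as "1.0" contains a dot but is not flagged, because the check compares against an explicit extension list instead of splitting on the last dot.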