Data Shapes and Data Transformations

Michael Hausenblas (1), Boris Villazón-Terrazas (2), and Richard Cyganiak (1)
(1) DERI, NUI Galway, Ireland, firstname.lastname@deri.org
(2) iSOCO, Madrid, Spain, bvillazon@isoco.com

Paper available at: http://arxiv.org/abs/1211.1565

ToC

» Motivation

» Fundamental data shapes

» Data shapes transformations

» Discussion




Motivation

Current data systems combine data from a
tremendous number of resources 1.




(Figure: extract, transform, load)



          1. Pat Helland. If You Have Too Much Data, then 'Good Enough' Is Good Enough. Queue,
          9:40:40-40:50, May 2011.
          http://queue.acm.org/detail.cfm?id=1988603

We use the term data shape to refer to how data is
arranged and structured.

            resource       data shape




Tabular

A tabular data shape organizes data items into a
table.




                         Location                 Environmental Services
                         Carlow County Council    40
                         Cavan County Council     36
                         Clare County Council     38
                         Cork City Council        51
                         Cork County Council      47
                         Donegal County Council   45
                         Dublin City Council      43




Tree

A tree data shape organizes data items into a
hierarchy. A data item is designated to be the root of
the tree while the remaining data items are
partitioned into non-empty sets each of which is a
subtree of the root.
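
The recursive definition above can be sketched in Python as follows. This is only an illustration: the class name `Tree` and the sample hierarchy (councils grouped under "Ireland" and "Cork") are assumptions, not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tree:
    """A data item (the root) plus zero or more subtrees."""
    item: str
    subtrees: List["Tree"] = field(default_factory=list)

    def size(self) -> int:
        # Count the root and, recursively, every item in each subtree.
        return 1 + sum(t.size() for t in self.subtrees)


# Hypothetical hierarchy, chosen only to show nesting.
ireland = Tree("Ireland", [
    Tree("Cork", [Tree("Cork City Council"), Tree("Cork County Council")]),
    Tree("Carlow County Council"),
])
tree_size = ireland.size()
```

Every `Tree` value is itself the root of a subtree, which mirrors the definition directly.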




Graph

A graph data shape consists of a set of vertices
and a set of edges. An edge is a pair of vertices.
The two vertices are called the endpoints of the edge.
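
A minimal Python sketch of this definition, assuming undirected edges and made-up vertex names (the slides do not specify either):

```python
from typing import FrozenSet, Set, Tuple

# A graph as a set of vertices and a set of edges.
# Each edge is an unordered pair of vertices, modelled as a frozenset.
vertices: Set[str] = {"Galway", "Dublin", "Cork"}
edges: Set[FrozenSet[str]] = {
    frozenset({"Galway", "Dublin"}),
    frozenset({"Dublin", "Cork"}),
}


def endpoints(edge: FrozenSet[str]) -> Tuple[str, str]:
    """Return the two endpoints of an edge, in sorted order."""
    a, b = sorted(edge)
    return a, b


def neighbours(vertex: str) -> Set[str]:
    """Return every vertex that shares an edge with the given vertex."""
    return {v for e in edges if vertex in e for v in e if v != vertex}
```

Unlike a tree, nothing here designates a root, and a vertex may be reachable along more than one path.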

Features

• Input/Output, generic data shape, and specific
  implementation

• Declarative/Operational





• Lossless transformation: all queries that are
  possible on the original shape are also possible
  on the resultant shape. A transformation that does
  not satisfy this condition is lossy.




Tabular - Tabular

   • RDB – RDB
      • SQL Select
       SELECT Location as Region, EServices as EnvServices
       FROM services


Input table (services):

Location                 EServices
Carlow County Council    40
Cavan County Council     36
Clare County Council     38
Cork City Council        51
Cork County Council      47
Donegal County Council   45
Dublin City Council      43

Data shape transformation

Output table:

Region                   EnvServices
Carlow County Council    40
Cavan County Council     36
Clare County Council     38
Cork City Council        51
Cork County Council      47
Donegal County Council   45
Dublin City Council      43




           •   Declarative
           •   No Information loss
           •   No provenance
           •   Standard language, SQL
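
The SELECT statement above can be tried against an in-memory SQLite database. This is a sketch under assumptions: the column types and the use of SQLite are not specified in the slides, only three of the rows are loaded, and `ORDER BY` is added purely to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the slide only shows the column names.
conn.execute("CREATE TABLE services (Location TEXT, EServices INTEGER)")
conn.executemany(
    "INSERT INTO services VALUES (?, ?)",
    [
        ("Carlow County Council", 40),
        ("Cavan County Council", 36),
        ("Clare County Council", 38),
    ],
)

# Same projection and renaming as on the slide, plus ORDER BY for a stable order.
cur = conn.execute(
    "SELECT Location AS Region, EServices AS EnvServices "
    "FROM services ORDER BY Location"
)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
conn.close()
```

Only the column names change; every row and value of the input is still present in the result, which is why the slide classifies this transformation as having no information loss.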

Tabular - Tree

     • RDB – XML
            • XML representation of a relational database

Location                 EnvironmentalServices
Carlow County Council    40
Cavan County Council     36
Clare County Council     38
Cork City Council        51
Cork County Council      47
Donegal County Council   45
Dublin City Council      43

Data shape transformation




            • Operational
            • No Information loss
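
One possible XML representation of the table can be produced with Python's standard library. The element names (`services`, `row`) and the helper `table_to_xml` are hypothetical; the slides do not prescribe a particular XML layout.

```python
import xml.etree.ElementTree as ET

rows = [("Carlow County Council", 40), ("Cavan County Council", 36)]


def table_to_xml(table_name, column_names, rows):
    """Build one <row> element per table row, with one child per column."""
    root = ET.Element(table_name)
    for row in rows:
        row_el = ET.SubElement(root, "row")
        for name, value in zip(column_names, row):
            ET.SubElement(row_el, name).text = str(value)
    return root


services_xml = table_to_xml(
    "services", ["Location", "EnvironmentalServices"], rows
)
xml_text = ET.tostring(services_xml, encoding="unicode")
```

The code spells out how the tree is built step by step, which matches the "operational" label on this slide.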




Tabular - Graph

     • RDB – RDF
       • W3C RDB2RDF WG – R2RML 1


ID      Name
10      Venus
20      Felipe

Data shape transformation (R2RML Mapping)




       • Declarative
       • No Information loss
       • W3C Recommendation

                         1. http://www.w3.org/TR/r2rml/
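
The idea of turning rows into triples can be sketched in plain Python. This is not an R2RML processor and does not use R2RML syntax; it is a hand-written illustration in which the base IRI `http://example.com/`, the table name `person`, and the helper `rows_to_triples` are all assumptions.

```python
BASE = "http://example.com/"  # hypothetical base IRI
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

rows = [{"ID": 10, "Name": "Venus"}, {"ID": 20, "Name": "Felipe"}]


def rows_to_triples(table, key, rows):
    """Emit one subject per row and one triple per column value."""
    triples = []
    for row in rows:
        subject = f"<{BASE}{table}/{row[key]}>"
        for column, value in row.items():
            predicate = f"<{BASE}{table}#{column}>"
            if isinstance(value, int):
                obj = f'"{value}"^^<{XSD_INTEGER}>'
            else:
                obj = f'"{value}"'
            triples.append((subject, predicate, obj))
    return triples


triples = rows_to_triples("person", "ID", rows)
# Serialize in an N-Triples-like form, one statement per line.
ntriples = "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
```

In R2RML the same correspondence between columns and predicates would be written declaratively as a mapping document rather than as a loop.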
Tree - Tabular

• XML - RDB
  • A technique and tool that rely on the XSD of the XML 1

Data shape transformation

Location                 EnvironmentalServices
Carlow County Council    40
Cavan County Council     36
Clare County Council     38
Cork City Council        51
Cork County Council      47
Donegal County Council   45
Dublin City Council      43




  • Operational
  • No Information loss




          1. Amy Flik, Transforming XML into a Relational Database Using XML Schema Document Type, 2009.
          http://scholarworks.gvsu.edu/cistechlib/48/
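
The reverse direction can be sketched with the Python standard library. Unlike the cited technique, this sketch does not read an XSD: the column names and types are hard-coded, and the XML layout (`services`/`row`) is an assumption made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML_DOC = """
<services>
  <row><Location>Cork City Council</Location><EnvironmentalServices>51</EnvironmentalServices></row>
  <row><Location>Dublin City Council</Location><EnvironmentalServices>43</EnvironmentalServices></row>
</services>
""".strip()

# Flatten each <row> element into a (Location, EnvironmentalServices) tuple.
root = ET.fromstring(XML_DOC)
records = [
    (row.findtext("Location"), int(row.findtext("EnvironmentalServices")))
    for row in root.findall("row")
]

# Load the tuples into a relational table and query it with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE services (Location TEXT, EnvironmentalServices INTEGER)"
)
conn.executemany("INSERT INTO services VALUES (?, ?)", records)
total = conn.execute(
    "SELECT SUM(EnvironmentalServices) FROM services"
).fetchone()[0]
conn.close()
```

A schema-driven tool would derive the `CREATE TABLE` statement from the XSD instead of relying on a hand-written one.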
Tree - Tree

• XML - XML
  • XSLT 1


                                          Data shape
                                        transformation




  • Declarative
  • No Information loss
  • W3C Recommendation



         1. http://www.w3.org/TR/xslt
Tree - Graph

• XML - RDF
 • Gleaning Resource Descriptions from Dialects of Languages -
   GRDDL 1


                                                Data shape
                                              transformation




  • Declarative
  • No Information loss
  • W3C Recommendation


             1. http://www.w3.org/TR/grddl/
Graph - Tabular

• RDF - RDB
 • SPARQL 1 SELECT


                                  Data shape
                                transformation




  • Declarative
  • Information loss
  • W3C Recommendation



         1. http://www.w3.org/TR/rdf-sparql-query/
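
Why a SELECT-style projection loses information can be shown with a plain-Python analogue. No SPARQL engine is used here; the triples, the `ex:` prefixes, and the helper `select` are hypothetical and only mimic a query of the form `SELECT ?s ?o WHERE { ?s ex:name ?o }`.

```python
triples = {
    ("ex:venus", "ex:name", "Venus"),
    ("ex:felipe", "ex:name", "Felipe"),
    ("ex:venus", "ex:knows", "ex:felipe"),
}


def select(triples, predicate):
    """Return (subject, object) rows for every triple with the given predicate."""
    return sorted((s, o) for s, p, o in triples if p == predicate)


# The result is a table with two columns and one row per match.
name_table = select(triples, "ex:name")

# Triples that do not match the pattern have no place in the result table.
dropped = {t for t in triples if t[1] != "ex:name"}
```

The `ex:knows` edge cannot be recovered from `name_table`, so queries about it are no longer possible on the resulting shape.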
Graph - Tree

• RDF - XML
 • Rhizomik ReDeFer RDF2XHTML 1, relies on XSLT



                               Data shape
                             transformation




  • Declarative (XSLT)
  • Information loss
  • Ad-hoc tool



          1. http://rhizomik.net/html/redefer/
Graph - Graph

• RDF - RDF
 • SPARQL 1 Construct


                               Data shape
                             transformation




  • Declarative
  • No Information loss
  • W3C Recommendation



          1. http://www.w3.org/TR/rdf-sparql-query/
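
A graph-to-graph rewrite in the spirit of CONSTRUCT can be sketched in plain Python. No SPARQL engine is involved; the triples, the prefixes, and the helper `construct` are hypothetical and mimic `CONSTRUCT { ?s foaf:name ?o } WHERE { ?s ex:name ?o }`.

```python
source = {
    ("ex:venus", "ex:name", "Venus"),
    ("ex:felipe", "ex:name", "Felipe"),
}


def construct(triples, old_predicate, new_predicate):
    """Build a new graph in which matching triples use the new predicate."""
    return {
        (s, new_predicate, o)
        for s, p, o in triples
        if p == old_predicate
    }


target = construct(source, "ex:name", "foaf:name")
```

The output is again a set of triples, so the result keeps the graph shape. In this sketch every source triple matches the pattern, so nothing is dropped.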
Discussion

• We can perform lossless data shape transformations
  between certain shapes.

• A number of data shape transformations are already
  standardized:
   - For RDB2RDF, see R2RML and Direct Mapping.
   - For XML2XML, see XSLT.
   - For XML2RDF, see GRDDL.

• Some data shape transformations are declarative in nature.

• In certain cases we have to deal with lossy transformations.
