Question Answering on
Interlinked Data
Saeedeh Shekarpour, Axel-Cyrille Ngonga Ngomo, Soeren Auer
AKSW Research Group, Leipzig University
December 5 2013, IBM Research Center
+ Motivation
Retrieving information from LOD

AKSW group - Question Answering on Interlinked Data (published in www2013)

+ Motivation
Text queries (either keyword or natural language) are:
•  Simple retrieval approach
•  Popular
•  Implicit and ambiguous semantics.
SPARQL queries require:
•  Knowledge about the ontology
•  Proficiency in formulating formal queries
•  Explicit and unambiguous semantics.

+ Comparison of Search Approaches
[Figure: diagram comparing Information Retrieval, Question Answering Systems and our approach (SINA). Axis labels: data-semantic aware / data-semantic unaware; keyword-based query / natural language query.]

+ Example
Which television shows were created by Walt Disney?
select * where
{ ?v0 a dbo:TelevisionShow.
  ?v0 dbo:creator dbr:Walt_Disney. }

+ Aim and Challenges

Aim: Question answering over a set of interlinked data sources.
•  Query segmentation.
•  Resource disambiguation.
•  Constructing a formal query (expressed in SPARQL).

+ Further Challenges over Interlinked Data
1.  Information for answering a certain question can be spread among different datasets employing heterogeneous schemas.
2.  Constructing a federated formal query across different datasets requires exploiting links between the different datasets on both the schema and instance levels.

+ SINA Architecture

+ Test bed datasets
*  One single dataset: DBpedia.
*  Three interlinked datasets from life science:
   ✓  Drugbank: a comprehensive knowledge base containing information about drugs, drug targets (i.e., proteins), interactions and enzymes.
   ✓  Diseasome: contains information about diseases and genes associated with these diseases.
   ✓  Sider: contains information about drugs and their side effects.

+ Main characteristics of federated queries
1.  Queries requiring fused information, e.g. side effects of drugs used for Tuberculosis.
2.  Queries targeting combined information, e.g. side effects and enzymes of drugs used for Asthma.
3.  Queries requiring keyword expansion, e.g. side effects of Valdecoxib.
[Figure: query graph for the Asthma example spanning DrugBank, Sider and Diseasome. Node and edge labels: Drug, Disease, Asthma, Enzymes, Side Effect, enzyme, side effect, sameAs, a; variables ?v0, ?v1, ?v2, ?v3.]

+ Challenge 1: Query Segmentation and Resource Disambiguation
•  Sample question: What are the side effects of drugs used for Tuberculosis?
•  Transformed to the 4-tuple (side # effect # drug # Tuberculosis).
•  Different segmentations are possible:
1.  (side effect # drug # Tuberculosis)
2.  (side effect drug # Tuberculosis)
Mapping of the segments to the resources in the underlying knowledge bases.
Each valid segment

Segment validation
✓  Original tuple: (side # effect # drug # Tuberculosis).
✓  Using a naive approach for finding all valid segments.

Valid segment | Samples of candidate resources
side effect   | 1. sider:class:sideeffect  2. sider:property:side_effects
drug          | 1. drugbank: drugs  2. class:offer  3. sider:drugs  4. diseases:possibledrug
tuberculosis  | 1. diseases:1154  2. side_effects: C0041296

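The naive validation step can be sketched as follows. This is only an illustration: it assumes a segment is valid when it exactly matches a known label (case-insensitive), which the slide does not state, and the function name `valid_segments` and the toy label set are hypothetical.

```python
def valid_segments(keywords, labels):
    """Return every contiguous keyword span that matches a known label.

    Assumption: validity is tested by exact, case-insensitive match
    against a set of resource labels (the slide does not specify this).
    """
    known = {label.lower() for label in labels}
    found = []
    # Enumerate all contiguous sub-sequences of the keyword tuple.
    for start in range(len(keywords)):
        for end in range(start + 1, len(keywords) + 1):
            segment = " ".join(keywords[start:end]).lower()
            if segment in known:
                found.append(segment)
    return found


# Toy label set, hypothetical and not taken from the real datasets.
toy_labels = {"side effect", "drug", "tuberculosis"}
segments = valid_segments(["side", "effect", "drug", "Tuberculosis"], toy_labels)
```

With n keywords the loop checks n(n+1)/2 candidate spans, which is why the approach is called naive.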
Concurrent Segmentation and Disambiguation


Hidden Markov Model

•  A statistical model containing a set of states.
•  Moving from one state to another generates a sequence of observations.
•  The probability of entering a state depends only on the previous state.
•  Output is the most likely sequence of states generating the observation sequence.


State Space

•  A state represents a knowledge base resource.
•  The state space contains all resources in the knowledge base.
•  In practice, we prune the state space by excluding irrelevant states.
•  We add an unknown entity state that stands for all resources no longer available in the pruned state space.
•  Extension of the state space with reasoning: the state space is extended with resources inferred from lightweight owl:sameAs reasoning.


Bootstrapping the Model Parameters
Emission Probability
•  The set-similarity level measures the difference between the label and the segment in terms of the number of words using the Jaccard similarity.
•  The string-similarity level measures the string similarity of each word in the segment with the most similar word in the label using the Levenshtein distance.
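A minimal sketch of the two similarity levels, under stated assumptions: the slide does not say how the Levenshtein distance is normalized or how per-word scores are aggregated, so the normalization by the longer word and the averaging below are illustrative choices, and all function names are hypothetical.

```python
def jaccard(segment, label):
    """Set-similarity level: Jaccard similarity over the word sets."""
    a, b = set(segment.lower().split()), set(label.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def levenshtein(s, t):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cs != ct)))  # substitution
        prev = curr
    return prev[-1]


def string_similarity(segment, label):
    """String-similarity level: each segment word is compared with its
    most similar label word. Normalization and averaging are assumptions."""
    seg_words = segment.lower().split()
    label_words = label.lower().split()
    if not seg_words or not label_words:
        return 0.0
    scores = []
    for w in seg_words:
        best = max(1 - levenshtein(w, lw) / max(len(w), len(lw))
                   for lw in label_words)
        scores.append(best)
    return sum(scores) / len(scores)
```

For the segment "side effect" and the label "side effects", the word sets share one of three distinct words, while the string-similarity level stays high because "effect" and "effects" differ by a single character.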

Bootstrapping the Model Parameters
Transition Probability & Initial Probability
•  The transition probability and the initial probability are computed from the semantic relatedness of two resources.
•  Semantic relatedness is based on two values: distance and connectivity degree.
•  We transform these two values into hub and authority values using the HITS algorithm.
•  The initial probability and the transition probability are defined as a uniform distribution over the hub and authority values.
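For readers unfamiliar with HITS, here is a minimal sketch of the generic algorithm on a small directed graph. It does not reproduce how SINA builds its graph from distance and connectivity degree, which the slide does not describe; the function name `hits` and the toy edge list are hypothetical.

```python
import math


def hits(edges, iterations=50):
    """Generic HITS by power iteration over a list of directed (u, v) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of n: sum of hub scores of nodes pointing to n.
        auth = {n: sum(hub[u] for u, v in edges if v == n) for n in nodes}
        norm = math.sqrt(sum(x * x for x in auth.values())) or 1.0
        auth = {n: x / norm for n, x in auth.items()}
        # Hub of n: sum of authority scores of nodes that n points to.
        hub = {n: sum(auth[v] for u, v in edges if u == n) for n in nodes}
        norm = math.sqrt(sum(x * x for x in hub.values())) or 1.0
        hub = {n: x / norm for n, x in hub.items()}
    return hub, auth


# Toy graph: two resources both link to a third one.
hub, auth = hits([("a", "c"), ("b", "c")])
```

In this toy graph "c" receives all the authority, while "a" and "b" share the hub score equally.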
Evaluation of Bootstrapping
•  We compared the accuracy of different distribution functions for the transition probability, i.e., Normal, Zipfian and uniform distributions.
•  We ran the distribution functions with two different inputs: distance and connectivity degree values, and hub and authority values.

+ Viterbi Algorithm
Aim: Find the most likely path of states generating the sequence of input keywords.

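The following is a minimal sketch of the standard Viterbi algorithm on a toy model. All probabilities below are made up for illustration and are not SINA's bootstrapped parameters; the function name `viterbi` and the dictionary layout are assumptions.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None)
             for s in states}]
    for obs in observations[1:]:
        layer = {}
        for s in states:
            prob, prev = max((best[-1][p][0] * trans_p[p].get(s, 0.0), p)
                             for p in states)
            layer[s] = (prob * emit_p[s].get(obs, 0.0), prev)
        best.append(layer)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Toy model with hypothetical probabilities.
states = ["dbo:TelevisionShow", "dbo:creator", "dbr:Walt_Disney"]
start_p = {"dbo:TelevisionShow": 0.6, "dbo:creator": 0.2, "dbr:Walt_Disney": 0.2}
trans_p = {
    "dbo:TelevisionShow": {"dbo:creator": 0.8, "dbr:Walt_Disney": 0.2},
    "dbo:creator": {"dbr:Walt_Disney": 0.9, "dbo:TelevisionShow": 0.1},
    "dbr:Walt_Disney": {"dbo:TelevisionShow": 0.5, "dbo:creator": 0.5},
}
emit_p = {
    "dbo:TelevisionShow": {"television show": 0.9},
    "dbo:creator": {"created": 0.7},
    "dbr:Walt_Disney": {"Walt Disney": 0.9},
}
path, prob = viterbi(["television show", "created", "Walt Disney"],
                     states, start_p, trans_p, emit_p)
```

Each step keeps only the best predecessor per state, so the cost grows with the number of keywords times the square of the number of states, which is one reason pruning the state space matters.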
Output of the HMM for the following query:
Which television shows were created by Walt Disney?

Probability | Path of states
0.0023      | dbo:TelevisionShow, dbo:creator, dbr:Walt_Disney
0.0014      | dbo:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney
5.89E-4     | dbr:TelevisionShow, dbo:creator, dbr:Walt_Disney
3.53E-4     | dbr:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney
3.76E-5     | dbp:television, dbp:show, dbo:creator, dbr:Category:Walt_Disney

Query Construction

Query Construction Method
Input: a set of resources R = {r_1, r_2, ..., r_n}.
Output: a query graph QG = (V, E), which is a directed, connected multi-graph.
Forward Chaining:
1.  CT: Comprehensive type.
2.  CD: Comprehensive domain.
3.  CR: Comprehensive range.

Query Construction Method
Generating the Incomplete Query Graph (IQG)
Initializing vertices and primary edges.
•  A vertex is added to the IQG if r is (1) an instance or (2) a class.
•  Properties are added along with zero, one or two vertices.


Query Construction Method
Example: What are the side effects of drugs used for Tuberculosis?
•  diseasome:1154 (type instance)
•  diseasome:possibleDrug (type property)
•  sider:sideEffect (type property)
[Figure: incomplete query graph with two disjoint sub-graphs (Graph 1, Graph 2); labels: 1154, possibleDrug, sideEffect, ?v0, ?v1, ?v2.]


Query Construction Method

Connecting Sub-graphs of an IQG:
1.  Minimum spanning tree: a minimum set of edges (i.e., properties) to span a set of
disjoint graphs.
2.  Prim’s algorithm: incrementally includes edges to connect disjoint sub-graphs.
•  Direct properties: ?v0 ?p ?v1.
•  Properties via owl:sameAs link.
(1) ?v0 owl:sameAs ?x. ?x ?p ?v1.
(2) ?v0 ?p ?x. ?x owl:sameAs ?v1.
(3) ?v0 owl:sameAs ?x. ?x ?p ?y. ?y owl:sameAs ?v1.
[Figure: Template 1 and Template 2, two query graph templates over the labels 1154, possibleDrug, sideEffect, ?v0, ?v1, ?v2.]
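The idea of connecting disjoint sub-graphs with Prim's algorithm can be sketched as below. This is an illustration only: the slide does not say how candidate connections are weighted, so the costs (a direct property cheaper than one reached via owl:sameAs) are an assumption, and the function name `prim_connect` and the component names are hypothetical.

```python
import heapq


def prim_connect(components, candidate_edges):
    """Pick a cheapest set of connections that joins all components.

    candidate_edges: list of (cost, component_a, component_b, pattern).
    Returns the chosen patterns, or None if the components cannot be joined.
    """
    adjacency = {c: [] for c in components}
    for cost, a, b, pattern in candidate_edges:
        adjacency[a].append((cost, b, pattern))
        adjacency[b].append((cost, a, pattern))
    start = components[0]
    connected = {start}
    heap = [(cost, start, other, pattern)
            for cost, other, pattern in adjacency[start]]
    heapq.heapify(heap)
    chosen = []
    # Prim's step: repeatedly take the cheapest edge leaving the connected set.
    while heap and len(connected) < len(components):
        cost, src, dst, pattern = heapq.heappop(heap)
        if dst in connected:
            continue
        connected.add(dst)
        chosen.append(pattern)
        for next_cost, other, next_pattern in adjacency[dst]:
            if other not in connected:
                heapq.heappush(heap, (next_cost, dst, other, next_pattern))
    return chosen if len(connected) == len(components) else None


# Two sub-graphs and two candidate connections with hypothetical costs.
candidates = [
    (2, "Graph 1", "Graph 2", "?v0 owl:sameAs ?x. ?x ?p ?v1."),
    (1, "Graph 1", "Graph 2", "?v0 ?p ?v1."),
]
chosen = prim_connect(["Graph 1", "Graph 2"], candidates)
```

Under these assumed costs the direct property pattern is preferred, and a sameAs pattern is only chosen when no cheaper connection exists.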
Evaluation

Goal of the experiment: to measure how well
1.  resource disambiguation and
2.  query construction perform.
Measurement of the performance:
1.  Disambiguation is measured using the Mean Reciprocal Rank (MRR).
2.  Query construction is measured in terms of precision and recall.
Benchmark
1.  Each entry pairs a natural-language query with the equivalent conjunctive SPARQL query.
2.  25 queries on the 3 interlinked datasets Drugbank, Sider and Diseasome.
3.  QALD1 and QALD3 benchmarks for DBpedia.
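The measures named above follow their standard definitions, sketched here with toy inputs. The function names and the sample data are hypothetical and are not taken from the benchmark.

```python
def mean_reciprocal_rank(ranked_lists, correct):
    """MRR: average of 1/rank of the correct item in each ranked list.

    A list that does not contain the correct item contributes 0.
    """
    total = 0.0
    for candidates, gold in zip(ranked_lists, correct):
        if gold in candidates:
            total += 1.0 / (candidates.index(gold) + 1)
    return total / len(ranked_lists)


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved answer set against the gold set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data: the correct resource is ranked first once and second once.
mrr = mean_reciprocal_rank([["a", "b"], ["b", "a"]], ["a", "a"])
precision, recall = precision_recall(["x", "y", "z"], ["x", "y", "w", "v"])
```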
Evaluation using life-science datasets

Without reasoning: precision = 0.91, recall = 0.88
With reasoning: precision = 0.95, recall = 0.90

+ Evaluation using DBpedia
•  QALD3 Benchmark:
   ✓  contains 100 questions.
   ✓  32 original questions can be answered correctly.
•  QALD1 Benchmark:
   ✓  contains 50 questions.
   ✓  7 complex questions.
   ✓  13 questions requiring information beyond DBpedia, i.e., from YAGO and FOAF.
   ✓  14 questions were slightly modified to remove expansion and cleaning problems.
   ✓  MRR of disambiguation = 96%
   ✓  Query construction accuracy = 83%

Runtime

Parallelization over three components:
1.  Segment validation
2.  Resource retrieval
3.  Query construction

+ Related work


Thank you

Saeedeh Shekarpour
shekarpour@informatik-leipzig.de
sa.shekarpour@gmail.com
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
CHEST Proprioceptive neuromuscular facilitation.pptx
CHEST Proprioceptive neuromuscular facilitation.pptxCHEST Proprioceptive neuromuscular facilitation.pptx
CHEST Proprioceptive neuromuscular facilitation.pptx
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 

Sina presentation in IBM

  • 1. Question Answering on Interlinked Data. Saeedeh Shekarpour, Axel-Cyrille Ngonga Ngomo, Soeren Auer. AKSW Research Group, Leipzig University. December 5, 2013, IBM Research Center.
  • 2. Motivation: Retrieving information from LOD. AKSW group - Question Answering on Interlinked Data (published in WWW 2013).
  • 3. Motivation. Text queries (either keyword or natural language) are: a simple retrieval approach; popular; implicit and ambiguous in their semantics. SPARQL queries require: knowledge about the ontology; proficiency in formulating formal queries; explicit and unambiguous semantics.
  • 4. Comparison of Search Approaches. [Figure: search approaches compared by data-semantic awareness (aware vs. unaware) and by query type (keyword-based vs. natural language query), covering Information Retrieval, Question Answering Systems, and our approach, SINA.]
  • 5. Example. Which television shows were created by Walt Disney? select * where { ?v0 a dbo:TelevisionShow . ?v0 dbo:creator dbr:Walt_Disney . }
  • 6. Aim and Challenges. Aim: question answering over a set of interlinked data sources. Challenges: query segmentation; resource disambiguation; constructing a formal query (expressed in SPARQL).
  • 7. Further Challenges over Interlinked Data. 1. Information for answering a certain question can be spread among different datasets employing heterogeneous schemas. 2. Constructing a federated formal query across different datasets requires exploiting links between the different datasets on both the schema and instance levels.
  • 8. SINA Architecture. [Figure: architecture diagram.]
  • 9. Test bed datasets. One single dataset: DBpedia. Three interlinked datasets from the life sciences: Drugbank, a comprehensive knowledge base containing information about drugs, drug targets (i.e. proteins), interactions and enzymes; Diseasome, which contains information about diseases and the genes associated with these diseases; Sider, which contains information about drugs and their side effects.
  • 10. Main characteristics of federated queries. 1. Queries requiring fused information, e.g. side effects of drugs used for Tuberculosis. 2. Queries targeting combined information, e.g. side effects and enzymes of drugs used for Asthma. 3. Queries requiring keyword expansion, e.g. side effects of Valdecoxib. [Figure: query graph spanning DrugBank, Sider and Diseasome, with Drug, Enzymes, Side Effect and Disease (Asthma) nodes, variables ?v0 to ?v3, and edges labelled a, enzyme, side effect and sameAs.]
  • 11. Challenge 1: Query Segmentation and Resource Disambiguation. Sample question: What are the side effects of drugs used for Tuberculosis? It is transformed to the 4-tuple (side # effect # drug # Tuberculosis). Different segmentations are possible: 1. (side effect # drug # Tuberculosis); 2. (side effect drug # Tuberculosis). Each valid segment is then mapped to resources in the underlying knowledge bases.
  • 12. Segment validation. Original tuple: (side # effect # drug # Tuberculosis). A naive approach is used to find all valid segments. Valid segments with samples of candidate resources: side effect: sider:class:sideeffect, sider:property:side_effects; drug: drugbank:drugs, class:offer, sider:drugs, diseases:possibledrug; tuberculosis: diseases:1154, side_effects:C0041296.
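
The naive segment search on this slide can be pictured roughly as follows. This is only a sketch under stated assumptions, not the authors' implementation: the function name `valid_segments` and the toy `LABELS` set are hypothetical, and a real system would check each candidate segment against the labels in the knowledge bases rather than a hard-coded set.

```python
def valid_segments(keywords, has_resource):
    """Return every contiguous run of keywords that maps to at least one resource."""
    found = []
    for start in range(len(keywords)):
        for end in range(start + 1, len(keywords) + 1):
            segment = " ".join(keywords[start:end])
            if has_resource(segment):
                found.append(segment)
    return found


# Toy label index (an assumption) standing in for a lookup against the datasets.
LABELS = {"side effect", "drug", "tuberculosis"}

keywords = ["side", "effect", "drug", "tuberculosis"]
segments = valid_segments(keywords, lambda s: s in LABELS)
print(segments)
```

Enumerating all contiguous runs is quadratic in the number of keywords, which is acceptable for short questions.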
  • 13. Concurrent Segmentation and Disambiguation.
  • 14. Hidden Markov Model. A statistical model containing a set of states. Moving from one state to another generates a sequence of observations. The probability of entering a state depends only on the previous state. The output is the most likely sequence of states generating the observed sequence.
  • 15. State Space. A state represents a knowledge base resource. In principle, the state space contains all resources in the knowledge base. In practice, we prune the state space by excluding irrelevant states, and add an unknown entity state comprising all resources that are no longer available in the pruned state space. Extension of the state space with reasoning: the state space is extended by including resources inferred from lightweight owl:sameAs reasoning.
  • 16. Bootstrapping the Model Parameters: Emission Probability. The set-similarity level measures the difference between the label and the segment in terms of the number of words, using the Jaccard similarity. The string-similarity level measures the string similarity of each word in the segment with the most similar word in the label, using the Levenshtein distance.
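
The two similarity levels can be sketched as below. The slide does not say how the Levenshtein distance is normalised or how the two levels are combined into an emission probability, so the normalisation by the longer word and the averaging over segment words are assumptions, and the two scores are deliberately left uncombined.

```python
def jaccard(segment, label):
    """Set similarity: shared words divided by all distinct words."""
    a, b = set(segment.lower().split()), set(label.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def levenshtein(s, t):
    """Edit distance computed row by row with dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def string_similarity(segment, label):
    """Average, over segment words, of the best normalised match in the label."""
    seg_words, label_words = segment.lower().split(), label.lower().split()
    if not seg_words or not label_words:
        return 0.0
    scores = [
        max(1 - levenshtein(w, l) / max(len(w), len(l)) for l in label_words)
        for w in seg_words
    ]
    return sum(scores) / len(scores)


print(jaccard("side effect", "side effects"))
print(string_similarity("side effect", "side effects"))
```

The example shows why both levels are useful: the word sets overlap only partly because of the plural, while the per-word string similarity stays high.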
  • 17. Bootstrapping the Model Parameters: Transition Probability and Initial Probability. The transition probability and initial probability are computed from the semantic relatedness of two resources. Semantic relatedness is based on two values: distance and connectivity degree. We transform these two values to hub and authority values using the HITS algorithm. The initial probability and transition probability are defined as a uniform distribution over the hub and authority values.
  • 18. Evaluation of Bootstrapping. We compared the accuracy of different distribution functions, i.e. normal, Zipfian and uniform distributions, for the transition probability. We ran the distribution functions with two different inputs: distance and connectivity degree values, and hub and authority values.
  • 19. Viterbi Algorithm. Aim: find the most likely path of states generating the sequence of input keywords.
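
A minimal sketch of Viterbi decoding over resource states follows. The four states reuse resource names from the deck's Walt Disney example, but every probability here is a made-up toy value chosen for illustration; none of them comes from the slides or from SINA's bootstrapped parameters.

```python
def viterbi(observations, states, initial, transition, emission):
    """Return (probability, path) of the most likely state sequence."""
    # best[s] holds the best probability and path that end in state s.
    best = {
        s: (initial[s] * emission[s].get(observations[0], 0.0), [s])
        for s in states
    }
    for obs in observations[1:]:
        step = {}
        for s in states:
            prob, path = max(
                ((best[p][0] * transition[p].get(s, 0.0), best[p][1]) for p in states),
                key=lambda candidate: candidate[0],
            )
            step[s] = (prob * emission[s].get(obs, 0.0), path + [s])
        best = step
    return max(best.values(), key=lambda candidate: candidate[0])


# Toy model (all numbers are assumptions).
states = ["dbo:TelevisionShow", "dbp:television", "dbo:creator", "dbr:Walt_Disney"]
initial = {s: 0.25 for s in states}
transition = {p: {s: 0.25 for s in states} for p in states}
emission = {
    "dbo:TelevisionShow": {"television show": 0.9},
    "dbp:television": {"television show": 0.4},
    "dbo:creator": {"created": 0.8},
    "dbr:Walt_Disney": {"walt disney": 0.9},
}

probability, path = viterbi(
    ["television show", "created", "walt disney"],
    states, initial, transition, emission,
)
print(path)
```

Keeping only the best path per state at each step is what makes the search linear in the number of keywords instead of exponential.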
  • 20. Output of the HMM for the following query: Which television shows were created by Walt Disney? Paths of states with their probabilities: 0.0023: dbo:TelevisionShow, dbo:creator, dbr:Walt_Disney; 0.0014: dbo:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney; 5.89E-4: dbr:TelevisionShow, dbo:creator, dbr:Walt_Disney; 3.53E-4: dbr:TelevisionShow, dbo:creator, dbr:Category:Walt_Disney; 3.76E-5: dbp:television, dbp:show, dbo:creator, dbr:Category:Walt_Disney.
  • 21. Query Construction.
  • 22. Query Construction Method. Input: a set of resources R = {r_1, r_2, ..., r_n}. Output: a query graph QG = (V, E), which is a directed, connected multi-graph. Forward chaining: 1. CT: comprehensive type. 2. CD: comprehensive domain. 3. CR: comprehensive range.
  • 23. Query Construction Method. Input: a set of resources R = {r_1, r_2, ..., r_n}. Output: a query graph QG = (V, E), which is a directed, connected multi-graph. Generating the Incomplete Query Graph (IQG): initializing vertices and primary edges. A vertex is added to the IQG (1) if r is an instance, or (2) if r is a class. Properties are added along with zero, one or two vertices.
  • 24. Query Construction Method. Example: What are the side effects of drugs used for Tuberculosis? Resources: diseasome:1154 (type instance); diseasome:possibleDrug (type property); sider:sideEffect (type property). [Figure: two disjoint sub-graphs, Graph 1 with 1154, possibleDrug and ?v0, and Graph 2 with ?v1, sideEffect and ?v2.]
  • 25. Query Construction Method. Connecting sub-graphs of an IQG: 1. Minimum spanning tree: a minimum set of edges (i.e. properties) to span a set of disjoint graphs. 2. Prim's algorithm: incrementally includes edges to connect disjoint sub-graphs. Direct properties: ?v0 ?p ?v1. Properties via an owl:sameAs link: (1) ?v0 owl:sameAs ?x. ?x ?p ?v1. (2) ?v0 ?p ?x. ?x owl:sameAs ?v1. (3) ?v0 owl:sameAs ?x. ?x ?p ?y. ?y owl:sameAs ?v1. [Figure: Template 1 and Template 2, two ways of connecting 1154, ?v0, ?v1 and ?v2 through possibleDrug and sideEffect.]
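
The direct pattern and the three owl:sameAs patterns listed on this slide can be generated for any pair of vertices, as in the sketch below. The helper name `connection_patterns` is hypothetical, and how SINA actually probes the datasets with these patterns is not shown in the deck.

```python
def connection_patterns(v_from, v_to):
    """Candidate SPARQL triple patterns for linking two query-graph vertices."""
    return [
        # Direct property between the two vertices.
        f"{v_from} ?p {v_to} .",
        # (1) owl:sameAs on the source side.
        f"{v_from} owl:sameAs ?x . ?x ?p {v_to} .",
        # (2) owl:sameAs on the target side.
        f"{v_from} ?p ?x . ?x owl:sameAs {v_to} .",
        # (3) owl:sameAs on both sides.
        f"{v_from} owl:sameAs ?x . ?x ?p ?y . ?y owl:sameAs {v_to} .",
    ]


patterns = connection_patterns("?v0", "?v1")
for pattern in patterns:
    print(pattern)
```

Each pattern leaves the property ?p unbound, so a match also reveals which property can serve as the connecting edge.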
  • 26. Evaluation. Goal of the experiment: how well the (1) resource disambiguation and (2) query construction approaches perform. Measurement of performance: 1. Disambiguation is measured using the Mean Reciprocal Rank (MRR). 2. Query construction is measured in terms of precision and recall. Benchmark: 1. Each entry is a natural-language query and the equivalent conjunctive SPARQL query. 2. 25 queries on the 3 interlinked datasets Drugbank, Sider and Diseasome. 3. The QALD1 and QALD3 benchmarks for DBpedia.
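
For reference, the three measures named on this slide can be computed as in the following sketch. These are the standard textbook definitions; the function names and the toy inputs are assumptions, not the authors' evaluation code or data.

```python
def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the first correct result per query, or None if absent."""
    return sum(1 / r if r else 0.0 for r in ranks) / len(ranks)


def precision_recall(returned, expected):
    """Precision and recall of a returned answer set against the expected one."""
    returned, expected = set(returned), set(expected)
    hits = len(returned & expected)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(expected) if expected else 0.0
    return precision, recall


mrr = mean_reciprocal_rank([1, 2, None, 4])
precision, recall = precision_recall(["a", "b", "c"], ["a", "b", "d", "e"])
print(mrr, precision, recall)
```

MRR rewards placing the correct resource near the top of the ranked candidates, which suits disambiguation, whereas precision and recall compare whole answer sets, which suits the constructed queries.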
  • 27. Evaluation using life-science datasets. Without reasoning: precision = 0.91, recall = 0.88. With reasoning: precision = 0.95, recall = 0.90.
  • 28. Evaluation using DBpedia. QALD3 benchmark: contains 100 questions; 32 original questions can be answered correctly. QALD1 benchmark: contains 50 questions; 7 complex questions; 13 questions requiring information beyond DBpedia, i.e. from YAGO and FOAF; 14 were slightly modified to remove expansion and cleaning problems; MRR of disambiguation = 96%; query construction accuracy = 83%.
  • 29. Runtime. Parallelization over three components: 1. Segment validation. 2. Resource retrieval. 3. Query construction.
  • 30. Related work.
  • 31. Thank you. Saeedeh Shekarpour: shekarpour@informatik-leipzig.de, sa.shekarpour@gmail.com