INSIGHT Centre for Data Analytics

www.insight-centre.org

Characterising concepts of interest
leveraging Linked Data
and the Social Web
Fabrizio Orlandi, Pavan Kapanipathi,
Amit Sheth, Alexandre Passant
IEEE/WIC/ACM Web Intelligence
Atlanta, GA, USA

20th November 2013

Copyright 2013 INSIGHT Centre for Data Analytics. All rights reserved.

Semantic Web & Linked Data
Research Programme
Scenario:
Personalisation and User Profiling on the Social Web

http://www.flickr.com/photos/giladlotan/
Solution

Interlink social websites

Integration & User Modelling

Merge and model user data

Personalise users’ experience using their profile

User Profile

Recommendations

Adaptive Systems

Search Personalisation
[Orlandi et al., I-Semantics 2012]

Problem

Entity-based user profiles of interests:

Sport
CEV Volleyball Cup
Music
Heavy Metal
Mastodon

Atlanta
…
Problem

Entity-based user profiles of interests:
Sport
CEV Volleyball Cup
Music
Heavy Metal
Mastodon
Atlanta
…

Semantics? Pragmatics? Relevance?
Linking Open Data

The Semantics of the Web of Data

LOD Cloud by R. Cyganiak
and A. Jentzsch

Example

“Mastodon is the best heavy metal band from Atlanta…
Can’t wait to see them live again!”

“Trentino vs Lugano about to start - Diatec youngster to
impress again in CEV Champions League #volleyball”
“W3C Invites Implementations of five Candidate
Recommendations for RDF 1.1 #SemanticWeb”

Processing steps:
• Named entity recognition and disambiguation
• Frequency + time-decay weighting scheme

Extracted entities:
Music
Heavy Metal
Mastodon
Atlanta
CEV Champions League
Volleyball
Semantic Web
RDF
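The slides name a "frequency + time-decay" weighting scheme but do not give its formula. The following is a minimal sketch of one common way to combine the two, assuming an exponential decay with a configurable half-life. The function name `decayed_weights`, the `half_life_days` parameter and the sample mentions are all hypothetical and are not taken from the presentation.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def decayed_weights(mentions, now, half_life_days=30.0):
    """Sum one exponentially decayed vote per mention of each entity.

    mentions: iterable of (entity, datetime_of_mention) pairs.
    half_life_days is an assumed parameter, not a value from the slides.
    """
    weights = defaultdict(float)
    for entity, when in mentions:
        age_days = (now - when).total_seconds() / 86400.0
        # A mention loses half of its weight every `half_life_days`.
        weights[entity] += 0.5 ** (age_days / half_life_days)
    return dict(weights)


now = datetime(2013, 11, 20)
mentions = [
    ("Mastodon", now - timedelta(days=1)),
    ("Mastodon", now - timedelta(days=2)),
    ("RDF", now - timedelta(days=60)),
]
profile = decayed_weights(mentions, now)
```

With this choice, an entity mentioned often and recently ("Mastodon") outweighs one mentioned once two half-lives ago ("RDF", weight 0.25).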

Example

Are all the extracted entities useful for personalisation?


How are concepts/entities being used on the Social Web? (Pragmatics)

Music: Very abstract, very popular
Heavy Metal: Very popular
Mastodon (band): Specific and time-dependent on events, etc.
Atlanta (GA.): Specific, very popular and time-dependent
CEV Champions League: Specific and time-dependent on events, etc.
Volleyball: Abstract and popular
Semantic Web: Abstract and not popular
RDF: Specific and not popular

The Dimensions of our
Characterisation
Specificity
The level of abstraction that an entity has in a common conceptual schema shared by humans.

Popularity
How popular an entity is on the Social Web.
– How frequently is it mentioned/used at a given point in time?

Temporal Dynamics
The trend and evolution of the frequency of mentions of an entity on the Social Web.
– i.e. popularity over time

Requirements

Our use case: real-time personalisation of Social Web streams
1. (Quasi-)real-time computation of the dimensions
2. Results constantly up to date with the real world
3. Knowledge-base- and domain-independent approach

Popularity

We chose the Twitter Search API:
1. We search for an entity on the Twitter stream in a short, recent time frame.
2. We run entity disambiguation on the resulting tweets to filter out noisy tweets.
3. We count the remaining tweets in the given time frame.
4. The Popularity measure is the resulting value in tweets/second.

This is fast, simple and up to date, but it only covers a short, recent time frame.

e.g. “Music” ~ 16.6 tw/s
“Heavy Metal” ~ 0.09 tw/s
“Semantic Web” ~ 0.0008 tw/s
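The counting step above can be sketched as follows. This is only an illustration of the tweets/second arithmetic. It does not call the Twitter Search API, and it assumes each tweet is already a dictionary with a `created_at` timestamp and an `about_entity` flag set by a disambiguation step. Both field names, and the function name `popularity`, are hypothetical.

```python
from datetime import datetime, timedelta


def popularity(tweets, window_start, window_end):
    """Tweets per second among tweets kept by disambiguation in the window."""
    seconds = (window_end - window_start).total_seconds()
    if seconds <= 0:
        raise ValueError("window must have positive length")
    kept = [
        t for t in tweets
        # Drop noisy matches and anything outside the observation window.
        if t["about_entity"] and window_start <= t["created_at"] < window_end
    ]
    return len(kept) / seconds


start = datetime(2013, 11, 20, 12, 0, 0)
end = start + timedelta(seconds=100)
tweets = [
    {"created_at": start + timedelta(seconds=5), "about_entity": True},
    {"created_at": start + timedelta(seconds=40), "about_entity": False},  # noisy match
    {"created_at": start + timedelta(seconds=90), "about_entity": True},
]
rate = popularity(tweets, start, end)  # 2 kept tweets over 100 seconds
```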
Temporal Dynamics

We use Wikipedia page views:
– Entities are already mapped to DBpedia.
– The MediaWiki API provides a long history of daily page views of Wikipedia articles.
– We use the mean and standard deviation of the last 30 days of page views to identify whether the popularity of an entity is:
  – Stable/Unstable
  – Trendy/Non-Trendy

[Figures: daily page-view charts for CEV_Champions_League and Typhoon_Haiyan (2013). Diagrams from: stats.grok.se]
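The slides say that the mean and standard deviation of the last 30 days of page views are used, but they do not state the decision rules. The sketch below shows one plausible way to derive the two labels, assuming that "stable" means a low coefficient of variation and "trendy" means the most recent week is well above the 30-day mean. The thresholds `cv_threshold` and `trend_factor`, the `recent_days` window and the function name are assumptions, not values from the presentation.

```python
from statistics import mean, pstdev


def temporal_dynamics(daily_views, cv_threshold=0.5, trend_factor=1.5, recent_days=7):
    """Label an entity from its daily page views (oldest first).

    All thresholds are illustrative assumptions.
    """
    window = daily_views[-30:]
    if len(window) < recent_days:
        raise ValueError("not enough days of page views")
    mu = mean(window)
    sigma = pstdev(window)
    # Stable: the spread is small relative to the mean.
    stable = mu > 0 and (sigma / mu) <= cv_threshold
    # Trendy: the last few days sit well above the 30-day mean.
    trendy = mu > 0 and mean(window[-recent_days:]) > trend_factor * mu
    return {"mean": mu, "std": sigma, "stable": stable, "trendy": trendy}


flat = temporal_dynamics([100] * 30)
spike = temporal_dynamics([100] * 23 + [2000] * 7)
```

A flat series comes out as stable and non-trendy, while a series with a sharp rise in the last week comes out as unstable and trendy.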

Specificity

We use the Linking Open Data (LOD) cloud

Most of the available knowledge bases (e.g. DMOZ, WordNet, OpenCyc) are not up to date.

Wikipedia is large, domain-independent and continuously updated, but:
– Entities are not organised hierarchically in a taxonomy
– We cannot use taxonomy-based methods (i.e. super-/sub-type relations)
– PLUS: expensive algorithms would not be suitable for real-time computation

We use the LOD link structure instead!

Graph based measures

State-of-the-art (SOA) graph-based method:
– Indegree and outdegree (here called Incoming/Outgoing Predicates, IP and OP)
– We can use these methods with RDF triples
– We introduce “distinct in/out-degree” (IDP and ODP)

[Diagram: node m with three incoming edges from s1, s2 and s3, labelled p1, p1 and p2, and two outgoing edges to o1 and o2, labelled p3 and p4]

Values for “m”:
IP (indegree) = 3
OP (outdegree) = 2
IDP (distinct indegree) = 2
ODP (distinct outdegree) = 2
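The four measures can be reproduced on the example graph with a few lines of code. This is a minimal sketch over an in-memory list of (subject, predicate, object) tuples. Which of s1, s2 and s3 carries which predicate is an assumption, since only the edge labels survive from the diagram, and the function name `degree_measures` is hypothetical.

```python
def degree_measures(triples, node):
    """Count incoming/outgoing predicates of `node`, with and without duplicates."""
    incoming = [p for s, p, o in triples if o == node]
    outgoing = [p for s, p, o in triples if s == node]
    return {
        "IP": len(incoming),        # indegree
        "OP": len(outgoing),        # outdegree
        "IDP": len(set(incoming)),  # distinct incoming predicates
        "ODP": len(set(outgoing)),  # distinct outgoing predicates
    }


# Example graph around node "m"; the subject/predicate pairing is assumed.
triples = [
    ("s1", "p1", "m"),
    ("s2", "p1", "m"),
    ("s3", "p2", "m"),
    ("m", "p3", "o1"),
    ("m", "p4", "o2"),
]
measures = degree_measures(triples, "m")
```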

Our Specificity Measure

DRR (Distinct Relations Ratio):

DRR = Incoming Distinct Predicates (IDP) / Outgoing Distinct Predicates (ODP)

Compared with: IP/OP, IP+OP, IP, IDP

Computed on the Sindice SPARQL endpoint in less than 1 second.
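The queries actually sent to the endpoint are not shown in the slides. The following sketch illustrates how the two counts could be requested with standard SPARQL 1.1 aggregates and then combined into DRR. The function names are hypothetical, the code only builds query strings and does not contact any endpoint, and returning `None` when ODP is zero is an assumed convention.

```python
def distinct_predicate_queries(entity_uri):
    """Build SPARQL 1.1 queries counting distinct incoming/outgoing predicates.

    The URI is inserted as-is; a real client should validate or escape it.
    """
    incoming = f"SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE {{ ?s ?p <{entity_uri}> }}"
    outgoing = f"SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE {{ <{entity_uri}> ?p ?o }}"
    return incoming, outgoing


def drr(idp, odp):
    """Distinct Relations Ratio: IDP / ODP, or None when ODP is zero (assumed)."""
    if odp == 0:
        return None
    return idp / odp


idp_query, odp_query = distinct_predicate_queries("http://dbpedia.org/resource/RDF")
example_drr = drr(2, 2)  # the node "m" of the earlier example: IDP = 2, ODP = 2
```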

Alternative SOA Method

DMOZ (Open Directory Project) taxonomy

We use the hierarchical structure of DMOZ as an alternative method to measure specificity.
We manually map entities to the DMOZ entities and compute the distance from the root of the DMOZ tree.
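Once an entity has been mapped by hand to a DMOZ category, the distance from the root is simply the depth of that category. A minimal sketch, assuming categories are written as slash-separated paths starting at a root named "Top"; the example paths are illustrative and are not the mappings used by the authors.

```python
def dmoz_depth(category_path, root="Top"):
    """Number of levels between the root category and the given category."""
    parts = [p for p in category_path.strip("/").split("/") if p]
    if parts and parts[0] == root:
        parts = parts[1:]  # the root itself is at distance 0
    return len(parts)


generic_level = dmoz_depth("Top/Arts")
specific_level = dmoz_depth("Top/Arts/Music/Styles")
```

A deeper category yields a larger value, which is then read as a more specific entity.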

Generation of a Gold Standard

Binary classification of entities
Five humans classified 160 entities as:
– Generic (38%)
– Specific (62%)
Substantial agreement (κ = 0.61)

Ranking of entities
Five humans rated the specificity of 160 entities:
– 1 to 10 scale (1 = very generic, 10 = very specific)

Average rating: 7.03
Average std. dev.: 1.45
Average rating, top 30 entities by highest std. dev.: 5.66
Average rating, top 30 entities by lowest std. dev.: 7.51

Abstract entities are harder for humans to rate
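The slides report the agreement value but do not say which kappa statistic was computed. With five raters, Fleiss' kappa is a common choice, so the sketch below shows that statistic as an assumption. Each row of `counts` holds, for one entity, how many raters chose each category; the sample rows are made up for illustration.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of per-item category counts.

    counts[i][j] = number of raters who put item i in category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Observed agreement per item, then averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall category proportions.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        return 1.0  # every rating fell in one category; treated as full agreement
    return (p_bar - p_e) / (1 - p_e)


# Columns: [generic, specific]; five raters per entity (made-up data).
perfect = fleiss_kappa([[5, 0], [0, 5]])
mixed = fleiss_kappa([[5, 0], [0, 5], [3, 2]])
```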

Evaluation: Classification

We compared the different methods against the gold standard created manually by the users.

Agreement with the gold standard in the binary classification task:
DMOZ: 83.9%
DRR: 84.1%
IP/OP: 70.0%
IP+OP: 70.0%
IP: 72.5%
random: 61.9%

The performance of the DRR measure for this classification task is comparable to a manual classification done using the DMOZ taxonomy and to human judgement.
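Agreement with the gold standard, as reported above, is the share of entities on which a method and the gold standard assign the same label. A minimal sketch of that computation follows; the function name and the four sample labels are hypothetical, and how each method turns its score into a generic/specific label is not covered here because the slides do not state it.

```python
def agreement(predicted, gold):
    """Percentage of items whose predicted label equals the gold label."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * matches / len(gold)


gold_labels = ["generic", "specific", "generic", "generic"]
method_labels = ["generic", "specific", "specific", "generic"]
score = agreement(method_labels, gold_labels)  # 3 of 4 labels match
```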

Evaluation: Ranking

We rank the specificity of 50 randomly chosen entities using:
– Gold standard (average of the 5 users’ ratings for each entity)
– DMOZ levels (integers, 0 to 9)
  – Integer levels produce ties, so we compute “DMOZ-” and “DMOZ+” as the worst and best possible rankings compared to the gold standard ranking.
– DRR, IP/OP, IP+OP and random values (real numbers)

We compute NDCG (Normalized Discounted Cumulative Gain) at different ranking positions “p”.
(The ideal DCG is that of the gold standard ranking.)
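The NDCG formula on the slide did not survive extraction, so the sketch below uses a widely used formulation, DCG at p as the sum of rel_i / log2(i + 1) over the top p positions, as an assumption. The input is the list of gold-standard ratings in the order produced by the method under test; the ideal ordering is obtained by sorting those same ratings in descending order. The sample ratings are made up.

```python
import math


def dcg(relevances, p):
    """Discounted cumulative gain over the first p positions (1-based ranks)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:p]))


def ndcg(ranked_relevances, p):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(ranked_relevances, reverse=True), p)
    return dcg(ranked_relevances, p) / ideal if ideal > 0 else 0.0


# Gold ratings listed in the order a method ranked the entities (made-up data).
same_as_gold = ndcg([9.0, 7.0, 3.0], p=3)
reversed_order = ndcg([3.0, 7.0, 9.0], p=3)
```

An ordering identical to the gold standard scores 1, and any ordering that pushes highly rated entities down scores less than 1.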

Evaluation: Ranking

DRR: +5% for NDCG at 10 and 20

Evaluation on User Profiles

We evaluate the impact of the proposed measures on user profiles of interests, a real use case:
– Interests extracted from users’ posts on Facebook and Twitter with NLP tools (as described in our previous work [1])
– Frequency-based + time-decay weighting strategy
– 27 volunteers
– Each user rated his/her Top 30 list of generated interests (794 user ratings in total)
– Ratings on a “1 to 5” scale according to how relevant/interesting each entity of interest is to the user (5 is highly relevant)

[1] Orlandi et al., I-Semantics 2012

Evaluation on User Profiles

The average score (1 to 5 scale) is computed for groups of entity types.

[Chart not recoverable; surviving labels: (+8%), (17%), (+12%)]

Not-popular and generic entities better represent users’ perception of their interests (but only 17% of the entities are of this kind).
This behaviour might be different in other applications and use cases (e.g. news recommendations, etc.)!

Conclusions



– Introduced dimensions for characterisation of concepts of interest: specificity, popularity and temporal dynamics.
– Proposed methods for their computation satisfying requirements for real-time personalisation of Social Web streams:
  – Real-time, domain independent, up to date.
– Introduced a novel measure (DRR) for specificity of concepts based on the LOD cloud
  – Evaluated for two different tasks (classification and ranking) against state-of-the-art (SOA) methods (humans, DMOZ, graph measures)
– Evaluated the impact of the measures on user profiles of interests (27 users and ~800 ratings)
  – Abstract and non-popular interests are preferred by users

Future work

– Experiment with the measures on user profiles used for different personalisation tasks.
  – E.g. a tweet recommender system should instead give priority to trendy, popular and specific entities.
– Improve the simple popularity and trend detection methods.
– Improve the DRR measure by adding more “semantics”, i.e. considering the different types of edges.

Thanks!

@badmotorf
fabrizio.orlandi@deri.org
@pavankaps
pavan@knoesis.org
@amit_p
amit@knoesis.org
@terraces
alex@seevl.net

Semantic Web & Linked Data
Research Programme

Weitere ähnliche Inhalte

Was ist angesagt?

SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYSEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYAmit Sheth
 
Extracting, Aligning, and Linking Data to Build Knowledge Graphs
Extracting, Aligning, and Linking Data to Build Knowledge GraphsExtracting, Aligning, and Linking Data to Build Knowledge Graphs
Extracting, Aligning, and Linking Data to Build Knowledge GraphsCraig Knoblock
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Miningdataminers.ir
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceGabriel Moreira
 
Measuring Relevance in the Negative Space
Measuring Relevance in the Negative SpaceMeasuring Relevance in the Negative Space
Measuring Relevance in the Negative SpaceTrey Grainger
 
How Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and BeyondHow Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and BeyondNeo4j
 
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...Connected Data World
 
Applications of Semantic Technology in the Real World Today
Applications of Semantic Technology in the Real World TodayApplications of Semantic Technology in the Real World Today
Applications of Semantic Technology in the Real World TodayAmit Sheth
 
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen TechnologienTFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen TechnologienTourismFastForward
 
Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment
Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment
Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment Paris Sud University
 
Social Network Analysis with Spark
Social Network Analysis with SparkSocial Network Analysis with Spark
Social Network Analysis with SparkGhulam Imaduddin
 
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...Cataldo Musto
 
Methods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of DataMethods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of DataCristina Sarasua
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge GraphLukas Masuch
 
Propelling the Potential of Linked Data in Enterprises
Propelling the Potential of Linked Data in EnterprisesPropelling the Potential of Linked Data in Enterprises
Propelling the Potential of Linked Data in EnterprisesSabrina Kirrane
 
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Connotate
 
Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018Gabriel Moreira
 
Autodiscovery or The long tail of open data
Autodiscovery or The long tail of open dataAutodiscovery or The long tail of open data
Autodiscovery or The long tail of open dataConnected Data World
 
Knowledge graphs ilaria maresi the hyve 23apr2020
Knowledge graphs   ilaria maresi the hyve 23apr2020Knowledge graphs   ilaria maresi the hyve 23apr2020
Knowledge graphs ilaria maresi the hyve 23apr2020Pistoia Alliance
 

Was ist angesagt? (20)

SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYSEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
 
Extracting, Aligning, and Linking Data to Build Knowledge Graphs
Extracting, Aligning, and Linking Data to Build Knowledge GraphsExtracting, Aligning, and Linking Data to Build Knowledge Graphs
Extracting, Aligning, and Linking Data to Build Knowledge Graphs
 
Introduction To Data Mining
Introduction To Data MiningIntroduction To Data Mining
Introduction To Data Mining
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Measuring Relevance in the Negative Space
Measuring Relevance in the Negative SpaceMeasuring Relevance in the Negative Space
Measuring Relevance in the Negative Space
 
Web Mining
Web MiningWeb Mining
Web Mining
 
How Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and BeyondHow Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and Beyond
 
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
Powerful Information Discovery with Big Knowledge Graphs –The Offshore Leaks ...
 
Applications of Semantic Technology in the Real World Today
Applications of Semantic Technology in the Real World TodayApplications of Semantic Technology in the Real World Today
Applications of Semantic Technology in the Real World Today
 
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen TechnologienTFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
TFF2016, Rudi Studer, Smarte Dienstleistungen mit semantischen Technologien
 
Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment
Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment
Tutorial@BDA 2017 -- Knowledge Graph Expansion and Enrichment
 
Social Network Analysis with Spark
Social Network Analysis with SparkSocial Network Analysis with Spark
Social Network Analysis with Spark
 
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
 
Methods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of DataMethods for Intrinsic Evaluation of Links in the Web of Data
Methods for Intrinsic Evaluation of Links in the Web of Data
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge Graph
 
Propelling the Potential of Linked Data in Enterprises
Propelling the Potential of Linked Data in EnterprisesPropelling the Potential of Linked Data in Enterprises
Propelling the Potential of Linked Data in Enterprises
 
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
 
Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018
 
Autodiscovery or The long tail of open data
Autodiscovery or The long tail of open dataAutodiscovery or The long tail of open data
Autodiscovery or The long tail of open data
 
Knowledge graphs ilaria maresi the hyve 23apr2020
Knowledge graphs   ilaria maresi the hyve 23apr2020Knowledge graphs   ilaria maresi the hyve 23apr2020
Knowledge graphs ilaria maresi the hyve 23apr2020
 

Andere mochten auch

iRap - Interest based RDF update propagation
iRap - Interest based RDF update propagationiRap - Interest based RDF update propagation
iRap - Interest based RDF update propagationFabrizio Orlandi
 
Semantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in WikipediaSemantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in WikipediaFabrizio Orlandi
 
Aggregated, Interoperable and Multi-Domain User Profiles for the Social Web
Aggregated, Interoperable and Multi-Domain User Profiles for the Social WebAggregated, Interoperable and Multi-Domain User Profiles for the Social Web
Aggregated, Interoperable and Multi-Domain User Profiles for the Social WebFabrizio Orlandi
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationRinke Hoekstra
 
Profiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebProfiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebFabrizio Orlandi
 
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)Rinke Hoekstra
 

Andere mochten auch (6)

iRap - Interest based RDF update propagation
iRap - Interest based RDF update propagationiRap - Interest based RDF update propagation
iRap - Interest based RDF update propagation
 
Semantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in WikipediaSemantic Representation of Provenance in Wikipedia
Semantic Representation of Provenance in Wikipedia
 
Aggregated, Interoperable and Multi-Domain User Profiles for the Social Web
Aggregated, Interoperable and Multi-Domain User Profiles for the Social WebAggregated, Interoperable and Multi-Domain User Profiles for the Social Web
Aggregated, Interoperable and Multi-Domain User Profiles for the Social Web
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance Visualization
 
Profiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebProfiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic Web
 
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
 

Ähnlich wie Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked Data and the Social Web

Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional DatasetsProjection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional DatasetsIRJET Journal
 
Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13DataDryad
 
X api chinese cop monthly meeting feb.2016
X api chinese cop monthly meeting   feb.2016X api chinese cop monthly meeting   feb.2016
X api chinese cop monthly meeting feb.2016Jessie Chuang
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformSanjay Padhi, Ph.D
 
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bhaskar Ghosh
 
Data Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact SolutionsData Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact SolutionsMohd Izhar Firdaus Ismail
 
Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...Denodo
 
Graph-based Network & IT Management.
Graph-based Network & IT Management.Graph-based Network & IT Management.
Graph-based Network & IT Management.Linkurious
 
Ego web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf exportEgo web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf exportDavid Kennedy
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfBalasundaramSr
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015Marianne Sweeny
 
Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação Gabriel Moreira
 
Bigdatacooltools
BigdatacooltoolsBigdatacooltools
Bigdatacooltoolssuresh sood
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Amit Sheth
 
Graph-based Product Lifecycle Management
Graph-based Product Lifecycle ManagementGraph-based Product Lifecycle Management
Graph-based Product Lifecycle ManagementLinkurious
 

Ähnlich wie Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked Data and the Social Web (20)

Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional DatasetsProjection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
 
Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
X api chinese cop monthly meeting feb.2016
X api chinese cop monthly meeting   feb.2016X api chinese cop monthly meeting   feb.2016
X api chinese cop monthly meeting feb.2016
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
 
BrightTALK - Semantic AI
BrightTALK - Semantic AI BrightTALK - Semantic AI
BrightTALK - Semantic AI
 
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 
Data Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact SolutionsData Science: Harnessing Open Data for High Impact Solutions
Data Science: Harnessing Open Data for High Impact Solutions
 
13 pv-do es-18-bigdata-v3
13 pv-do es-18-bigdata-v313 pv-do es-18-bigdata-v3
13 pv-do es-18-bigdata-v3
 
Enabling Citizen-empowered Apps over Linked Data
Enabling Citizen-empowered Apps over Linked DataEnabling Citizen-empowered Apps over Linked Data
Enabling Citizen-empowered Apps over Linked Data
 
Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...
 
Graph-based Network & IT Management.
Graph-based Network & IT Management.Graph-based Network & IT Management.
Graph-based Network & IT Management.
 
Ego web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf exportEgo web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf export
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdf
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015
 
Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação Sistemas de Recomendação sem Enrolação
Sistemas de Recomendação sem Enrolação
 
Bigdatacooltools
BigdatacooltoolsBigdatacooltools
Bigdatacooltools
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
 
Graph-based Product Lifecycle Management
Graph-based Product Lifecycle ManagementGraph-based Product Lifecycle Management
Graph-based Product Lifecycle Management
 

Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked Data and the Social Web

  • 1. INSIGHT Centre for Data Analytics www.insight-centre.org Characterising concepts of interest leveraging Linked Data and the Social Web Fabrizio Orlandi, Pavan Kapanipathi, Amit Sheth, Alexandre Passant IEEE/WIC/ACM Web Intelligence Atlanta, GA, USA 20th November 2013 Copyright 2013 INSIGHT Centre for Data Analytics. All rights reserved. Semantic Web & Linked Data Research Programme
  • 2. Scenario: Personalisation and User Profiling on the Social Web (image: http://www.flickr.com/photos/giladlotan/)
  • 5. Solution [Orlandi et al., I-Semantics 2012]
      - Interlink social websites
      - Merge and model user data (Integration & User Modelling, producing the User Profile)
      - Personalise users’ experience using their profile: Recommendations, Adaptive Systems, Search Personalisation
  • 6. Problem. Entity-based user profiles of interests: Sport, CEV Volleyball Cup, Music, Heavy Metal, Mastodon, Atlanta, …
  • 7. Problem. Entity-based user profiles of interests: Sport, CEV Volleyball Cup, Music, Heavy Metal, Mastodon, Atlanta, … Semantics? Pragmatics? Relevance?
  • 8. Linking Open Data: The Semantics of the Web of Data (figure: LOD Cloud by R. Cyganiak and A. Jentzsch)
  • 9. Example
      - “Mastodon is the best heavy metal band from Atlanta… Can’t wait to see them live again!”
      - “Trentino vs Lugano about to start - Diatec youngster to impress again in CEV Champions League #volleyball”
      - “W3C Invites Implementations of five Candidate Recommendations for RDF 1.1 #SemanticWeb”
      Method: named entity recognition and disambiguation, followed by a frequency + time-decay weighting scheme.
      Extracted entities: Music, Heavy Metal, Mastodon, Atlanta, CEV Champions League, Volleyball, Semantic Web, RDF
  • 10. Example. Are all the extracted entities useful for personalisation? How are concepts/entities being used on the Social Web? (Pragmatics)
      - Music: very abstract, very popular
      - Heavy Metal: very popular
      - Mastodon (band): specific and time-dependent on events, etc.
      - Atlanta (GA.): specific, very popular and time-dependent
      - CEV Champions League: specific and time-dependent on events, etc.
      - Volleyball: abstract and popular
      - Semantic Web: abstract and not popular
      - RDF: specific and not popular
  • 11. The Dimensions of our Characterisation
      - Specificity: the level of abstraction that an entity has in a common conceptual schema shared by humans
      - Popularity: how popular an entity is on the Social Web, i.e. how frequently it is mentioned/used at that point in time
      - Temporal Dynamics: the trend and evolution of the frequency of mentions of an entity on the Social Web, i.e. popularity over time
  • 12. Requirements. Our use case: real-time personalisation of Social Web streams
      1. (Quasi-) real-time computation of the dimensions
      2. Results constantly up to date with the real world
      3. Knowledge-base- and domain-independent approach
  • 13. Popularity. We chose the Twitter Search API
      - We search for an entity on the Twitter stream in a short, recent time frame.
      - We run entity disambiguation on the resulting tweets to filter out noisy tweets.
      - We count the remaining tweets in the given time frame.
      - The Popularity measure is the resulting value in tweets/second (tw/s).
      - This is fast, simple and up to date, but it only covers a short, recent time frame.
      - E.g. “Music” ~ 16.6 tw/s, “Heavy Metal” ~ 0.09 tw/s, “Semantic Web” ~ 0.0008 tw/s
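The counting step of the popularity measure can be sketched as follows. This is only an illustration under stated assumptions: it does not call the Twitter Search API, tweets are assumed to be already fetched as (timestamp, text) pairs, and `is_about_entity` is a hypothetical stand-in for the entity disambiguation filter, whose implementation the slides do not describe.

```python
from datetime import datetime, timedelta


def popularity(tweets, window_start, window_end, is_about_entity=lambda text: True):
    """Return tweets/second for the tweets that fall inside the window.

    tweets: iterable of (timestamp, text) pairs, assumed already fetched.
    is_about_entity: hypothetical disambiguation predicate (assumption).
    """
    seconds = (window_end - window_start).total_seconds()
    if seconds <= 0:
        raise ValueError("the time window must have a positive length")
    kept = [
        ts
        for ts, text in tweets
        if window_start <= ts < window_end and is_about_entity(text)
    ]
    return len(kept) / seconds


if __name__ == "__main__":
    start = datetime(2013, 11, 20, 12, 0, 0)
    end = start + timedelta(minutes=10)
    # 54 made-up tweets about the band, one every 10 seconds.
    tweets = [(start + timedelta(seconds=10 * i), "Mastodon live in Atlanta")
              for i in range(54)]
    # One noisy tweet that the toy filter below discards.
    tweets.append((start + timedelta(seconds=5), "mastodon fossil found"))
    rate = popularity(tweets, start, end,
                      is_about_entity=lambda text: "fossil" not in text)
    print(round(rate, 3))  # 54 kept tweets over 600 seconds
```

The window length is a free parameter here; the slides only say it is short and recent.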
  • 14. Temporal Dynamics. We use Wikipedia page views
      - Entities are already mapped to DBpedia
      - MediaWiki API provides a long history of daily page views of Wikipedia articles
      - We use the mean and standard deviation of the last 30 days of page views to identify whether the popularity of an entity is Stable/Unstable and Trendy/Non-Trendy
      [Figure: page view diagrams for CEV_Champions_League and Typhoon_Haiyan (2013), from stats.grok.se]
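A minimal sketch of the mean and standard deviation check on the last 30 days of page views. The slides do not give the decision rules, so both thresholds below (`cv_threshold` and `trend_sigmas`) and the rules themselves are assumptions made for illustration only; the input is a plain list of daily counts rather than a live API response.

```python
from statistics import mean, pstdev


def temporal_dynamics(daily_views, cv_threshold=0.5, trend_sigmas=2.0):
    """Label an entity from its daily page views (oldest first).

    Assumed rules, not taken from the slides:
    - unstable if std/mean over the window exceeds cv_threshold
    - trendy if the latest day exceeds mean + trend_sigmas * std
    """
    window = daily_views[-30:]
    if not window:
        raise ValueError("at least one day of page views is required")
    mu = mean(window)
    sigma = pstdev(window)
    unstable = mu > 0 and sigma / mu > cv_threshold
    trendy = sigma > 0 and window[-1] > mu + trend_sigmas * sigma
    return {"mean": mu, "std": sigma, "stable": not unstable, "trendy": trendy}


if __name__ == "__main__":
    flat = [100] * 30              # steady interest
    spike = [100] * 29 + [5000]    # sudden burst on the last day
    print(temporal_dynamics(flat))
    print(temporal_dynamics(spike))
```

With these made-up series the flat one comes out stable and non-trendy, while the spike comes out unstable and trendy.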
  • 15. Specificity. We use the Linking Open Data (LOD) cloud
      - Most of the available knowledge bases (e.g. DMOZ, WordNet, OpenCyc) are not up to date.
      - Wikipedia would be large, domain-independent and continuously updated, but:
          - entities are not organised hierarchically in a taxonomy
          - we cannot use taxonomy-based methods (i.e. super-/sub-type relations)
          - plus: expensive algorithms would not be suitable for real-time computation
      - Instead: the LOD link structure!
  • 16. Graph-based measures. State-of-the-art (SOA) graph-based method:
      - indegree and outdegree (here called Incoming/Outgoing Predicates, IP and OP)
      - We can use these methods with RDF triples
      - We introduce “distinct in/out-degree” (IDP and ODP)
      [Figure: node m with incoming edges from s1, s2 and s3 (predicates p1, p1, p2) and outgoing edges to o1 and o2 (predicates p3, p4)]
      Values for “m”: IP (indegree) = 3, OP (outdegree) = 2, IDP (distinct indegree) = 2, ODP (distinct outdegree) = 2
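The four degree measures can be computed directly from a list of RDF-style triples. The sketch below rebuilds the slide's example for node “m”; the exact assignment of predicates to the subjects s1, s2 and s3 is an assumption (the diagram is not recoverable from the transcript), chosen so that it reproduces the values stated on the slide.

```python
def degree_measures(triples, node):
    """Count incoming/outgoing predicates of `node`, total and distinct."""
    incoming = [p for s, p, o in triples if o == node]
    outgoing = [p for s, p, o in triples if s == node]
    return {
        "IP": len(incoming),         # indegree
        "OP": len(outgoing),         # outdegree
        "IDP": len(set(incoming)),   # distinct indegree
        "ODP": len(set(outgoing)),   # distinct outdegree
    }


if __name__ == "__main__":
    # Assumed edge layout, consistent with the values given for "m".
    triples = [
        ("s1", "p1", "m"),
        ("s2", "p1", "m"),
        ("s3", "p2", "m"),
        ("m", "p3", "o1"),
        ("m", "p4", "o2"),
    ]
    print(degree_measures(triples, "m"))
```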
  • 17. Our Specificity Measure. DRR (Distinct Relations Ratio):
      DRR = Incoming Distinct Predicates (IDP) / Outgoing Distinct Predicates (ODP)
      - Compared with: IP/OP, IP+OP, IP, IDP
      - Computed on the Sindice SPARQL endpoint in less than 1 second.
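A sketch of how the two counts behind DRR might be obtained and combined. The query text is an assumption (the slides do not show the queries that were run); it uses the standard SPARQL 1.1 `COUNT(DISTINCT …)` aggregate and is only built as a string here, not sent to any endpoint. Returning `None` when ODP is zero is likewise an illustrative choice, and the example URI is just a sample.

```python
def distinct_predicate_query(entity_uri, direction):
    """Build a SPARQL 1.1 query counting distinct predicates of an entity.

    direction: "in" for incoming (IDP), "out" for outgoing (ODP).
    """
    if direction == "in":
        pattern = f"?s ?p <{entity_uri}>"
    elif direction == "out":
        pattern = f"<{entity_uri}> ?p ?o"
    else:
        raise ValueError("direction must be 'in' or 'out'")
    return f"SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE {{ {pattern} }}"


def drr(idp, odp):
    """Distinct Relations Ratio: IDP / ODP (None if ODP is zero)."""
    if odp == 0:
        return None
    return idp / odp


if __name__ == "__main__":
    uri = "http://dbpedia.org/resource/Heavy_metal_music"  # sample URI
    print(distinct_predicate_query(uri, "in"))
    print(distinct_predicate_query(uri, "out"))
    print(drr(2, 2))  # values of node "m" from the previous slide
```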
  • 18. Alternative SOA Method. DMOZ (Open Directory Project) taxonomy
      - We use the hierarchical structure of DMOZ as an alternative method to measure specificity.
      - We manually map entities to the DMOZ entities and compute the distance from the root of the DMOZ tree.
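The distance from the root can be read off a category path once an entity has been mapped by hand. The sketch assumes slash-separated paths rooted at “Top”; the sample paths are illustrative and have not been checked against the actual DMOZ directory.

```python
def dmoz_level(category_path, root="Top"):
    """Return the depth of a slash-separated category path (root = 0)."""
    parts = [part for part in category_path.strip("/").split("/") if part]
    if not parts or parts[0] != root:
        raise ValueError(f"path must start at the root category {root!r}")
    return len(parts) - 1


if __name__ == "__main__":
    # Illustrative paths only (assumed, not verified DMOZ categories).
    for path in ["Top", "Top/Arts/Music", "Top/Arts/Music/Styles/Rock/Heavy_Metal"]:
        print(path, dmoz_level(path))
```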
  • 19. Generation of a Gold Standard
      - Binary classification of entities: 5 humans classified 160 entities as Generic (38%) or Specific (62%). Substantial agreement (k=0.61).
      - Ranking of entities: 5 humans rated the specificity of 160 entities on a 1 to 10 scale (1=very generic, 10=very specific).
          Average rating: 7.03
          Average std. dev.: 1.45
          Average rating of the top 30 entities by high std. dev.: 5.66
          Average rating of the top 30 entities by low std. dev.: 7.51
      - Abstract entities are harder for humans to rate
  • 20. Evaluation: Classification. We compared the different methods against the gold standard created manually by the users.
      Agreement with the gold standard in the binary classification task:
          DMOZ     DRR      IP/OP    IP+OP    IP       random
          83.9%    84.1%    70.0%    70.0%    72.5%    61.9%
      The performance of the DRR measure for this classification task is comparable to a manual classification done using the DMOZ taxonomy and to human judgement.
  • 21. Evaluation: Ranking. We rank the specificity of 50 randomly chosen entities using:
      - Gold standard (average of the 5 users’ ratings for each entity)
      - DMOZ levels (integers, 0 to 9); we compute “DMOZ-” and “DMOZ+” as the worst and best possible rankings compared to the gold standard ranking
      - DRR, IP/OP, IP+OP and random values (real numbers)
      We compute NDCG (Normalized Discounted Cumulative Gain) at different ranking positions “p”. (The ideal DCG is that of the gold standard ranking.)
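NDCG at position p can be sketched as below. The slides do not say which DCG variant was used, so the common form with a log2(i + 1) discount is assumed here; the gold-standard ratings play the role of relevance scores, listed in the order produced by the method under test. The numbers in the usage example are made up.

```python
import math


def dcg(relevances, p):
    """Discounted cumulative gain over the first p items (log2 discount)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:p]))


def ndcg(ranked_relevances, p):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg(ideal, p)
    if ideal_dcg == 0:
        return 0.0
    return dcg(ranked_relevances, p) / ideal_dcg


if __name__ == "__main__":
    # Made-up gold ratings (1 to 10), listed in the order a method ranked them.
    method_order = [7.5, 9.0, 3.0, 8.0, 2.5]
    for p in (1, 3, 5):
        print(p, round(ndcg(method_order, p), 3))
```

A ranking that matches the gold standard order scores 1.0 at every p; any deviation near the top of the list lowers the score more than a deviation further down.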
  • 22. Evaluation: Ranking. DRR: +5% for NDCG at 10 and 20
  • 23. Evaluation on User Profiles. We evaluate the impact of the proposed measures on user profiles of interests, a real use case
      - 27 volunteers
      - Interests extracted from users’ posts on Facebook and Twitter with NLP tools (as described in our previous work [1])
      - Frequency-based + time-decay weighting strategy
      - Each user rated his/her generated Top 30 list of interests (total of 794 user ratings)
      - Ratings on a “1 to 5” scale according to how relevant/interesting each entity of interest is to the user (5 is highly relevant)
      [1] Orlandi et al., I-Semantics 2012
  • 24. Evaluation on User Profiles. The average score (1 to 5 scale) is computed according to groups of types of entities
      [Chart annotations: (+8%), (17%), (+12%)]
      - Not-popular and generic entities better represent users’ perception of their interests (but we have only 17% of them)
      - This behaviour might be different in other applications and use cases! (e.g. news recommendations, etc.)
  • 25. Conclusions
      - Introduced dimensions for characterisation of concepts of interest: specificity, popularity and temporal dynamics.
      - Proposed methods for their computation satisfying requirements for real-time personalisation of Social Web streams: real-time, domain independent, up to date.
      - Introduced a novel measure (DRR) for specificity of concepts based on the LOD cloud; evaluated for two different tasks (classification and ranking) against SOA methods (humans, DMOZ, graph measures).
      - Evaluated the impact of the measures on user profiles of interests (27 users and ~800 ratings): abstract and non-popular interests are preferred by users.
  • 26. Future work
      - Experiment with the measures on user profiles used for different personalisation tasks. E.g. a tweet recommender system should give priority to trendy, popular and specific entities instead.
      - Improve the simple popularity and trend detection methods.
      - Improve the DRR measure by adding more “semantics”, i.e. considering the different types of edges.
  • 27. Thanks!
      @badmotorf fabrizio.orlandi@deri.org
      @pavankaps pavan@knoesis.org
      @amit_p amit@knoesis.org
      @terraces alex@seevl.net