Delft University of Technology
Link, Like, Follow, Friend:
The Social Element in User Modeling and
Adaptation
UMAP, Rome, June, 2013
Geert-Jan Houben
Web Information Systems, TU Delft
2
Social Web & UMAP
3
Social Web & UMAP
We observe, reflect, speculate, and raise discussion
about evolutions and opportunities
for UMAP to make a difference.
Triggered by
the social element in UMAP and other conferences
& our own experience in the field.
4
Social Web in UMAP:
a number of mentions of ‘social’, and
a small number of ‘social web’ in the papers.
New U (in UMAP), new users
And we see more.
We see how the Social Web mirrors people, mirrors users.
What we learn from the Social Web,
we learn (more) about users
and for user modeling
and adaptation.
5
UMAP in the new Web world
What we learn from the Social Web
allows us to reconsider UMAP in the Web.
It brings new opportunities for us as researchers.
Perhaps it brings new needs.
Surely, these are opportunities that we can position within our
UMAP research agenda
and UMAP application portfolio.
6
SWUMAP: 1 + 1 = 3
Experience shows the value of combining:
Understanding & Creating
UM & AP
Machines & Humans
Arrive at a body of knowledge
for turning insights about
users and usage into added
value in society and economy.
7
UMAP systems are Web systems
Lessons tell us to reconsider our system concept.
On the Web, systems are ‘in vivo’: open and dynamic.
•  Users & data are no longer ‘inside the system’.
•  Users & data change, move (more) quickly.
This impacts understanding and creating of systems.
This also impacts the systems’ architecture.
With the (Social) Web as our laboratory,
this also impacts our research discipline.
8
[Framework diagram: APPLICATION; HUMANS FOR AUGMENTATION; DOMAIN and USERS, each augmented with Web Semantics; REAL DOMAIN and REAL USERS]
9
10
Inspiring domain
11
Domain: Incidents and emergencies
In the literature we see a fair amount of attention
for the domain of incidents and emergencies.
Our own experience from several years
is situated in that domain.
It has given us a good feeling
for what is needed and
how UMAP research can be part of a bigger effort
to solve real-world problems.
12
Domain: Incidents and emergencies
In literature, most attention is directed towards
understanding and detecting.
Sometimes we see further objectives in
responding,
creating situational awareness (especially in massive
incidents), and
prevention.
Most used in these studies is Twitter.
13
2011 Tohoku earthquake 1200 tweets/min
14
2011 Pukkelpop storm 570 tweets/min
15
Twitter
With Twitter, we have a whole new reflection of
what is happening in the world.
A whole new source of digital data
that reflects the (real) world.
We need to understand that reflection
to understand the world
and help the world.
Two challenges:
1.  Understand the world, and
2.  Understand its reflection in the Social Web.
16
CrowdSense BV
http://twitcident.org http://tno.nl/twitcident http://twitcident.com
Twitcident spin-off collaboration
Our real-world lab
17
400+ million tweets per day
•  Netherlands ranks #1 in Twitter penetration
Twitter users publish about “anything”
•  Work/private life
•  Interesting events
•  Etc.
Twitter tells us a lot about the world.
And its users can be seen to act as social sensors and
citizen journalists.
Monitoring Twitter
18
Train accident
Train driver had a stroke
19
Train hits a block
First eyewitness
20
1 min. later
Eyewitness
21
15 min. later
Eyewitnesses
22
1 hour 17 min. later
News media
23
30 min. later
Entertainment: wrong photo
24
A new source of knowledge
An example of the speed
and the nature
of knowledge that Twitter provides
and what it does
to provide knowledge about what really happened.
Also, it shows what we need to know and understand
to use and interpret this effectively.
25
1.  Early warning
•  Twitter users publish early signals that might indicate an increased
risk or potential incident.
2.  Crisis management
•  (Eye-witness) Twitter users disseminate information about incidents
which can support operational emergency services.
3.  Post evaluation
•  Post analyzing incident data (in retrospect) to measure the
effectiveness of emergency services.
Twitcident goals
26
• Emergency services
•  Law enforcement, fire fighters, governments
• Big event organizers
•  Festival security companies
• Utility organizations
•  Public transport, energy supply, other vital infrastructures
Stakeholders
27
Pukkelpop 2011
Storm incident with casualties
28
80,000 tweets in 4 hours
29
570 tweets per min.
30
Could we see this impact coming?
Semantics 25 minutes before incident
1.  Weather: storm, cloud-burst, wind, …
2.  Locations: Brussel, Gent, Hasselt, …
3.  Intensity: heavy, crazy, massive, …
4.  Impact: hailstones, falling trees, …
[Chart annotations: “Impact”, “storm”, “Why is there a peak?”]
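The four semantic categories on this slide can be illustrated with a small lexicon lookup. This is only a sketch: the `LEXICON` terms are taken from the slide's examples, and the function names are hypothetical; the actual Twitcident processing is not described here.

```python
from collections import Counter

# Hypothetical lexicon, seeded with the example terms from the slide.
LEXICON = {
    "weather": {"storm", "cloud-burst", "wind"},
    "location": {"brussel", "gent", "hasselt"},
    "intensity": {"heavy", "crazy", "massive"},
    "impact": {"hailstones", "falling", "trees"},
}

def tag_tweet(text):
    """Return the lexicon categories whose terms occur in the tweet."""
    tokens = {tok.strip(".,!?#").lower() for tok in text.split()}
    return {cat for cat, terms in LEXICON.items() if tokens & terms}

def category_counts(tweets):
    """Count, per category, how many tweets mention it."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tag_tweet(tweet))
    return counts
```

Tracking `category_counts` per time window is one simple way to look for early signals in the minutes before an incident.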
31
Damage reports from incident site
32
Real-time intelligence by photos
33
Example festival disaster
The research into this example
created a lot of knowledge
about what is possible
and what is desired.
It was also a good example to follow
and approach new use cases
to build more general understanding and theory.
34
Dutch national rail infrastructure company
Example public infrastructure
35
Social Weather Map
36
Twitcident processes 100k tweets/day
The social weather map provides ProRail with a timely
and accurate overview of citizen observations.
In addition to other sources of knowledge.
Value
37
Big Events
New Year’s Eve
Serious Request
Elections
Lowlands
Summer Carnaval
Fantasy Island
Queen’s Day
38
Crowd Control Room
39
Social media monitoring was done with 1-3 security officers
Violence, riots, fires, fireworks, crowds, …
40
Not only monitoring
The previous examples are not only about
monitoring Twitter
to know what is happening out there.
41
She was about to turn 16…
42
So she invited some friends …
43
Which invited their friends…
44
She pulled back her invitation, but…
45
46
200,000 invited, 40,000 “going”
47
Atmosphere turned hostile
48
Teenagers vs. police
49
Alcohol & violence
50
Massive damage at local stores
51
Officials cannot ignore social media
52
Mayor of Haren resigns after Haren-debacle
53
Recommendations by Cohen
•  Clear communication strategy
•  Planning & organizing in advance
•  Social media monitoring
•  Clear intervention policy
54
Why social media monitoring?
Content tells
what to expect
55
Finding the needle
56
Recommendation from experience
Let us go and find the needle
that tells us what appears to be happening out there
But let us also think about how to support the action
to make the world out there a better one.
57
Meaningful and actionable
Twitcident has taught us
how information obtained from Twitter needs to be
meaningful and actionable.
58
“Polling meaningful information”
“Sifting thousands of tweets during hurricane Irene”
“Getting situational awareness”
“Finding the eyes on the ground”
“Finding actionable information”
“Providing timely reaction”
“Volunteers are great”
“But we need hybrid approaches to
monitor social media”
Patrick Meier
Today’s challenges
59
Hybrid approach
Twitcident has also shown us how
these problems ask for a hybrid approach
with humans in the loop
that handle and interpret the knowledge
derived from the Social Web.
Big Data is available from the Social Web,
but Small Interpretations are needed, to get it right!
60
Human interpretation inside
The nature of these problems makes
that solutions are not fully automatic.
They involve users of systems
who help with interpretation and decision making.
This is a special kind of user
that we (as UMAP) can consider,
a group that is growing fast
and in urgent need of support.
61
Take home from experience
Learn from concrete cases:
•  Case-based experimental approaches bring specific understanding and
experience necessary for general understanding and theory.
•  Cases can have great value for stakeholders.
It is all about correct and actionable interpretation:
•  Make information meaningful and actionable in the context.
•  Employ hybrid, human-enhanced approaches for the context.
62
[Framework diagram: APPLICATION; HUMANS FOR AUGMENTATION; DOMAIN and USERS, each augmented with Web Semantics; REAL DOMAIN and REAL USERS]
63
64
Technology for sense-making
65
Challenge: Making sense of Twitter
Inspired by different applications and domains,
researchers have given attention
to underlying technology
for making sense of Twitter.
‘Finding the needle’
as the research challenge.
66
Technology for making sense
The sense-making usually relies on
application and domain specific knowledge and
researchers investigate how to do it effectively.
Semantics and interactivity
prove to be important ingredients.
In fact, it turns out that
sense-making, i.e. finding the needle,
is a combination of many things
that need to be coming together.
67
Technology for making sense
68
Semantics for filtering and search
69
Semantics for filtering and search
In [HT2012] we considered
what is needed as first steps in processing tweets,
before we can ‘analyze’ them.
70
1.  (Automatic) Filtering: Given an incident, how can one
automatically identify those tweets that are relevant to
the incident?
2.  Search & Analytics: How can one improve search and
analytical capabilities so that users can explore
information in the streams of tweets?
Challenges
[Diagram: Twitter streams; Filtering (topic); Search & Analytics (information need)]
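The difference between keyword-based and semantic filtering can be sketched as follows. The gazetteer, the entity identifiers, and all function names are assumptions made for illustration; a real system would use a proper entity-extraction service.

```python
# Hypothetical gazetteer: surface form -> entity identifier (illustration only).
GAZETTEER = {
    "storm": "concept:Storm",
    "tempest": "concept:Storm",
    "pukkelpop": "event:Pukkelpop",
}

def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    return [w.strip(".,!?#@").lower() for w in text.split()]

def extract_entities(text):
    """Map the tokens of a text to the entities known to the gazetteer."""
    return {GAZETTEER[t] for t in tokens(text) if t in GAZETTEER}

def keyword_filter(tweets, keywords):
    """Baseline: keep tweets that contain one of the literal keywords."""
    return [t for t in tweets if set(tokens(t)) & keywords]

def semantic_filter(tweets, topic_description):
    """Keep tweets that share at least one entity with the topic description."""
    topic_entities = extract_entities(topic_description)
    return [t for t in tweets if extract_entities(t) & topic_entities]
```

In this sketch a tweet that says "tempest" is missed by the keyword filter for "storm" but kept by the semantic filter, because both words map to the same entity.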
71
Dataset
• Twitter corpus (TREC Microblog Track 2011)
•  16 million tweets (Jan. 24th – Feb. 8th, 2011)
•  4,766,901 tweets classified as English
•  6.2 million entity-extractions
• News (same time period)
•  62 RSS news feeds
•  13,959 news articles
•  357,559 entity-extractions
72
Filtering evaluation
[Bar chart: MAP, P@10, P@30, and Recall for three strategies: Semantic Filtering, Semantic Filtering with News Contextualization, and the baseline Keyword Filtering]
Semantic strategies outperform the keyword-
based filtering regarding all metrics.
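The metrics named in this evaluation (MAP, P@k, Recall) have standard definitions, sketched below. The function names and the toy data in the usage note are illustrative and are not taken from the study.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def recall(retrieved, relevant):
    """Fraction of the relevant items that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)

def mean_average_precision(runs):
    """MAP over a list of (ranked, relevant) pairs, one pair per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

For a ranking `["t1", "t2", "t3", "t4"]` with relevant set `{"t1", "t3", "t9"}`, P@2 is 0.5, recall is 2/3, and average precision is (1/1 + 2/3) / 3.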
73
Filtering evaluation
The semantic strategy is more robust and
achieves higher precision for complex topics.
[Two plots of Precision@30 and Recall (scale 0 to 1): one against the number of entities extracted from the initial topic description (1 to 4), one against the number of words in the initial topic description (1 to 5)]
74
Faceted search evaluation
[Bar chart: Mean Reciprocal Rank (MRR) of target item for frequency-based faceted search, hashtag-based faceted search, and hashtag-based keyword search; legend: with semantic enrichment, without semantic enrichment]
The semantic faceted search strategy improves
the search performance by 34.8% and 22.4%.
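Search performance in this evaluation is reported as Mean Reciprocal Rank (MRR) of the target item. A minimal sketch of that standard metric, with hypothetical function names:

```python
def reciprocal_rank(ranking, target):
    """1 / position of the target in the ranking, or 0.0 if it is absent."""
    for position, item in enumerate(ranking, start=1):
        if item == target:
            return 1.0 / position
    return 0.0

def mean_reciprocal_rank(searches):
    """Average reciprocal rank over (ranking, target) pairs."""
    return sum(reciprocal_rank(r, t) for r, t in searches) / len(searches)
```

A target found at position 2 contributes 0.5, a target at position 1 contributes 1.0, and a missing target contributes 0.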
75
Faceted search evaluation
Strategies with semantic enrichment outperform
those without in predicting appropriate facet-values.
[Bar chart: S@5, S@10, and MRR for the personalized, time-sensitive, frequency, and hashtag-based strategies; legend: with semantic enrichment, without semantic enrichment]
76
Lessons
The context: a (Twitcident-inspired) framework for
filtering, searching, and analyzing information
about incidents that people publish on Twitter.
We have seen how to obtain
• better filtering of Twitter messages for a given incident,
• better search for relevant information about an incident
within the filtered messages.
For these first steps in processing Twitter messages,
the semantic interpretation is the key element
that we need to understand for the given context.
77
Semantics for enrichment and linking
78
Semantics for enrichment and linkage
In [ESWC2011] we focused more on
the semantics for enrichment and linkage
to connect the tweets to background knowledge
and thus enhance what we can learn from them.
79
Example: Semantic enrichment of Twitter posts
Microblog post (user): “francesca is becoming #sport idol of the year!”
News article (connected through linkage): “SI Sportsman of the year: Surprise French Open champ Francesca Schiavone. Thirty in women's tennis is primordially old, …”
Enrichment (of post and article): topic:Sports, topic:Tennis, person:Francesca_Schiavone, oc:SportsGame, event:FrenchOpen
Profile (user modeling):
•  Topics of interest: topic:Tennis, topic:Sports
•  People of interest: person:Francesca_Schiavone
•  Events of interest: event:FrenchOpen
80
Microblog post (user): “francesca is becoming #sport idol of the year!”
News article: “SI Sportsman of the year: Surprise French Open champ Francesca Schiavone. Thirty in women's tennis is primordially old, …”
Linkage between post and article:
How?
The goal of linkage discovery is to identify news resources
that are related to a given Twitter message:
1.  The Web resource has to be related to the given tweet.
2.  The Web resource has to be related to news.
Linkage discovery
81
Content-based:
•  Tweet: “Francesca Schiavone is sportsman of the year #sport #tennis”
•  News article: “SI Sportsman of the year: Surprise French Open champ Francesca Schiavone. Thirty in women's tennis is primordially old…”
Hashtag-based:
•  Tweet: “Francesca Schiavone is sportsman of the year #sport #tennis”
•  News article: “Petkovic & Goerges leading German tennis revival: there are signs that German tennis is…”
Linkage discovery strategies
82
URL-based:
•  Tweet: “nice! http://bit.ly/eiU33c”
•  News article (via the news article URL): “SI Sportsman of the year: Surprise French Open champ Francesca Schiavone. Thirty in women's tennis is primordially old…”
Entity-based:
•  Tweet: “Francesca Schiavone is sportsman of the year #sport #tennis”
•  News article: “Olympic champion and world number nine Elena Dementieva announced her retirement. The 29-year-old Russian delivered the shock news after losing to Francesca Schiavone in the group stages of the season-ending tournamen …”
•  Temporal constraint: compare the publish dates of tweet and article, to avoid linking to old news.
•  URL-based (Strict): only consider content of the Twitter message
•  URL-based (Lenient): also consider reply or re-tweet messages
Linkage discovery strategies
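Two of the linkage discovery strategies can be sketched in a few lines. Everything here is an assumption made for illustration: the data layout, the function names, and in particular the two-day `window`, which stands in for whatever temporal constraint the original work used. Shortened URLs would also need to be resolved first, which this sketch skips.

```python
import re
from datetime import datetime, timedelta

URL_PATTERN = re.compile(r"https?://\S+")

def url_based_links(tweet_text, articles_by_url):
    """URL-based (strict): follow only URLs found in the tweet text itself."""
    urls = URL_PATTERN.findall(tweet_text)
    return [articles_by_url[u] for u in urls if u in articles_by_url]

def entity_based_links(tweet_entities, tweet_time, articles,
                       window=timedelta(days=2)):
    """Entity-based: require a shared entity and nearby publish dates."""
    links = []
    for article in articles:
        shares_entity = bool(tweet_entities & article["entities"])
        close_in_time = abs(tweet_time - article["published"]) <= window
        if shares_entity and close_in_time:
            links.append(article["title"])
    return links
```

Without the `close_in_time` check, a tweet would also be linked to older articles that merely mention the same person, which is the "old news" problem shown on the slide.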
83
Evaluation on linkage discovery
Precision of the linkage discovery strategies:
•  Content-based (bag-of-words): 0.1013
•  Hashtag-based: 0.4023
•  Entity-based (without temporal constraints): 0.4257
•  Entity-based: 0.7133
•  URL-based (lenient): 0.786
•  URL-based (strict): 0.8059
URL-based strategies offer good linkage.
84
Analysis on linkage discovery and
semantic enrichment
•  URL-based strategies: more than 10 tweet-news relations for ca. more than 1000
•  Entity-based strategy: found
a far higher number of
tweet-news relations
•  Hashtag-based strategy failed
for more than 79% of the users
because of the limited usage of
hashtags
•  Combination of all strategies:
more than 10 tweet-news
relations found for more than 20%
of the users
[Chart legend: Entity-based, URL-based, Hashtag-based, Combination]
Combined strategies perform better.
85
Lessons
There is good background knowledge out there,
if we are able to understand how it connects
to the domain and context we are considering.
Many applications can share
the same enrichment and linking,
but not all.
With common descriptions of the problem,
we can share enrichment and linking (more) effectively.
86
Social Web for profiles
87
Challenge: Social web for profiles
An ambition often seen in conferences like this one is
to exploit the semantically enriched Social Web knowledge
for the purpose of creating or enhancing user profiles.
These profiles can then be used for
adaptation and personalization.
88
Components for profiling
For applications such as
personalized news recommendation,
like in our [UMAP2011] work,
components for profiling
can be carefully selected and assembled.
It can also help the
development of the deeper understanding
and theory about how to
link the data to background knowledge
and thus make sense of the data.
89
Library
GeniUS [JIST2011] is a topic and user modeling software
library that
• produces semantically meaningful profiles, to enhance
the interoperability of profiles between applications;
• provides functionality for aggregating relevant
information about a user from the Social Web;
• generates domain-specific user profiles according to the
information needs of different applications;
• is flexible and extensible to serve different applications.
90
GeniUS: Generic Topic and User Modeling Library
for the Social Semantic Web
[Architecture diagram. Components: Item Fetcher, Enrichment, Weighting Function, Filter, Modeling Configuration, RDF Repository, RDF Serialization. Data: user data and items from the Social Web, enriched items, semantic data from the Semantic Web, and the resulting user profiles (interested in: location, product)]
91
User Modeling Building Blocks
1. Temporal Constraints: (a) time period, (b) temporal patterns
2. Profile Type: (a) hashtag-based, (b) entity-based, (c) topic-based
3. Semantic Enrichment: (a) tweet-based, (b) further enrichment
4. Weighting Scheme: (a) concept frequency
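The building blocks can be combined into a single profile-construction function, sketched here under assumptions: posts are dictionaries that already carry enriched concepts per profile type, and the parameter names (`start`, `end`, `weekend_only`) are hypothetical stand-ins for the temporal constraints. The weighting scheme is plain concept frequency, normalized to sum to 1.

```python
from collections import Counter
from datetime import datetime

def build_profile(posts, profile_type, start=None, end=None, weekend_only=None):
    """Build a weighted profile from enriched posts.

    profile_type: "hashtag", "entity", or "topic" (which concepts to count).
    start/end: optional time period; weekend_only: optional temporal pattern
    (True = weekend posts only, False = weekday posts only).
    """
    counts = Counter()
    for post in posts:
        ts = post["time"]
        if start is not None and ts < start:
            continue
        if end is not None and ts > end:
            continue
        if weekend_only is not None and (ts.weekday() >= 5) != weekend_only:
            continue
        counts.update(post["concepts"].get(profile_type, []))
    total = sum(counts.values())
    # Weighting scheme: normalized concept frequency.
    return {concept: n / total for concept, n in counts.items()} if total else {}
```

Calling it with `weekend_only=True` and then `weekend_only=False` gives the weekend and weekday profiles that can be compared against each other.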
92
User modeling with rich semantics:
interested in: people, topics, events, … (via linkage)
Example concepts: #sport, person:Francesca_Schiavone, topic:Sports, event:FrenchOpen, topic:Tennis
Profile types
• hashtag-based
• topic-based
• entity-based
Enrichment
• tweet-only
• exploitation of external news resources
Temporal patterns (time: weekday, weekend)
• specific time period
• temporal pattern
• no constraints
User profile construction
93
RDF Gears UI
94
RDF Gears Plugin Architecture
95
[Two plots over user profiles (axis 1 to 1000), each comparing news-based and tweet-based profiles: entities per user profile for entity-based profiles, and distinct topics per user profile for topic-based profiles. The news-based curves are the profiles enriched with external news resources.]
By exploiting the linkage between tweets and news articles, we get
more distinct entities / topics (semantics)!
Richer semantics through linking strategies.
Analysis of profile characteristics
96
Lessons
For profiles, we observed:
• Semantic enrichment allows for richer user profiles.
• Profiles change over time (hashtag-based more): fresh
profiles seem to better reflect current user demands.
• Temporal patterns: weekend profiles differ significantly
from weekday profiles (more than day/night).
For personalized news recommendation, we learned:
• Best user modeling strategy:
Entity-based > topic-based > hashtag-based.
• Semantic enrichment improves recommendation quality.
• Adapting to temporal context helps for topic-based
strategy.
97
Social Web for augmentation
98
Augment with what is there
Systems can use technology to augment their knowledge
with data from the Social Web.
Lessons learned show that
for adaptive systems on the Social Web
there is a lot of knowledge (easily) available,
from other systems and other domains.
Understanding how to leverage it, even to a basic level,
can bring a lot.
99
Cross-system augmentation
100
Cross-system profiles
An example to show the added value of
‘cross-system’ on the Social Web
is the work in [UMUAI 2013]
where interweaving of public profiles is studied.
101
User data on the Social Web
Cross-system user modeling on
the Social Web
102
Interweaving public user data with Mypes
Input: Google Profile URI (http://google.com/profile/XY)
1.  Account Mapping: get other accounts of the user (SocialGraph API)
2.  Social Web Aggregator: aggregate public profile data (social networking profiles, blog posts, bookmarks, other media)
3.  Profile Alignment: map profiles to the target user model (FOAF, vCard)
4.  Semantic Enhancement: enrich data with semantics (WordNet®)
5.  Analysis and user modeling: generate user profiles
Output: aggregated, enriched profile (e.g., in RDF or vCard)
103
1.  Characteristics of distributed tag-based profiles:
•  Overlap of tag-based profiles, which an individual user creates at
different services, is low
•  Aggregated profiles reveal significantly more information
(regarding entropy) than service-specific profiles
2.  Performance of cross-system user modeling for cold-
start recommendations:
•  Cross-system UM leads to tremendous (and significant)
improvements of the tag and bookmark recommendation quality
•  To optimize the performance one has to adapt the cross-system
strategies to the concrete application setting
http://persweb.org
Lessons
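The two profile characteristics in the first lesson, overlap and entropy, can be computed as sketched below. The Jaccard coefficient is used here as one plausible overlap measure and is an assumption, as are the function names; the original study may define overlap differently.

```python
import math
from collections import Counter

def entropy(tag_counts):
    """Shannon entropy (in bits) of a tag-frequency profile."""
    total = sum(tag_counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in tag_counts.values() if n)

def overlap(profile_a, profile_b):
    """Jaccard overlap between the tag sets of two profiles."""
    tags_a, tags_b = set(profile_a), set(profile_b)
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0

def aggregate(*profiles):
    """Merge service-specific profiles by summing tag frequencies."""
    merged = Counter()
    for profile in profiles:
        merged.update(profile)
    return merged
```

When two service-specific profiles share few tags, the aggregated profile has a higher entropy than either of them, which is the sense in which aggregation "reveals more information".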
104
Location estimation
Another nice example
follows from our work in the ImREAL project
on augmentation (of adaptation) with the Social Web.
105
Improved location estimation by
mixing Social Web streams
[Figure: image meta-data + external data sources]
Enriching the image’s textual meta-data with the user’s
tweets improves the accuracy of the location estimation.
106
Accuracy of social web metadata
This work has also raised attention
for the accuracy of Social Web metadata.
There are many reasons
why this data cannot be taken as the universal truth.
In application and domain specific contexts,
we need to understand the accuracy of social metadata.
Also, the work of [Rout et al. 2013] on location estimation
based on social ties, shows the feasibility
as well as the context-dependency.
107
Linked Open Data for augmentation
108
LOD and cross-system
With these results in hand,
in our [ICWE2012] work,
we considered cross-system modeling
with Linked Open Data.
With the aim to understand how
Linked Open Data background knowledge
can be leveraged for cross-system and cross-domain
augmentation.
109
Recommending Points of Interest
Tweet: “Looking forward to visit Paris next week!”
Linked concepts: dbpedia:Paris, dbpedia:Louvre
Related via Johannes Vermeer: The lacemaker, The astronomer
110
LOD-based User Modeling
•  User data (Social Web): concepts that can be extracted from the user data
•  Background knowledge (graph structures) from Linked Data, connecting concepts (c1, c2, c3, c4, c5, c6, c9, cx, cy)
•  Weighting strategies assign weights to the concepts
•  User Profile (concept: weight): c1: 0.4, c2: 0.1, c3: 0.2, …
•  Application that demands a user interest profile regarding …-concepts
111
Strategies for exploiting the RDF-based
background knowledge graph
User data: tags: girl with pearl earring; geo: The Hague
Concepts: dbpedia:Girl_with_pearl_earring, dbpedia:The_Hague, dbpedia:Louvre, Johannes Vermeer
Properties: rdf:type, foaf:maker, dbpprop:location, locatedIn
A: Artifact
B: The lacemaker
C: The astronomer
…
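One simple way to exploit such a background graph is to spread weight from the concepts found in the user data to neighbouring concepts, with a decay per hop. This is a sketch of that general idea, not the strategies evaluated in the original work; the mini graph, its identifiers, and the `decay` value are illustrative assumptions.

```python
# Hypothetical mini graph (adjacency lists); not an actual DBpedia extract.
GRAPH = {
    "dbpedia:Girl_with_pearl_earring": ["dbpedia:Johannes_Vermeer"],
    "dbpedia:Johannes_Vermeer": ["dbpedia:Girl_with_pearl_earring",
                                 "dbpedia:The_Lacemaker",
                                 "dbpedia:The_Astronomer"],
    "dbpedia:The_Lacemaker": ["dbpedia:Johannes_Vermeer", "dbpedia:Louvre"],
    "dbpedia:The_Astronomer": ["dbpedia:Johannes_Vermeer", "dbpedia:Louvre"],
}

def expand_profile(seed_weights, graph, hops=2, decay=0.5):
    """Add concepts reachable within `hops` steps, weighted by decay ** distance."""
    profile = dict(seed_weights)
    frontier = dict(seed_weights)
    for _ in range(hops):
        reached = {}
        for node, weight in frontier.items():
            for neighbour in graph.get(node, []):
                reached[neighbour] = max(reached.get(neighbour, 0.0), weight * decay)
        # Keep only concepts that are new to the profile.
        frontier = {n: w for n, w in reached.items() if n not in profile}
        profile.update(frontier)
    return profile
```

Starting from a photo tagged with the Vermeer painting, three hops are enough in this toy graph to reach the Louvre as a candidate point of interest, with a lower weight than the directly observed concept.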
112
Lessons
With LOD-based user modeling on the Social Web,
different strategies for exploiting RDF-based
background knowledge are possible.
Findings:
• Combination of different user data sources (Flickr &
Twitter) is beneficial for the user modeling performance.
• User modeling quality increases the more background
knowledge one considers.
• Combination of strategies achieves the best performance.
To investigate further: dependency of strategies on
entities and relationships, and temporal effects (e.g.
temporal relationships or upcoming trends).
113
Interlinked online society
If you take a semantic technology perspective,
then strong interlinking could be the direction to go.
[Passant et al. 2009] study applying semantic
technologies to social media, creating a Web where data is
socially created and maintained through end-user
interactions, but is also machine-readable and therefore
open towards sophisticated queries and large-scale
information integration.
“Social Semantic Information Spaces”, where any social
data is a component in a worldwide collective intelligence
ecosystem.
114
Origin of semantics
These social semantic spaces can trigger us in UMAP
to articulate where we see the role and origin of
semantics.
Making all social data available ‘with semantics’
or
observing that a lot of semantics
is (only) effective in a specific domain or application?
Experience showing the fine-grained nature of effects
suggests the latter.
115
Human-enhanced
116
Humans & adaptive faceted search
An important element in the process of sense-making
is its hybrid nature:
humans involved in the sense-making.
The control rooms have shown us that the
human aspect in search is crucial,
for judgment and interpretation.
In our [ISWC2011] work,
we looked at adaptive faceted search.
117
Adaptive faceted search framework
Components: Twitter posts; Semantic Enrichment (facet extraction); User and Context Modeling (user); Adaptive Faceted Search
Questions:
•  How to represent the content of a tweet?
•  How to adapt the facet-value pair ranking to the current demands of the user?
118
Facet extraction and semantic enrichment
Tweet: “@bob: Julian Assange got arrested”
•  Tweet-based enrichment: Julian Assange
•  Link-based enrichment, using the linked article “Julian Assange arrested: Julian Assange, the founder of WikiLeaks, is under arrest in London…”: Julian Assange, London, WikiLeaks
119
Impact of Link-based enrichment
Representation of
tweets:
significantly more
facets per tweet
with link-based
enrichment
120
Faceted search strategies
Goal: most relevant facet-value pair should appear at the top
of the ranking
Faceted Search Strategies:
1.  Occurrence frequency: count occurrence frequencies of FVP
2.  Personalization: adapt ranking to user profile (e.g. user tweeting history)
3.  Diversification: increase variety among the top-ranked FVPs
4.  Time-sensitivity: adapt FVP ranking to temporal context
Semantic enrichment: (i) tweet-based and (ii) link-based enrichment
Locations
1.  Aachen
2.  Aalborg
3.  Aalesund
4.  Aarhus
…
2145. Eindhoven
Locations
1.  Eindhoven
2.  Delft
3.  Amsterdam
4.  Rotterdam
5.  London
…
Link-based enrichment and occurrence-based and
personalized rankings have a large effect.
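The first two strategies, occurrence frequency and personalization, can be sketched as ranking functions over facet-value pairs (FVPs). The representation of a tweet as a list of `(facet, value)` tuples, the linear mix controlled by `alpha`, and the function names are assumptions made for this illustration.

```python
from collections import Counter

def fvp_counts(tweets):
    """Count in how many tweets each facet-value pair occurs."""
    return Counter(fvp for tweet in tweets for fvp in set(tweet))

def frequency_ranking(tweets):
    """Occurrence frequency: most frequent FVPs first (ties broken by name)."""
    counts = fvp_counts(tweets)
    return sorted(counts, key=lambda fvp: (-counts[fvp], fvp))

def personalized_ranking(tweets, user_profile, alpha=0.5):
    """Mix normalized frequency with the user's interest weight per FVP."""
    counts = fvp_counts(tweets)
    if not counts:
        return []
    top = max(counts.values())

    def score(fvp):
        return alpha * counts[fvp] / top + (1 - alpha) * user_profile.get(fvp, 0.0)

    return sorted(counts, key=lambda fvp: (-score(fvp), fvp))
```

With a profile that favours `("location", "Eindhoven")`, that pair moves to the top of the Locations facet even when another location occurs more often, which mirrors the two example rankings on the slide.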
121
Twitcident.com
Twitter-based crisis
management system
Semantic
enrichment
allows for:
1.  Grouping tweets
into incidents
2.  Faceted search
3.  Thematic Views
4.  Analysis
122
Lessons
Semantic enrichment allows for structured
representation of the content of tweets:
a good basis for faceted search.
Faceted search performs significantly better than
hashtag-based keyword search.
Different building blocks for making faceted search on
Twitter adaptive improve the search quality:
•  Link-based enrichment: more discoverable tweets, better search
performance.
•  Personalization leads to significant improvements.
•  Time-sensitivity improves performance as well.
123
Redundancy reduction
124
Duplicate detection
Important for reducing the volume of social data,
is to categorize the social chatter
and reduce redundancy in information.
In our [WWW2013] work we have considered
duplicate detection.
125
Twitter is more like a news medium.
How do people search on Twitter?
[Teevan et al. 2011] have shown how this is characterized by
repeated queries & monitoring for new content.
Problems:
•  Short tweets → lots of similar information.
•  Few people produce content → many retweets, copied content.
Search and retrieval on Twitter
126
Near-duplicates in Twitter search
Analysis of the Tweets2011 corpus (TREC microblog track) [WWW2013]
[Pie chart of duplicate levels: exact copy, nearly exact copy, strong near-duplicate, weak near-duplicate, low overlapping. Segment shares shown: 1.89%, 9.51%, 21.09%, 48.71%, 18.80%]
•  For the 49 topics (queries), 2,825
topic-tweet pairs are relevant.
•  We manually labeled 55,362
tweet pairs.
•  We found 2,745 pairs of
duplicates at different levels.
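As a rough illustration of grading tweet pairs into duplicate levels, the sketch below uses word-set Jaccard similarity with fixed thresholds. This is a crude proxy and an assumption throughout: the thresholds (0.8, 0.5, 0.2) are invented for the example, and the levels in the original study are defined on richer syntactic, semantic, and contextual features rather than on word overlap alone.

```python
def jaccard(text_a, text_b):
    """Jaccard similarity of the lowercase word sets of two texts."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def duplicate_level(tweet_a, tweet_b):
    """Assign a pair to one of five levels using illustrative thresholds."""
    if tweet_a == tweet_b:
        return "exact copy"
    similarity = jaccard(tweet_a, tweet_b)
    if similarity >= 0.8:
        return "nearly exact copy"
    if similarity >= 0.5:
        return "strong near-duplicate"
    if similarity >= 0.2:
        return "weak near-duplicate"
    return "low overlapping"
```

A retweet that only prepends "RT" to the original text lands in the "nearly exact copy" level under these thresholds.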
127
Twinder Framework
Search infrastructure
[Architecture diagram of the Twinder Search Engine. Components: Social Web Streams (messages); Feature Extraction (Syntactical Features, Semantic Features, Contextual Features, Further Enrichment); Feature Extraction Task Broker, sending feature extraction tasks to a Cloud Computing Infrastructure; Index; Relevance Estimation (Keyword-based Relevance, Semantic-based Relevance); Duplicate Detection and Diversification; Search User Interface (users: query, results, feedback)]
128
Lessons
Analyzing duplicate content in Twitter, we inferred a model
for categorizing different levels of duplication.
We developed a near-duplicate detection framework
for microposts and for categorizing the duplication level of tweet pairs.
Given the duplicate detection framework, we perform
extensive evaluations and analyses of different duplicate
detection strategies.
Our approach enables search result diversification,
also good to avoid ‘bubble effects’, and analyzes the
impact of the diversification on the search quality.
Follow Twinder progress: http://wis.ewi.tudelft.nl/twinder/
129
Take home from technology research
With semantics and humans, Social Web can help:
•  Semantics beneficial for filtering & search and enrichment & linking.
•  Semantic-enriched tweets beneficial for profiles and adaptation.
•  Social Web & Linked Data beneficial for cross-system augmentation.
•  Adaptive faceted search and duplicate detection beneficial for human-
enhanced processing.
For adaptive systems that rely on profiling,
Social Web is a fertile source for more knowledge.
ImREAL research & experiences elegantly show principles,
as well as the detailed work in domain & application:
•  Social Web & LOD usage is context-specific.
•  Big Data in need of Small Interpretations.
130
[Framework diagram: APPLICATION; HUMANS FOR AUGMENTATION; DOMAIN and USERS, each augmented with Web Semantics; REAL DOMAIN and REAL USERS]
131
Take home from technology research
The human intelligence is to be arranged differently:
•  We have moved from a priori understanding the system, to on the fly
understanding the system.
•  We have moved from careful manual analysis before, to machines doing the
analysis on the fly.
•  The critical and context-specific approach to (small) data, about domain
and users, is a part of process and system we now need to (re-)include.
•  This task of the designer has now shifted to a task for the human interpretation
inside the hybrid system: human monitoring inside.
132
133
Challenges with sense-making
134
Not one truth
135
In reality, not one truth
In the beginning, social systems like Twitter were used
as ‘the’ semantic source of knowledge with an implicit
assumption that Twitter is one voice.
Over time, researchers have begun to investigate
how to identify and interpret different voices and
viewpoints in such a source.
Differences in viewpoints and opinions
are a subject of study, but so far their leverage is limited.
136
Diversity and beliefs
[Flock et al. 2011] study the different backgrounds,
mindsets and biases of Wikipedia contributors,
to understand the effects, positive and negative,
of this diversity on the quality of the Wikipedia content,
and on the sustainability of the overall project.
• Analysis and approach for diversity-minded content
management within Wikipedia.
[Bhattacharya et al. 2012] estimate beliefs from posts
made on social media, to monitor the level of belief,
disbelief and doubt related to specific propositions.
137
Include the negative
Diversity of viewpoints and opinions also suggests
including negative links in the approach.
[Symeonidis et al. 2010] give an example of how to
include negative links into friend recommendation
approaches, but this goes much further.
The accuracy improvement they observe
can be taken as a general principle:
accuracy can be gained
by using information about both positive and negative edges.
138
ViewS
Modelling Viewpoints in User Generated Content
Text
processing
Viewpoint
extraction
(attention focus)
Ontology
(activity aspects
to analyse)
Semantic
enrichment
Viewpoint
exploration
139
Viewpoints in YouTube
Example viewpoints in user comments on job interview videos
Comparing the viewpoints around ‘anger’ of young users (left)
and old users (right)
140
Not the truth
141
Truth is not always truth
Just as this source of knowledge is not a single one,
it is also clear that it might not consist of
‘true’ knowledge alone.
142
Malicious profiles
For example, profiles can be suspicious and made for the
wrong reasons.
In a context of online dating, [Pizzato et al. 2012] have
observed the need to gain understanding of the sensitivity
of recommender algorithms to scammers.
With people being the items to recommend,
fraudulent profiles can have a serious impact on
recommender algorithms.
Identifying and detecting fraudulent profiles is a new
challenge for us.
143
Identity theft
Another aspect of ‘wrong profiles’ relates to
identity disambiguation and theft.
[Rowe et al. 2010] consider malevolent web practices such
as identity theft and lateral surveillance.
They study techniques for web users
to identify all web resources that cite them and,
if necessary, remove the sensitive information.
144
Credibility of social content
The credibility of messages in social networks is studied,
for example, in [Seth et al. 2010] on stories from Digg.
Their model is based on theories developed in sociology,
political science and information science.
[Cramer et al. 2008] have drawn attention to
trust.
The study of social content credibility and trust is
important, and calls for cross-disciplinary effort.
145
Privacy
A lot can be said about privacy in these networks, for
example Facebook.
[Bachrach et al. 2012] show how users’ activity on
Facebook (related to privacy) relates to their personality,
as measured by the standard Five Factor Model.
A nice example of understanding how Facebook features
relate to interesting aspects of users and usage.
146
Cultural variations
147
Cultural diversity
Studying diversity is not just relevant for understanding how
Twitter content is to be interpreted.
It is also relevant for understanding how the Social Web
is used and can be used with a purpose.
Here, cultural diversity is one of the most interesting aspects
and perhaps also one of the most challenging ones.
148
Cultural diversity
A subject addressed in ImREAL.
Components are made available as services in ImREAL
for augmented user modeling,
e.g. for simulation designers.
149
150
Hofstede’s cultural dimensions
Describes stereotypical cultural characteristics of
nationalities, with scores relative to other nationalities
Five core dimensions:
•  Individualism versus Collectivism (IDV)
•  Power Distance (PDI)
•  Masculinity versus Femininity (MAS)
•  Uncertainty Avoidance (UAI)
•  Long-Term Orientation (LTO)
geert-hofstede.com
151
Analysis
• Datasets
•  Microblog data collected over a period of three months
•  22 million microposts from Sina Weibo and 24 million from Twitter
•  a sample of 2616 Sina Weibo users and 1200 Twitter users
• Analyze and compare user behavior
•  on two levels: (i) the entire user population and (ii) individual users
•  from different angles: (i) syntactic, (ii) semantic, (iii) sentiment and
(iv) temporal analysis
152
Syntactic analysis
Hashtags and URLs are less frequently applied on Sina Weibo than on Twitter.
Users on Twitter are more triggered by hashtags and URLs when propagating
information than users on Sina Weibo.
[Figure: avg. number of hashtags/URLs per post across users (0% to 100%); series: Hashtag-Weibo, URL-Weibo, Hashtag-Twitter, URL-Twitter]
High collectivism in Weibo, high individualism in Twitter.
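The measure behind this syntactic analysis, the average number of hashtags and URLs per post for a user, can be sketched as follows. The regular expressions are simplifying assumptions that follow Twitter-style conventions; hashtag conventions on Sina Weibo may differ, so the pattern would likely need adapting there. The function name and sample posts are hypothetical.

```python
import re

# Assumed patterns: Twitter-style hashtags and plain http(s) URLs.
HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")


def avg_markers_per_post(posts):
    """Return (avg hashtags per post, avg URLs per post) for one user's posts."""
    if not posts:
        return 0.0, 0.0
    n_tags = sum(len(HASHTAG.findall(p)) for p in posts)
    n_urls = sum(len(URL.findall(p)) for p in posts)
    return n_tags / len(posts), n_urls / len(posts)


# Hypothetical posts of a single user.
posts = [
    "Reading about #umap2013 http://example.org/a",
    "No markers here",
    "#rome #umap2013",
    "see http://example.org/b",
]
avg_tags, avg_urls = avg_markers_per_post(posts)
print(avg_tags, avg_urls)
```

Computing this pair per user and sorting users by the value gives the kind of per-user distribution that the figure plots for each platform.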
153
Semantic analysis
The topics that users discuss on Sina Weibo are to a large
extent related to locations and persons. In contrast to Sina
Weibo, users on Twitter are talking more about
organizations (such as companies, political parties).
[Figure: avg. number of entities per post across users (0% to 100%); series: Weibo, Twitter]
Low employee commitment to an organization in China - high long-term orientation.
154
Sentiment analysis
Sina Weibo users have a stronger tendency to publish
positive messages than Twitter users.
[Figure: ratio of positive posts (0% to 100%) across users (0% to 100%); series: Weibo, Twitter; annotations: more negative posts, more positive posts]
High long-term orientation.
155
Combined semantic sentiment analysis
The difference is amplified when discussing ‘people’ or
‘location’, with Sina Weibo users even more positive and
Twitter users more negative.
More long-term orientation in Weibo, more short-term orientation in Twitter.
156
Temporal analysis
Twitter users repost messages faster than Sina Weibo users.
$\text{time distance} = t_{\text{repost}} - t_{\text{original post}}$
[Figure: time distance (in hours) across users (0% to 100%); series: Weibo, Twitter]
Large degree of power distance in Weibo, small one in Twitter.
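The time distance used in this temporal analysis is the difference between the repost timestamp and the timestamp of the original post. A minimal sketch, assuming both timestamps are available as Python datetime objects; the function name and the example timestamps are hypothetical.

```python
from datetime import datetime


def time_distance_hours(t_original, t_repost):
    """Hours elapsed between an original post and its repost."""
    return (t_repost - t_original).total_seconds() / 3600.0


# Hypothetical timestamps for one original post and one repost of it.
original = datetime(2013, 6, 10, 9, 0)
repost = datetime(2013, 6, 10, 11, 30)
hours = time_distance_hours(original, repost)
print(hours)
```

Averaging this value over all reposts of a user would give one point per user, matching the per-user view (in hours) that the figure shows.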
157
Cultural differences in tagging
Other work confirms the findings,
and their consistency with theories of cultural differences
between Asian and Western cultures.
[Dong et al. 2011] look at cultural differences in a
tagging system and find that American and Chinese
subjects differed in many ways:
• the number and types of tags they applied;
• the extent to which they applied suggested tags or
entered new tags of their own; and
• how often they applied tags that originated from a
different culture.
158
Cultural variations for Social Q&A
Another example is given by [Yang et al. 2011], who look
at cultural differences in people’s social question asking
behaviors across the United States, the United Kingdom,
China, and India.
They analyzed the questions people ask via social
networking tools, and their motivations for asking and
answering questions online.
Results reveal culture as a consistently significant factor
in predicting people’s social question and answer behavior.
159
Real-time variations
160
Understand the source
When using the knowledge from Twitter
as a semantic source,
especially if it is the only semantic source,
there are a few things one needs to consider
that relate to the real-time nature of social contributions.
The ‘knowledge’ is not unambiguous:
inconsistency, moods, etc.
Real-time knowledge spreads and evolves fast.
161
Inconsistency & moods
Twitter is used as a semantic sensor, sometimes as the only
semantic sensor, but consistency in user contributions
like ratings is a concern.
[Said et al. 2012] show how users are inconsistent in
their ratings and tend to be more consistent for
above-average ratings.
[De Choudhury et al. 2012] report on the relation between
moods and social activity, social relations and
participatory patterns like link sharing and conversational
engagement.
162
Understanding over time
While Twitter and the like were used in the beginning
as ‘fixed’ sources of knowledge,
researchers have become interested in
the evolution over time.
The nature and speed of the flow of content over time
have become great objects of study.
Two domains that have received fair attention in this light
are diseases and (political) news.
163
Flow in disease information
The domain of diseases and outbreaks is getting fair
attention.
Works by [Gomide et al. 2011] on Dengue and [Diaz-Aviles
et al. 2012] on EHEC show how people’s behavior on
Twitter can be used for surveillance and tasks such as
early warning and outbreak investigation.
164
Flow of news
From [Naveed et al. 2011] we learn how retweets reflect
what the Twitter community considers interesting on a
global scale.
In [Backstrom et al. 2011] we see the differences between
communication and observation in Facebook:
communication involves a much higher focus of attention
than observation activities.
We see in [Lerman et al. 2010] how network structure
affects the dynamics of how interest in news stories spreads
among social networks in Digg and Twitter.
165
Flow in political news
Coming back to our observation of the multiple truths,
political news is a great domain to look at.
In the context of political speech, [Metaxas et al. 2010]
discuss how the real-time nature of Twitter provides
disproportionate exposure to personal opinions,
fabricated content, unverified events, lies and
misrepresentations, with viral spread as a consequence.
To act upon that, [Lumezanu et al. 2012] identify extreme
tweeting patterns that could characterize users who spread
propaganda (political propagandists), e.g. sending high
volumes of near-duplicate messages.
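One of the patterns mentioned, a high volume of near-duplicate messages from the same user, can be illustrated with a small sketch. This is not the detection method of [Lumezanu et al. 2012]; it swaps in a generic technique, Jaccard similarity over word shingles, and the shingle size, the threshold and all names are assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Set of k-word shingles from lowercased, whitespace-tokenised text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def near_duplicate_ratio(messages, threshold=0.5):
    """Fraction of messages that are near-duplicates of an earlier message."""
    seen, dupes = [], 0
    for msg in messages:
        sh = shingles(msg)
        if any(jaccard(sh, prev) >= threshold for prev in seen):
            dupes += 1
        seen.append(sh)
    return dupes / len(messages) if messages else 0.0


# Hypothetical messages from one account.
messages = [
    "Vote for candidate X on Tuesday now",
    "Vote for candidate X on Tuesday today",
    "Lovely weather in Rome this week",
]
ratio = near_duplicate_ratio(messages)
print(ratio)
```

The pairwise comparison is quadratic in the number of messages, which is fine for a single user's timeline; a high ratio combined with a high posting volume would be one possible signal to inspect further, not a verdict.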
166
Temporal effects
In our [WebSci2011] work, we have considered how
user interests manifest themselves over time.
Most users who are interested in a news topic
become interested within a few days.
Lifespan of users’ interest:
• Long-term adopters: continuously interested
• Short-term adopters: interested only for a short period of
time (and influenced by “global trends”)
High overlap between early adopters and long-term
adopters.
167
Temporal effects
On Twitter the importance of entities for a topic varies
over time (long-term vs. short-term entities).
In terms of user interests over time, the majority of users
become interested in a topic quickly (within a few days).
When using Twitter-based profiles for personalization,
time-sensitive user modeling improves recommendation
quality.
Also, the selection of a user modeling strategy should take
the type of user into account:
• Long-term adopters: hashtag-based
• Short-term adopters: entity-based
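To make the idea of time-sensitive user modeling concrete, here is a minimal sketch in which recent mentions of a concept (a hashtag or an entity) count more than old ones. The exponential decay with a half-life is an assumption made for illustration, not the weighting function from the [WebSci2011] work; the function name, parameters and sample data are hypothetical.

```python
from collections import defaultdict


def time_sensitive_profile(observations, now, half_life_days=7.0):
    """Build a normalised concept profile where recent mentions weigh more.

    observations: iterable of (concept, day) pairs, with `day` a numeric
    timestamp in days. Each mention contributes 0.5 ** (age / half_life_days).
    """
    weights = defaultdict(float)
    for concept, day in observations:
        age = max(0.0, now - day)
        weights[concept] += 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}


# Hypothetical mentions: two old hashtag mentions, one entity mention today.
observations = [("#umap2013", 0), ("#umap2013", 1), ("Rome", 14)]
profile = time_sensitive_profile(observations, now=14)
print(profile)
```

In this toy example the single recent mention of "Rome" outweighs the two older hashtag mentions. A hashtag-based or entity-based strategy would then simply decide which kind of concept is fed into the profile.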
168
Twitter-based Trend and User Modeling Framework
[Figure labels: Twitter posts; current tweets of Twitter community; news recommender?; Profile; Semantic Enrichment; Profile Type; Aggregation; Weighting Scheme; trends; time; user’s interests]
169
Temporal effects with trends
For the domain of personalized news recommendations,
we have combined trend and user modeling in our
framework.
• We have seen how user profiles change over time, under
the influence of trends.
• Appropriate concept weighting strategies allow for the
discovery of local trends.
• A time-sensitive weighting function is best for generating
trend profiles.
Aggregating trend and user profiles can improve the
performance of recommendations.
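A minimal sketch of aggregating a trend profile with a user profile and using the result to rank news items. The linear mix with a parameter alpha and the sum-of-weights ranking are assumptions for illustration; the framework's actual aggregation and weighting strategies are not specified on these slides. All names and sample data are hypothetical.

```python
def aggregate_profiles(user_profile, trend_profile, alpha=0.7):
    """Linear mix: alpha * user weight + (1 - alpha) * trend weight."""
    concepts = set(user_profile) | set(trend_profile)
    return {
        c: alpha * user_profile.get(c, 0.0)
        + (1 - alpha) * trend_profile.get(c, 0.0)
        for c in concepts
    }


def rank_items(items, profile):
    """Rank items (id -> set of concepts) by summed profile weight."""
    def score(item_id):
        return sum(profile.get(c, 0.0) for c in items[item_id])
    return sorted(items, key=score, reverse=True)


# Hypothetical profiles and news items.
user_profile = {"football": 0.8, "politics": 0.2}
trend_profile = {"election": 0.6, "politics": 0.4}
combined = aggregate_profiles(user_profile, trend_profile, alpha=0.5)

items = {
    "a": {"football"},
    "b": {"politics", "election"},
    "c": {"weather"},
}
ranked = rank_items(items, combined)
print(ranked)
```

With the user profile alone, item "a" would rank first; mixing in the trend profile lifts item "b", which is how a trend can influence what a user is recommended.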
170
Validation
171
Check with the user
With all profiles based on augmentation,
it becomes (even more) vital to follow the lessons of
checking with the user.
This means engaging with the user in a
common process of validating the profile
and the assumptions based on it.
172
Perico
Dialogue for Modelling Cultural Exposure using Linked Data
Initial User Model
•  Visited Countries
•  Estimated Cultural Exposure
Social Web Sensors
Perico Dialogue Agent
Components: Cultural Fact Extractor, Quiz Generator, User Profile Generator, Dialogue Planner
Updated User Model
•  Verified Visited Countries
•  Enhanced Cultural Exposure Score
174
Inspect and control
[Knijnenburg et al. 2012] consider how
users of social recommender systems may want to
inspect and control how their social relationships
influence the recommendations they receive:
friends are not always “nearest neighbors”.
The results show that high inspectability and control
indeed increase users’ perceived understanding of and
control over the system, their rating of the
recommendation quality, and their satisfaction with the
system, and thus lead to an overall better user experience.
175
Communities
176
Understanding communities
Attention is given to communities and their dynamics.
[Chan et al. 2010] propose a method for analysing user communication
roles in discussion forums.
[Schwagereit et al. 2011] study governance in web communities.
[Karnstedt et al. 2011] consider the relation between a user's value
within a community (constituted from various user features) and the
probability of a user churning.
[Yang et al. 2010] analyze users’ activity lifespan in online knowledge
sharing communities: acknowledgement of contributions leads to user
survival.
177
Involvement in communities
In order to understand how people behave on
the Social Web and in communities,
it is relevant to understand their engagement and
involvement in more detail.
[Lehmann et al. 2012] study how users engage with online
services, and how to measure this engagement.
[Freyne et al. 2009] look at how social networking sites
rely on the contribution and participation of their
members: focus on early interventions for engagement.
178
Communities and expertise
Understanding communities is also relevant
as these communities can act as an additional resource.
From finding evidence for profiles, we have seen recent
attention shift towards finding people and expertise.
For example, to enable active engagement of people.
For using expertise in UMAP,
it is also important to be able to specify expertise,
to enable reasoning about the expertise’s quality and fit.
179
Take home from challenges
The (Social) Web tells many stories:
•  Acknowledge multiple truths, opposing truths, and bad intentions.
•  Acknowledge multiple audiences and viewpoints.
•  Acknowledge cultural variations.
The (Social) Web moves fast:
•  Acknowledge the real-time nature of Web and applications.
•  Analyze and understand the flow of information.
•  Analyze and understand the nature of communities.
The (Social) Web includes people:
•  Involve the users actively in validation.
•  Involve (communities of) users in interpretation.
180
181
Social, Web & UMAP
182
Social & UMAP
Huge economic and societal potential for added value.
Social Web is a fertile source of knowledge for
augmentation.
•  Semantics can be beneficial for social-based augmentation.
•  Hybrid, human-enhanced approaches can be beneficial.
•  Technological feasibility of augmentation.
Research from specific cases towards general theory.
Next on the agenda:
•  Describe added value for stakeholders, describe goals.
•  Share and compare research challenges and evaluations.
183
Web & UMAP
UMAP systems are Web systems:
•  The (Social) Web tells many stories.
•  The (Social) Web moves fast.
•  The (Social) Web includes people.
The Web is the real laboratory for UMAP systems.
Next on the agenda:
•  Share and compare solutions, components, and systems.
•  Support more uniformity in methods and practices.
184
UMAP & Web
On the (Social) Web, systems are being made:
•  Take positions or prepare to take positions about bad
intentions.
•  Take responsibility and make recommendations about future
architectures.
On the (Social) Web, many systems are small:
•  Do (also) consider the specific problems of small and medium-sized
stakeholders: bring UMAP into practice.
185
UMAP & Social
In SWUMAP, human intelligence is arranged
differently:
•  From careful manual analysis a priori, to machine
analysis on the fly.
•  Critical and context-specific approach to data is part of
the ‘in vivo’ system.
•  Human interpretation of data is inside the hybrid
system.
It makes for a new type of system, and one of
great value.
And plenty of fun and diverse challenges for
UMAP.
186
[Figure labels: APPLICATION; HUMANS FOR AUGMENTATION; USERS; DOMAIN; DOMAIN augmented with Web Semantics; USERS augmented with Web Semantics; REAL DOMAIN; REAL USERS]
187
[Figure labels: APPLICATION; HUMANS FOR AUGMENTATION; USERS; DOMAIN; DOMAIN augmented with Web Semantics; USERS augmented with Web Semantics; SWUMAP]
188
Thanks
Slides made with input from many,
including Alessandro, Claudia, Fabian, Ilknur, Jan,
Jasper, Ke, Qi, and Richard from WIS in Delft,
and friends from ImREAL, Net2, SEALINCMedia, and
Twitcident.

Delft University Social Media Research

  • 1. Delft University of Technology Link, Like, Follow, Friend: The Social Element in User Modeling and Adaptation UMAP, Rome, June, 2013 Geert-Jan Houben Web Information Systems, TU Delft
  • 3. 3 Social Web & UMAP We observe, reflect, speculate, and raise discussion about evolutions and opportunities for UMAP to make a difference. Triggered by the social element in UMAP and other conferences & our own experience in the field.
  • 4. 4 Social Web in UMAP: a number of mentions of ‘social’, and a small number of ‘social web’ in the papers. New U (in UMAP), new users And we see more. We see how the Social Web mirrors people, mirrors users. What we learn at the Social Web, learn (more) about users and for user modeling and adaptation.
  • 5. 5 UMAP in the new Web world What we learn at the Social Web allows us to reconsider UMAP in the Web. It brings new opportunities for us as researchers. Perhaps it brings new needs. Surely, these are opportunities that we can position within our UMAP research agenda and UMAP application portfolio.
  • 6. 6 SWUMAP: 1 + 1 = 3 Experience shows to combine: Understanding & Creating UM & AP Machines & Humans Arrive at a body of knowledge for turning insights about users and usage into added value in society and economy.
  • 7. 7 UMAP systems are Web systems Lessons tell us to reconsider our system concept. On the Web, systems are ‘in vivo’: open and dynamic. •  Users & data are no longer ‘inside the system’. •  Users & data change and move (more) quickly. This impacts the understanding and creating of systems. This also impacts the systems’ architecture. With the (Social) Web as our laboratory, this also impacts our research discipline.
  • 8. 8 APPLICATION HUMANS FOR AUGMENTATION USERSDOMAIN DOMAIN Augmented with Web Semantics USERS Augmented with Web Semantics REAL DOMAIN REAL USERS
  • 11. 11 Domain: Incidents and emergencies In literature we see a fair amount of attention for the domain of incidents and emergencies. Our own experience from several years is situated in that domain. It has given us a good feeling for what is needed and how UMAP research can be part of a bigger effort to solve real-world problems.
  • 12. 12 Domain: Incidents and emergencies In literature, most attention is directed towards understanding and detecting. Sometimes we see further objectives in responding, creating situational awareness (especially in massive incidents), and prevention. Most used in these studies is Twitter.
  • 13. 13 2011 Tohoku earthquakes 1200 tweets/min
  • 14. 14 2011 Pukkelpop storm 570 tweets/min
  • 15. 15 Twitter With Twitter, we have a whole new reflection of what is happening in the world. A whole new source of digital data that reflects the (real) world. We need to understand that reflection to understand the world and help the world. Two challenges: 1.  Understand the world, and 2.  Understand its reflection in the Social Web.
  • 17. 17 400+ million tweets per day •  Netherlands ranks #1 in Twitter penetration Twitter users publish about “anything” •  Work/private life •  Interesting events •  Etc. Twitter tells us a lot about the world. And its users can be seen to act as social sensors and citizen journalists. Monitoring Twitter
  • 19. 19 Train hits a block First eyewitness
  • 24. 24 A new source of knowledge An example of the speed and the nature of knowledge that Twitter provides and what it does to provide knowledge about what really happened. Also, it shows what we need to know and understand to use and interpret this effectively.
  • 25. 25 1.  Early warning •  Twitter users publish early signals that might indicate an increased risk or potential incident. 2.  Crisis management •  (Eye-witness) Twitter users disseminate information about incidents which can support operational emergency services. 3.  Post evaluation •  Post analyzing incident data (in retrospect) to measure the effectiveness of emergency services. Twitcident goals
  • 26. 26 • Emergency services •  Law enforcement, fire fighters, governments • Big event organizers •  Festival security companies • Utility organizations •  Public transport, energy supply, other vital infrastructures Stakeholders
  • 30. 30 Could we see this impact coming? Why is there a peak? Semantics 25 minutes before incident: 1.  Weather: storm, cloud-burst, wind, … 2.  Locations: Brussel, Gent, Hasselt, … 3.  Intensity: heavy, crazy, massive, … 4.  Impact: hail balls, falling trees, … Chart annotation: Impact storm.
  • 31. 31 Damage reports from incident site
  • 33. 33 Example festival disaster The research into this example created a lot of knowledge about what is possible and what is desired. It was also a good example to follow and approach new use cases to build more general understanding and theory.
  • 34. 34 Dutch national rail infrastructure company Example public infrastructure
  • 36. 36 Twitcident processes 100k tweets/day The social weather map provides ProRail with a timely and accurate overview of citizen observations. In addition to other sources of knowledge. Value
  • 37. 37 Big Events New Year’s Eve Serious Request Elections Lowlands Summer Carnaval Fantasy Island Queen’s Day
  • 39. 39 Social media monitoring was done with 1-3 security officers Violence, riots, fires, fireworks, crowds, ..
  • 40. 40 Not only monitoring The previous examples are not only about monitoring Twitter to know what is happening out there.
  • 41. 41 She was about to turn 16…
  • 42. 42 So she invited some friends …
  • 44. 44 She pulled back her invitation, but…
  • 50. 50 Massive damage at local stores
  • 52. 52 Mayor of Haren resigns after Haren-debacle
  • 53. 53 Recommendations by Cohen •  Clear communication strategy •  Planning & organizing in advance •  Social media monitoring •  Clear intervention policy
  • 54. 54 Why social media monitoring? Content tells what to expect
  • 56. 56 Recommendation from experience Let us go and find the needle that tells us what appears to be happening out there But let us also think about how to support the action to make the world out there a better one.
  • 57. 57 Meaningful and actionable Twitcident has taught us how information obtained from Twitter needs to be meaningful and actionable.
  • 58. 58 Today’s challenges “Polling meaningful information” “Sifting thousands of tweets during hurricane Irene” “Getting situational awareness” “Finding the eyes on the ground” “Finding actionable information” “Providing timely reaction” “Volunteers are great” “But we need hybrid approaches to monitor social media” (Patrick Meier)
  • 59. 59 Hybrid approach Twitcident has also shown us how these problems ask for a hybrid approach with humans in the loop that handle and interpret the knowledge derived from the Social Web. Big Data is available from the Social Web, but Small Interpretations are needed, to get it right!
  • 60. 60 Human interpretation inside The nature of these problems means that solutions are not fully automatic. They involve users of systems who help with the interpretation and decision making. This is a special kind of user that we (as UMAP) can consider: a group that is growing fast and is in urgent need of support.
  • 61. 61 Take home from experience Learn from concrete cases: •  Case-based experimental approaches bring specific understanding and experience necessary for general understanding and theory. •  Cases can have great value for stakeholders. It is all about correct and actionable interpretation: •  Make information meaningful and actionable in the context. •  Employ hybrid, human-enhanced approaches for the context.
  • 62. 62 APPLICATION HUMANS FOR AUGMENTATION USERSDOMAIN DOMAIN Augmented with Web Semantics USERS Augmented with Web Semantics REAL DOMAIN REAL USERS
  • 65. 65 Challenge: Making sense of Twitter Inspired by different applications and domains, researchers have given attention to underlying technology for making sense of Twitter. ‘Finding the needle’ as the research challenge.
  • 66. 66 Technology for making sense The sense-making usually relies on application and domain specific knowledge and researchers investigate how to do it effectively. Semantics and interactivity prove to be important ingredients. In fact, it turns out that sense-making, i.e. finding the needle, is a combination of many things that need to be coming together.
  • 69. 69 Semantics for filtering and search In [HT2012] we considered what is needed as first steps in processing tweets, before we can ‘analyze’ them.
  • 70. 70 1.  (Automatic) Filtering: Given an incident, how can one automatically identify those tweets that are relevant to the incident? 2.  Search & Analytics: How can one improve search and analytical capabilities so that users can explore information in the streams of tweets? Twitter streams Challenges Filtering topic Search & Analytics information need
  • 71. 71 Dataset • Twitter corpus (TREC Microblog Track 2011) • 16 million tweets (Jan. 24th – Feb. 8th, 2011 ) • 4,766,901 tweets classified as English • 6.2 million entity-extractions • News (Same time period) • 62 RSS News Feeds • 13,959 News Articles • 357,559 entity-extractions
  • 73. 73 Filtering evaluation The semantic strategy is more robust and achieves higher precision for complex topics. [Figure: Precision@30 and Recall (0 to 1) plotted against the number of entities extracted from the initial topic description (1 to 4) and against the number of words in the initial topic description (1 to 5).]
  • 75. 75 Faceted search evaluation Strategies with semantic enrichment outperform those without in predicting appropriate facet-values. [Figure: bar chart comparing strategies with semantic enrichment and without semantic enrichment; bar labels and values are not recoverable.]
  • 76. 76 Lessons The context: a (Twitcident-inspired) framework for filtering, searching, and analyzing information about incidents that people publish on Twitter. We have seen how to obtain • better filtering of Twitter messages for a given incident, • better search for relevant information about an incident within the filtered messages. For these first steps in processing Twitter messages, the semantic interpretation is the key element that we need to understand for the given context.
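The filtering evaluation above reports Precision@30 and recall. As a reminder of what these two measures compute, here is a minimal sketch. The function names, the tweet-id strings and the choice to divide by k even when fewer than k results are returned are assumptions made for illustration; the original evaluation setup may differ in such details.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 30) -> float:
    """Share of the top-k ranked tweet ids that are relevant (divides by k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for tweet_id in ranked[:k] if tweet_id in relevant)
    return hits / k


def recall(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Share of all relevant tweet ids that were retrieved at all."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


if __name__ == "__main__":
    ranked_ids = ["t1", "t2", "t3", "t4"]      # hypothetical filter output
    relevant_ids = {"t1", "t3", "t9"}          # hypothetical relevance judgments
    print(precision_at_k(ranked_ids, relevant_ids, k=2))
    print(recall(ranked_ids, relevant_ids))
```

Precision@30 rewards a filter that puts relevant tweets at the top of the list, while recall rewards a filter that misses few relevant tweets, which is why the slides report both.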
  • 78. 78 Semantics for enrichment and linkage In [ESWC2011] we focused more on the semantics for enrichment and linkage to connect the tweets to background knowledge and thus enhance what we can learn from them.
  • 79. 79 SI Sportsman of the year: Surprise French Open champ Francesca Schiavone Thirty in women's tennis is primordially old, … news article topic:Sports topic:Sports topic:Tennis person:Francesca_Schiavone oc:SportsGame event:FrenchOpen francesca is becoming #sport idol of the year! microblog post user enrichment enrichment user modeling linkage Profile Topics of interest: - topic:Tennis - topic:Sports People of interest: - person:Francesca_Schiavone Events of interest: - event:FrenchOpen Example: Semantic enrichment of Twitter posts
  • 80. 80 Linkage discovery News article: “SI Sportsman of the year: Surprise French Open champ Francesca Schiavone. Thirty in women's tennis is primordially old, …” Microblog post: “francesca is becoming #sport idol of the year!” How to link them? The goal in linkage discovery is to identify news resources that are related to a given Twitter message: 1.  The Web resource has to be related to the given tweet. 2.  The Web resource has to be related to news.
  • 81. 81 Linkage discovery strategies Content-based: the tweet “Francesca Schiavone is sportsman of the year #sport #tennis” is linked to the article “SI Sportsman of the year: Surprise French Open champ Francesca Schiavone. Thirty in women's tennis is primordially old…” Hashtag-based: the same tweet is linked to the article “Petkovic & Goerges leading German tennis revival: there are signs that German tennis is…”
  • 82. 82 Linkage discovery strategies URL-based: the tweet “nice! http://bit.ly/eiU33c” is linked through its URL to the news article “SI Sportsman of the year: Surprise French Open champ Francesca Schiavone. Thirty in women's tennis is primordially old…” •  URL-based (Strict): only consider content of the Twitter message •  URL-based (Lenient): also consider reply or re-tweet messages Entity-based: the tweet “Francesca Schiavone is sportsman of the year #sport #tennis” is linked to the news article “Olympic champion and world number nine Elena Dementieva announced her retirement. The 29-year-old Russian delivered the shock news after losing to Francesca Schiavone in the group stages of the season-ending tournament …” Entity-based with temporal constraint: the publish dates of tweet and article are compared, so that old news is not linked.
  • 83. 83 Evaluation on linkage discovery URL-based strategies offer good linkage. [Figure: bar chart comparing the linkage discovery strategies; bar labels and values are not recoverable.]
  • 84. 84 Analysis on linkage discovery and semantic enrichment •  URL-based strategies: more than 10 tweet-news relations for ca. more than 1000 users. •  Entity-based strategy: found a far higher number of tweet-news relations. •  Hashtag-based strategy: failed for more than 79% of the users because of the limited usage of hashtags. •  Combination of all strategies: more than 10 tweet-news relations found for more than 20% of the users. [Figure: series Entity-based, URL-based, Hashtag-based, Combination.] Combined strategies perform better.
  • 85. 85 Lessons There is good background knowledge out there, if we are able to understand how it connects to the domain and context we are considering. Many applications can share the same enrichment and linking, but not all. With common descriptions of the problem, we can share enrichment and linking (more) effectively.
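The URL-based and entity-based linkage strategies described above can be illustrated with a short sketch. Everything here is hypothetical: the `Tweet` and `Article` classes, the two-day time window and the assumption that shortened links have already been resolved and entities already extracted. It is not the implementation from [ESWC2011], only a way to make the idea of a temporal constraint concrete.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Set

URL_RE = re.compile(r"https?://\S+")


@dataclass
class Tweet:
    text: str
    published: datetime
    entities: Set[str] = field(default_factory=set)


@dataclass
class Article:
    url: str
    published: datetime
    entities: Set[str] = field(default_factory=set)


def url_based(tweet: Tweet, articles: List[Article]) -> List[Article]:
    """Link a tweet to the articles whose URL appears in the tweet text."""
    urls = set(URL_RE.findall(tweet.text))
    return [a for a in articles if a.url in urls]


def entity_based(tweet: Tweet, articles: List[Article],
                 window: timedelta = timedelta(days=2),
                 min_shared: int = 1) -> List[Article]:
    """Link on shared entities, but only if publish dates are close together."""
    linked = []
    for article in articles:
        shared = tweet.entities & article.entities
        close_in_time = abs(tweet.published - article.published) <= window
        if len(shared) >= min_shared and close_in_time:
            linked.append(article)
    return linked
```

The window plays the role of the temporal constraint on the slide: without it, any older article that mentions the same person would be linked to the tweet.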
  • 86. 86 Social Web for profiles
  • 87. 87 Challenge: Social web for profiles An ambition often seen in conferences like this one is to exploit the semantic enriched social web knowledge for the purpose of creating or enhancing user profiles. These profiles can then be used for adaptation and personalization.
  • 88. 88 Components for profiling For applications such as personalized news recommendation, like in our [UMAP2011] work, components for profiling can be carefully selected and assembled. It can also help the development of the deeper understanding and theory about how to link the data to background knowledge and thus make sense of the data.
  • 89. 89 Library GeniUS [JIST2011] is a topic and user modeling software library that • produces semantically meaningful profiles, to enhance the interoperability of profiles between applications; • provides functionality for aggregating relevant information about a user from the Social Web; • generates domain-specific user profiles according to the information needs of different applications; • is flexible and extensible to serve different applications.
  • 90. 90 GeniUS: Generic Topic and User Modeling Library for the Social Semantic Web Item Fetcher Enrichment Weighting Function RDF Repository Filter Modeling Configuration RDF Serialization Social Web Semantic Web user data items enriched items semantic data user profiles interested in: locationproduct
  • 91. 91 (a) hashtag-based (b) entity-based (c) topic-based 2. Profile Type 1. Temporal Constraints 3. Semantic Enrichment 4. Weighting Scheme (a) time period (b) temporal patterns (a) tweet-based (b) further enrichment (a) concept frequency User Modeling Building Blocks
  • 92. 92 User profile construction User modeling with rich semantics (interested in: people, topics, events, …) through linkage and user profile construction, e.g. #sport, person:Francesca_Schiavone, topic:Sports, event:FrenchOpen, topic:Tennis. Profile types: • hashtag-based • topic-based • entity-based. Enrichment: • tweet-only • exploitation of external news resources. Temporal patterns: • specific time period • temporal pattern (weekday, weekend) • no constraints.
  • 94. 94 RDF Gears Plugin Architecture
  • 95. 95 Analysis of profile characteristics By exploiting the linkage between tweets and news articles, we get more distinct entities / topics (semantics)! Richer semantics through linking strategies. [Figure: two log-scale plots over user profiles, entities per user profile (entity-based profiles) and distinct topics per user profile (topic-based profiles), each comparing Tweet-based profiles with News-based profiles, i.e. profiles enriched with external news resources.]
  • 96. 96 Lessons For profiles, we observed: • Semantic enrichment allows for richer user profiles. • Profiles change over time (hashtag-based more): fresh profiles seem to better reflect current user demands. • Temporal patterns: weekend profiles differ significantly from weekday profiles (more than day/night). For personalized news recommendation, we learned: • Best user modeling strategy: Entity-based > topic-based > hashtag-based. • Semantic enrichment improves recommendation quality. • Adapting to temporal context helps for topic-based strategy.
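The building blocks named above (profile type, temporal constraint, concept-frequency weighting) can be combined in a few lines. The sketch below is an assumption-laden illustration, not GeniUS code: the post dictionaries, the `build_profile` name and the normalisation of frequencies to sum to 1 are all invented for the example.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List, Optional


def build_profile(posts: List[dict], profile_type: str = "entity",
                  pattern: Optional[str] = None) -> Dict[str, float]:
    """Concept-frequency profile over the posts that match a temporal pattern.

    posts: dicts with a 'time' (datetime) and one concept list per profile
           type, e.g. {'time': ..., 'entity': [...], 'hashtag': [...]}.
    pattern: None (no constraint), 'weekday' or 'weekend'.
    """
    counts: Counter = Counter()
    for post in posts:
        is_weekend = post["time"].weekday() >= 5   # Saturday = 5, Sunday = 6
        if pattern == "weekend" and not is_weekend:
            continue
        if pattern == "weekday" and is_weekend:
            continue
        counts.update(post.get(profile_type, []))
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()} if total else {}


if __name__ == "__main__":
    posts = [
        {"time": datetime(2013, 6, 8, 10, 0),   # a Saturday
         "entity": ["person:Francesca_Schiavone", "event:FrenchOpen"],
         "hashtag": ["#sport"]},
        {"time": datetime(2013, 6, 10, 9, 0),   # a Monday
         "entity": ["person:Francesca_Schiavone"],
         "hashtag": []},
    ]
    print(build_profile(posts))
    print(build_profile(posts, pattern="weekend"))
```

Building the weekday and weekend profiles separately and comparing them is one simple way to observe the kind of temporal difference reported in the lessons.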
  • 97. 97 Social Web for augmentation
  • 98. 98 Augment with what is there Systems can use technology to augment their knowledge with data from the Social Web. Lessons learned show that for adaptive systems on the Social Web there is a lot of knowledge (easily) available, from other systems and other domains. Understanding how to leverage it, even to a basic level, can bring a lot.
  • 100. 100 Cross-system profiles An example to show the added value of ‘cross-system’ on the Social Web is the work in [UMUAI 2013] where interweaving of public profiles is studied.
  • 101. 101 User data on the Social Web Cross-system user modeling on the Social Web
  • 102. 102 Interweaving public user data with Mypes Input: Google Profile URI (http://google.com/profile/XY). 1.  Account Mapping: get other accounts of the user (SocialGraph API). 2.  Social Web Aggregator: aggregate public profile data (social networking profiles, blog posts, bookmarks, other media). 3.  Profile Alignment: map profiles to the target user model (FOAF, vCard). 4.  Semantic Enhancement: enrich data with semantics (WordNet®). 5.  Analysis and user modeling: generate user profiles. Output: aggregated, enriched profile (e.g., in RDF or vCard).
  • 103. 103 1.  Characteristics of distributed tag-based profiles: •  Overlap of tag-based profiles, which an individual user creates at different services, is low •  Aggregated profiles reveal significantly more information (regarding entropy) than service-specific profiles 2.  Performance of cross-system user modeling for cold-start recommendations: •  Cross-system UM leads to tremendous (and significant) improvements of the tag and bookmark recommendation quality •  To optimize the performance one has to adapt the cross-system strategies to the concrete application setting http://persweb.org Lessons
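The two profile characteristics in these lessons, overlap between service-specific tag profiles and the entropy of an aggregated profile, can be made concrete with a small sketch. The exact measures used in [UMUAI 2013] are not given on the slides, so Jaccard overlap of tag sets and Shannon entropy of tag frequencies are assumptions here, as are the function names and the sample tags.

```python
import math
from collections import Counter


def entropy(tag_counts: Counter) -> float:
    """Shannon entropy (in bits) of a tag-frequency profile."""
    total = sum(tag_counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in tag_counts.values() if n > 0)


def overlap(profile_a: Counter, profile_b: Counter) -> float:
    """Jaccard overlap between the tag sets of two profiles."""
    tags_a, tags_b = set(profile_a), set(profile_b)
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0


def aggregate(*profiles: Counter) -> Counter:
    """Merge service-specific profiles by adding up tag frequencies."""
    merged: Counter = Counter()
    for profile in profiles:
        merged += profile
    return merged


if __name__ == "__main__":
    photo_tags = Counter({"paris": 2, "louvre": 2})       # hypothetical service A
    bookmark_tags = Counter({"python": 2, "paris": 2})    # hypothetical service B
    print(overlap(photo_tags, bookmark_tags))
    print(entropy(photo_tags), entropy(aggregate(photo_tags, bookmark_tags)))
```

With low overlap between services, the aggregated profile covers more distinct tags, which is what drives its entropy above that of either service-specific profile in this toy example.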
  • 104. 104 Location estimation Another nice example follows from our work in the ImREAL project on augmentation (of adaptation) with the Social Web.
  • 105. 105 Improved location estimation by mixing Social Web streams + = external data sources: Enriching the image’s textual meta-data with the user’s tweets improves the accuracy of the location estimation.
  • 106. 106 Accuracy of social web metadata This work has also raised attention for the accuracy of Social Web metadata. There are many reasons why this data cannot be taken as the universal truth. In application and domain specific contexts, we need to understand the accuracy of social metadata. Also, the work of [Rout et al. 2013] on location estimation based on social ties, shows the feasibility as well as the context-dependency.
  • 107. 107 Linked Open Data for augmentation
  • 108. 108 LOD and cross-system With these results in hand, in our [ICWE2012] work, we considered cross-system modeling with Linked Open Data. With the aim to understand how Linked Open Data background knowledge can be leveraged for cross-system and cross-domain augmentation.
  • 109. 109 Johannes Vermeer dbpedia:LouvreLooking forward to visit Paris next week! dbpedia:Paris The lacemaker The astronomer Recommending Points of Interest
  • 110. 110 LOD-based User Modeling User data (Social Web) yields concepts that can be extracted from the user data (c1, c2, c3, …). Background knowledge (Linked Data graph structures) connects these to further concepts (c4, c5, c6, c9, cx, cy). Weighting strategies produce a User Profile of concept weights (e.g. c1: 0.4, c2: 0.1, c3: 0.2, …) for an application that demands a user interest profile regarding particular concepts.
  • 111. 111 tags: girl with pearl earring geo: The Hague dbpedia:Girl_with_pearl_earring A   Artifact B   The lacemaker C   The astronomer …   rdf:type Johannes Vermeer foaf:maker foaf:maker Strategies for exploiting the RDF-based background knowledge graph dbpedia:The_Hague dbpedia:Louvre dbpprop:locationlocatedIn
  • 112. 112 Lessons With LOD-based user modeling on the Social Web, different strategies for exploiting RDF-based background knowledge are possible. Findings: • Combination of different user data sources (Flickr & Twitter) is beneficial for the user modeling performance. • User modeling quality increases the more background knowledge one considers. • Combination of strategies achieves the best performance. To investigate further: dependency of strategies on entities and relationships, and temporal effects (e.g. temporal relationships or upcoming trends).
  • 113. 113 Interlinked online society If you take a semantic technology perspective, then strong interlinking could be the direction to go. [Passant et al. 2009] study applying semantic technologies to social media, creating a Web where data is socially created and maintained through end-user interactions, but is also machine-readable and therefore open towards sophisticated queries and large-scale information integration. “Social Semantic Information Spaces”, where any social data is a component in a worldwide collective intelligence ecosystem.
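One possible strategy for exploiting an RDF-based background graph is to start from the concepts found in the user data and propagate interest to nearby concepts with a decaying weight. The sketch below is only an illustration under stated assumptions: the triples are modelled loosely on the Vermeer example, their identifiers are invented, and the hop limit and decay factor are hypothetical parameters, not the strategies evaluated in [ICWE2012].

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def build_graph(triples: Iterable[Triple]) -> Dict[str, Set[str]]:
    """Undirected adjacency map over subjects and objects (predicates ignored)."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for subject, _predicate, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def expand_profile(seeds: Dict[str, float], graph: Dict[str, Set[str]],
                   max_hops: int = 2, decay: float = 0.5) -> Dict[str, float]:
    """Spread seed weights to neighbours, multiplying by `decay` per hop."""
    weights = dict(seeds)
    queue = deque((concept, weight, 0) for concept, weight in seeds.items())
    while queue:
        concept, weight, hops = queue.popleft()
        if hops == max_hops:
            continue
        for neighbour in graph.get(concept, ()):
            candidate = weight * decay
            if candidate > weights.get(neighbour, 0.0):
                weights[neighbour] = candidate
                queue.append((neighbour, candidate, hops + 1))
    return weights


# Illustrative triples only; identifiers are made up for this example.
TRIPLES = [
    ("ex:Girl_with_pearl_earring", "foaf:maker", "ex:Johannes_Vermeer"),
    ("ex:The_Lacemaker", "foaf:maker", "ex:Johannes_Vermeer"),
    ("ex:The_Astronomer", "foaf:maker", "ex:Johannes_Vermeer"),
    ("ex:The_Lacemaker", "ex:location", "ex:Louvre"),
    ("ex:The_Astronomer", "ex:location", "ex:Louvre"),
    ("ex:Louvre", "ex:locatedIn", "ex:Paris"),
]
```

Starting from a photo tagged with one painting, three hops are enough to give the Louvre a non-zero weight, which mirrors the points-of-interest example for a user about to visit Paris. Considering more hops corresponds to considering more background knowledge.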
  • 114. 114 Origin of semantics These social semantic spaces can trigger us in UMAP to articulate where we see the role and origin of semantics. Making all social data available ‘with semantics’ or observing that a lot of semantics is (only) effective in a specific domain or application? Experience showing the fine-grained nature of effects suggests the latter.
  • 116. 116 Humans & adaptive faceted search An important element in the process of sense-making is its hybrid nature: humans involved in the sense-making. The control rooms have shown us that the human aspect in search is crucial, for judgment and interpretation. In our [ISWC2011] work, we looked at adaptive faceted search.
  • 117. 117 Adaptive faceted search framework Adaptive Faceted Search Twitter posts Semantic Enrichment User and Context Modeling user How to adapt the facet-value pair ranking to the current demands of the user? How to represent the content of a tweet?  facet extraction
  • 118. 118 Facet extraction and semantic enrichment @bob: Julian Assange got arrested Julian Assange Julian Assange Tweet-based enrichment Julian Assange arrested Julian Assange, the founder of WikiLeaks, is under arrest in London… Link-based enrichment Julian Assange London WikiLeaks Julian Assange Julian Assange London WikiLeaks powered by
  • 119. 119 Impact of Link-based enrichment Representation of tweets: significantly more facets per tweet with link-based enrichment
  • 120. 120 Faceted search strategies Goal: most relevant facet-value pair (FVP) should appear at the top of the ranking. Faceted Search Strategies: 1.  Occurrence frequency: count occurrence frequencies of FVPs 2.  Personalization: adapt ranking to user profile (e.g. user tweeting history) 3.  Diversification: increase variety among the top-ranked FVPs 4.  Time-sensitivity: adapt FVP ranking to temporal context Semantic enrichment: (i) tweet-based and (ii) link-based enrichment. Example: an unadapted ranking of Locations (1.  Aachen 2.  Aalborg 3.  Aalesund 4.  Aarhus … 2145. Eindhoven) versus an adapted ranking of Locations (1.  Eindhoven 2.  Delft 3.  Amsterdam 4.  Rotterdam 5.  London …). Link-based enrichment and occurrence-based and personalized rankings have a large effect.
  • 121. 121 Twitcident.com Twitter-based crisis management system 1. 2. 3. 4. Semantic enrichment allows for: 1.  Grouping tweets into incidents 2.  Faceted search 3.  Thematic Views 4.  Analysis
  • 122. 122 Lessons Semantic enrichment allows for structured representation of the content of tweets: a good basis for faceted search. Faceted search performs significantly better than hashtag-based keyword search. Different building blocks for making faceted search on Twitter adaptive improve the search quality: •  Link-based enrichment: more discoverable tweets, better search performance. •  Personalization leads to significant improvements. •  Time-sensitivity improves performance as well.
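Two of the faceted search strategies above, occurrence frequency and personalization, can be sketched as ranking functions over facet-value pairs. The representation of a tweet as a set of `(facet, value)` tuples, the linear mix controlled by `alpha` and the function names are assumptions made for this illustration; they are not taken from [ISWC2011].

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

FVP = Tuple[str, str]   # (facet, value), e.g. ("location", "Eindhoven")


def facet_counts(tweets: Iterable[Set[FVP]]) -> Counter:
    """Number of tweets in which each facet-value pair occurs."""
    return Counter(fvp for tweet in tweets for fvp in tweet)


def rank_by_frequency(tweets: Iterable[Set[FVP]]) -> List[FVP]:
    """Occurrence-frequency strategy: most frequent pair first."""
    counts = facet_counts(tweets)
    return sorted(counts, key=lambda fvp: (-counts[fvp], fvp))


def rank_personalized(tweets: Iterable[Set[FVP]], user_profile: Dict[FVP, float],
                      alpha: float = 0.5) -> List[FVP]:
    """Mix normalised frequency with the user's interest weight per pair."""
    counts = facet_counts(tweets)
    max_count = max(counts.values(), default=1)

    def score(fvp: FVP) -> float:
        return (1 - alpha) * counts[fvp] / max_count + alpha * user_profile.get(fvp, 0.0)

    return sorted(counts, key=lambda fvp: (-score(fvp), fvp))


if __name__ == "__main__":
    tweets = [{("location", "Aachen")}, {("location", "Aachen")},
              {("location", "Eindhoven")}]
    profile = {("location", "Eindhoven"): 1.0}   # hypothetical interest weights
    print(rank_by_frequency(tweets))
    print(rank_personalized(tweets, profile))
```

In this toy data the frequency ranking puts Aachen first, while the personalized ranking moves Eindhoven to the top for a user whose history points there, similar to the two Locations lists on the slide.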
  • 124. 124 Duplicate detection Important for reducing the volume of social data, is to categorize the social chatter and reduce redundancy in information. In our [WWW2013] work we have considered duplicate detection.
  • 125. 125 Search and retrieval on Twitter Twitter is more like a news medium. How do people search on Twitter? [Teevan et al. 2011] has shown how this is characterized by repeated queries & monitoring for new content. Problems: •  Short tweets → lots of similar information. •  Few people produce content → many retweets, copied content.
  • 126. 126 Near-duplicates in Twitter search Analysis of the Tweets2011 corpus (TREC microblog track) [WWW2013]: •  For the 49 topics (queries), 2,825 topic-tweet pairs are relevant. •  We manually labeled 55,362 tweet pairs. •  We found 2,745 pairs of duplicates at different levels. Distribution over duplicate levels: exact copy 1.89%, nearly exact copy 9.51%, strong near-duplicate 21.09%, weak near-duplicate 48.71%, low overlapping 18.80%.
  • 128. 128 Lessons Analyzing duplicate content in Twitter, we inferred a model for categorizing different levels of duplicity. We developed a near-duplicate detection framework for microposts and for categorizing duplicity of tweet pairs. Given the duplicate detection framework, we perform extensive evaluations and analyses of different duplicate detection strategies. Our approach enables search result diversification, also good to avoid ‘bubble effects’, and analyzes the impact of the diversification on the search quality. Follow Twinder progress: http://wis.ewi.tudelft.nl/twinder/
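To give a feel for what categorizing tweet pairs into duplicate levels involves, the sketch below assigns a level from plain word overlap. This is deliberately simplistic and is not the framework from [WWW2013]: the Jaccard measure, the thresholds and the function names are all hypothetical, and the level names are only borrowed from the slide.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lower-cased word tokens of a tweet."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def duplicate_level(tweet_a: str, tweet_b: str) -> str:
    """Map a tweet pair to a duplicate level using assumed overlap thresholds."""
    if tweet_a.strip() == tweet_b.strip():
        return "exact copy"
    similarity = jaccard(tokens(tweet_a), tokens(tweet_b))
    if similarity >= 0.8:        # thresholds are illustrative assumptions
        return "nearly exact copy"
    if similarity >= 0.5:
        return "strong near-duplicate"
    if similarity >= 0.3:
        return "weak near-duplicate"
    if similarity > 0.0:
        return "low overlapping"
    return "not a duplicate"


if __name__ == "__main__":
    print(duplicate_level("Train hits a block", "train hits a block!"))
    print(duplicate_level("Train hits a block", "Storm over the festival"))
```

A search interface could use such labels to collapse the higher levels into one result, which is the kind of diversification mentioned in the lessons. Word overlap alone will miss paraphrases and links to the same story, so a realistic detector needs richer features.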
  • 129. 129 Take home from technology research With semantics and humans, Social Web can help: •  Semantics beneficial for filtering & search and enrichment & linking. •  Semantic-enriched tweets beneficial for profiles and adaptation. •  Social Web & Linked Data beneficial for cross-system augmentation. •  Adaptive faceted search and duplicate detection beneficial for human-enhanced processing. For adaptive systems that rely on profiling, Social Web is a fertile source for more knowledge. ImREAL research & experiences elegantly show principles, as well as the detailed work in domain & application: •  Social Web & LOD usage is context-specific. •  Big Data in need of Small Interpretations.
  • 130. 130 APPLICATION HUMANS FOR AUGMENTATION USERSDOMAIN DOMAIN Augmented with Web Semantics USERS Augmented with Web Semantics REAL DOMAIN REAL USERS
  • 131. 131 Take home from technology research The human intelligence is to be arranged differently: •  We have moved from a priori understanding the system, to on the fly understanding the system. •  We have moved from careful manual analysis before, to machines doing the analysis on the fly. •  The critical and context-specific approach to (small) data, about domain and users, is a part of process and system we now need to (re-)include. •  This task of the designer has now shifted to a task for the human interpretation inside the hybrid system: human monitoring inside.
  • 135. 135 In reality, not one truth In the beginning, social systems like Twitter were used as ‘the’ semantic source of knowledge with an implicit assumption that Twitter is one voice. Over time, researchers have begun to investigate how to identify and interpret different voices and viewpoints in such a source. Differences in viewpoints and opinions are a subject of study, but until now their leverage has been limited.
  • 136. 136 Diversity and beliefs [Flock et al. 2011] study the different backgrounds, mindsets and biases of Wikipedia contributors, to understand the effects - positive and negative – of this diversity on the quality of the Wikipedia content, and on the sustainability of the overall project. • Analysis and approach for diversity-minded content management within Wikipedia. [Bhattachanya et al. 2012] estimate beliefs from posts made on social media, to monitor the level of belief, disbelief and doubt related to specific propositions.
  • 137. 137 Include the negative Diversity of viewpoints and opinions also suggests including negative links in the approach. [Symeonidis et al. 2010] give an example of how to include negative links in friend recommendation approaches, but this goes much further. The accuracy improvement they observe can be taken as a principle: accuracy can be gained by using information about both positive and negative edges.
  • 138. 138 ViewS Modelling Viewpoints in User Generated Content Text processing Viewpoint extraction (attention focus) Ontology (activity aspects to analyse) Semantic enrichment Viewpoint exploration
  • 139. 139 Viewpoints in YouTube Example viewpoints in user comments on job interview videos. Comparing the viewpoints around ‘anger’ of young users (left) and old users (right).
  • 141. 141 Truth is not always truth Just like this source of knowledge is not a single one, it is also clear that it might not consist of ‘true’ knowledge alone.
  • 142. 142 Malicious profiles For example, profiles can be suspicious and made for the wrong reasons. In a context of online dating, [Pizzato et al. 2012] have observed the need to gain understanding of the sensitivity of recommender algorithms to scammers. With people being the items to recommend, fraudulent profiles can have a serious impact on recommender algorithms. Identifying and detecting fraudulent profiles is a new challenge for us.
  • 143. 143 Identity theft Another aspect to ‘wrong profiles’ relates to identity disambiguation and theft. [Rowe et al. 2010] consider malevolent web practices such as identity theft and lateral surveillance. They study techniques for web users to identify all web resources which cite them and if necessary, remove the sensitive information.
  • 144. 144 Credibility of social content The credibility of messages in social networks is, for example, studied in [Seth et al. 2010] on stories from Digg. Their model is based on theories developed in sociology, political science and information science. [Cramer et al. 2008] have nicely drawn attention to trust. The study of social content credibility and trust is important, and asks for cross-discipline effort.
  • 145. 145 Privacy A lot can be said about privacy in these networks, for example Facebook. [Bachrach et al. 2012] shows how users’ activity on Facebook (related to privacy) relates to their personality, as measured by the standard Five Factor Model. Nice example of understanding how Facebook features relate to interesting aspects of users and usage.
  • 147. 147 Cultural diversity Studying diversity is not just relevant for understanding how Twitter content is to be interpreted. It is also relevant for understanding how the Social Web is used and can be used with a purpose. Cultural diversity is here one of the most interesting aspects and perhaps also one of the most challenging ones.
  • 148. 148 Cultural diversity A subject addressed in ImREAL. Components are made available as services in ImREAL for augmented user modeling, e.g. for simulation designers.
  • 150. 150 Hofstede’s cultural dimensions Describes stereotypical cultural characteristics of nationalities, with scores relative to other nationalities Five core dimensions: •  Individualism versus Collectivism (IDV) •  Power Distance (PDI) •  Masculinity versus Femininity (MAS) •  Uncertainty Avoidance (UAI) •  Long-Term Orientation (LTO) geert-hofstede.com
  • 151. 151 Analysis • Datasets •  Microblog data collected over a period of three months •  22 million microposts from Sina Weibo and 24m from Twitter •  a sample of 2616 Sina Weibo users and 1200 Twitter users • Analyze and compare user behavior •  on two levels (i) the entire user population and (ii) individual users •  from different angles (i) syntactic, (ii) semantic, (iii) sentiment and (iv) temporal analysis
  • 152. 152 Syntactic analysis Hashtags and URLs are less frequently applied on Sina Weibo than on Twitter. Users on Twitter are more triggered by hashtags and URLs when propagating information than on Sina Weibo. [Figure: average number of hashtags/URLs per post against percentage of users; series: Hashtag-Weibo, URL-Weibo, Hashtag-Twitter, URL-Twitter.] High collectivism in Weibo, high individualism in Twitter.
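The per-user measure plotted on this slide (average number of hashtags and URLs per post) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the regular expressions, the `(user_id, text)` input shape, and the function name `per_user_averages` are assumptions made for the example. In particular, Sina Weibo wraps hashtags as `#topic#`, which this simple pattern counts once.

```python
import re
from collections import defaultdict

# Assumed patterns: a hashtag is '#' followed by word characters,
# a URL is anything starting with http:// or https:// up to whitespace.
HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")

def per_user_averages(posts):
    """posts: iterable of (user_id, text).

    Returns {user_id: (avg_hashtags_per_post, avg_urls_per_post)}.
    """
    counts = defaultdict(lambda: [0, 0, 0])  # [posts, hashtags, urls]
    for user, text in posts:
        urls = URL.findall(text)
        # Remove URLs first so that a '#fragment' inside a URL
        # is not counted as a hashtag.
        rest = URL.sub(" ", text)
        entry = counts[user]
        entry[0] += 1
        entry[1] += len(HASHTAG.findall(rest))
        entry[2] += len(urls)
    return {u: (h / n, l / n) for u, (n, h, l) in counts.items()}
```

Sorting the resulting per-user values and plotting them against the cumulative share of users would give a curve of the kind shown in the figure.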
  • 153. 153 Semantic analysis The topics that users discuss on Sina Weibo are to a large extent related to locations and persons. In contrast to Sina Weibo, users on Twitter are talking more about organizations (such as companies, political parties). [Figure: average number of entities per post against percentage of users; series: Weibo, Twitter.] Low employee commitment to an organization in China; high long-term orientation.
  • 154. 154 Sentiment analysis Sina Weibo users have a stronger tendency to publish positive messages than Twitter users. [Figure: ratio of positive posts against percentage of users, ranging from more negative posts to more positive posts; series: Weibo, Twitter.] High long-term orientation.
  • 155. 155 Combined semantic sentiment analysis The difference is amplified when discussing ‘people’ or ‘location’, with Sina Weibo users even more positive and Twitter users more negative. More long-term orientation in Weibo, more short-term orientation in Twitter.
  • 156. 156 Temporal analysis Twitter users repost messages faster than Sina Weibo users. time distance = t_repost - t_original_post [Figure: time distance (in hours) against percentage of users; series: Weibo, Twitter.] Large degree of power distance in Weibo, small one in Twitter.
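The time distance defined on this slide is the difference between the repost time and the time of the original post. A minimal sketch of that computation in hours follows; the per-user summary by median and the `(user_id, t_original, t_repost)` input shape are assumptions for illustration, since the slide does not say how values were aggregated per user.

```python
from datetime import datetime
from statistics import median

def time_distance_hours(t_original, t_repost):
    """Time distance = t_repost - t_original_post, expressed in hours."""
    return (t_repost - t_original).total_seconds() / 3600.0

def median_distance_per_user(reposts):
    """reposts: iterable of (user_id, t_original, t_repost) datetimes.

    Returns {user_id: median time distance in hours}. Using the median
    is an assumption of this sketch, not something stated in the slides.
    """
    by_user = {}
    for user, t_original, t_repost in reposts:
        by_user.setdefault(user, []).append(
            time_distance_hours(t_original, t_repost)
        )
    return {user: median(values) for user, values in by_user.items()}
```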
  • 157. 157 Cultural differences in tagging Other work confirms the findings, and their consistency with theories of cultural differences between Asian and Western cultures. [Dong et al. 2011] look at cultural differences in a tagging system and find that American and Chinese subjects differed in many ways: • the number and types of tags they applied; • the extent to which they applied suggested tags or entered new tags of their own; and • how often they applied tags that originated from a different culture.
  • 158. 158 Cultural variations for Social Q&A Another example is given by [Yang et al. 2011], who look at cultural differences in people’s social question asking behaviors across the United States, the United Kingdom, China, and India. They analyzed the questions people ask via social networking tools, and their motivations for asking and answering questions online. Results reveal culture as a consistently significant factor in predicting people’s social question and answer behavior.
  • 160. 160 Understand the source When using the knowledge from Twitter as a semantic source, especially if it is the only semantic source, there are a few things one needs to consider that relate to the real-time nature of social contributions. The ‘knowledge’ is not unambiguous: inconsistency, moods, etc. Real-time knowledge spreads and evolves fast.
  • 161. 161 Inconsistency & moods Twitter is used as a semantic sensor, sometimes as the only semantic sensor, but consistency in user contributions like ratings is a concern. [Said et al. 2012] show how users are inconsistent in their ratings and tend to be more consistent for above average ratings. [De Choudhury et al. 2012] report on the relation between moods and social activity, social relations and participatory patterns like link sharing and conversational engagement.
  • 162. 162 Understanding over time While Twitter and the like were used in the beginning as ‘fixed’ sources of knowledge, researchers have become interested in the evolution over time. The nature and speed of the flow of content over time have become great objects of study. Two domains that in this light have received fair attention are those of diseases and (political) news.
  • 163. 163 Flow in disease information The domain of diseases and outbreaks is getting fair attention. Works by [Gomide et al. 2011] on Dengue and [Diaz-Aviles et al. 2012] on EHEC show how people’s behavior on Twitter can be used for surveillance and tasks such as early warning and outbreak investigation.
  • 164. 164 Flow of news From [Naveed et al. 2011] we learn how retweets reflect what the Twitter community considers interesting on a global scale. In [Backstrom et al. 2011] we see the differences between communication and observation in Facebook: communication involves a much higher focus of attention than observation activities. We see in [Lerman et al. 2010] how network structure affects dynamics of how interest in news stories spreads among social networks in Digg and Twitter.
  • 165. 165 Flow in political news Coming back to our observation of the multiple truths, political news is a great domain to look at. For the context of political speech, [Metaxas et al. 2010] discuss how the real-time nature of Twitter provides disproportionate exposure to personal opinions, fabricated content, unverified events, lies and misrepresentations, with viral spread as a consequence. To act upon that, [Lumezanu et al. 2012] identify extreme tweeting patterns that could characterize users who spread propaganda (political propagandists), e.g. sending high volumes of near-duplicate messages.
  • 166. 166 Temporal effects In our [WebSci2011] work, we have considered how user interests are manifest over time. Most users who are interested in a news topic become interested within a few days. Lifespan of users’ interest: • Long-term adopters - continuously interested • Short-term adopters - interested only for a short period in time (and influenced by “global trends”) High overlap between early adopters and long-term adopters.
  • 167. 167 Temporal effects On Twitter the importance of entities for a topic varies over time (long-term vs. short-term entities). In terms of user interests over time, the majority of users become interested in a topic quickly (within a few days). When using Twitter-based profiles for personalization, time-sensitive user modeling improves recommendation quality. Also, the selection of user modeling strategy should take the type of user into account: • Long-term adopters: hashtag-based • Short-term adopters: entity-based
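One of the patterns mentioned here, a high volume of near-duplicate messages, can be illustrated with a simple similarity check. The sketch below is not the method of [Lumezanu et al. 2012]; it swaps in a generic technique (Jaccard similarity over word shingles), and the threshold, shingle size, and function names are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Set of k-word shingles of a message (lowercased, whitespace tokens)."""
    tokens = text.lower().split()
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def near_duplicate_ratio(messages, threshold=0.8, k=3):
    """Fraction of one user's messages that nearly repeat an earlier one.

    threshold and k are illustrative assumptions, not published values.
    """
    seen = []
    duplicates = 0
    for message in messages:
        current = shingles(message, k)
        if any(jaccard(current, earlier) >= threshold for earlier in seen):
            duplicates += 1
        seen.append(current)
    return duplicates / len(messages) if messages else 0.0
```

A user whose ratio is high while also posting at high volume would be a candidate for closer inspection; the pairwise comparison is quadratic in the number of messages, so it only suits small per-user samples.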
  • 168. 168 Twitter-based Trend and User Modeling Framework Twitter posts current tweets of Twitter community news recommender? Profile Semantic Enrichment Profile Type Aggregation Weighting Scheme trends time user’s interests
  • 169. 169 Temporal effects with trends For the domain of personalized news recommendations, we have combined trend and user modeling in our framework. • We have seen how user profiles change over time, under the influence of trends. • Appropriate concept weighting strategies allow for the discovery of local trends. • A time-sensitive weighting function is best for generating trend profiles. Aggregation of trend and user profile can improve the performance of recommendations.
  • 171. 171 Check with the user With all profiles based on augmentation, it becomes (even more) vital to follow the lessons of checking with the user, by engaging with the user in a common process of validating the profile and the assumptions based on it.
  • 172. 172 Perico Dialogue for Modelling Cultural Exposure using Linked Data Initial User Model •  Visited Countries •  Estimated Cultural Exposure Social Web Sensors Perico Dialogue Agent Cultural Fact Extractor Quiz Generator User Profile Generator Dialogue Planner Updated User Model •  Verified Visited Countries •  Enhanced Cultural Exposure Score
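The two ideas on this slide, time-sensitive concept weighting and aggregation of a trend profile with a user profile, can be sketched as below. The slides do not specify the weighting function or the aggregation rule, so the exponential decay with a half-life and the linear mix with parameter `alpha` are assumptions made purely for illustration, as are all function and parameter names.

```python
from collections import Counter

def time_sensitive_profile(mentions, now, half_life_days=7.0):
    """mentions: iterable of (concept, timestamp_in_days).

    Each mention contributes 0.5 ** (age / half_life_days), so recent
    mentions weigh more. Returns weights normalized to sum to 1.
    The decay form and half-life are assumptions of this sketch.
    """
    weights = Counter()
    for concept, timestamp in mentions:
        age = max(0.0, now - timestamp)
        weights[concept] += 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

def aggregate(user_profile, trend_profile, alpha=0.5):
    """Linear mix of a user profile and a trend profile.

    alpha = 1.0 uses only the user profile, alpha = 0.0 only the trends.
    """
    concepts = set(user_profile) | set(trend_profile)
    return {
        c: alpha * user_profile.get(c, 0.0)
        + (1.0 - alpha) * trend_profile.get(c, 0.0)
        for c in concepts
    }
```

The same `time_sensitive_profile` function can be applied to one user's posts for a user profile, or to the posts of the whole community for a trend profile, before mixing the two.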
  • 174. 174 Inspect and control [Knijnenburg et al. 2012] consider how users of social recommender systems may want to inspect and control how their social relationships influence the recommendations they receive: friends are not always “nearest neighbors”. The results show that high inspectability and control indeed increase users’ perceived understanding of and control over the system, their rating of the recommendation quality, and their satisfaction with the system, and thus an overall better user experience.
  • 176. 176 Understanding communities Attention is given to communities and their dynamics. [Chan et al. 2010] propose a method for analysing user communication roles in discussion forums. [Schwagereit et al. 2011] study governance in web communities. [Karnstedt et al. 2011] consider the relation between a user's value within a community - constituted from various user features - and the probability of a user churning. [Yang et al. 2010] analyze users’ activity lifespan in online knowledge sharing communities: acknowledgement of contributions leads to user survival.
  • 177. 177 Involvement in communities In order to understand how people behave on the Social Web and in communities, it is relevant to understand their engagement and involvement in more detail. [Lehmann et al. 2012] study how users engage with online services, and how to measure this engagement. [Freyne et al. 2009] look at how social networking sites rely on the contribution and participation of their members: focus on early interventions for engagement.
  • 178. 178 Communities and expertise Understanding communities is also relevant as these communities can act as an additional resource. From finding evidence for profiles, we have seen recent attention shift towards finding people and expertise, for example to enable active engagement of people. For using expertise in UMAP, it is also important to be able to specify expertise, to enable reasoning about the expertise’s quality and fit.
  • 179. 179 Take home from challenges The (Social) Web tells many stories: •  Acknowledge multiple truths, opposing truths, and bad intentions. •  Acknowledge multiple audiences and viewpoints. •  Acknowledge cultural variations. The (Social) Web moves fast: •  Acknowledge the real-time nature of Web and applications. •  Analyze and understand the flow of information. •  Analyze and understand the nature of communities. The (Social) Web includes people: •  Involve the users actively in validation. •  Involve (communities of) users in interpretation.
  • 182. 182 Social & UMAP Huge economic and societal potential for added value. Social Web is a fertile source of knowledge for augmentation. •  Semantics can be beneficial for social-based augmentation. •  Hybrid, human-enhanced approaches can be beneficial. •  Technological feasibility of augmentation. Research from specific cases towards general theory. Next on the agenda: •  Describe added value for stakeholders, describe goals. •  Share and compare research challenges and evaluations.
  • 183. 183 Web & UMAP UMAP systems are Web systems: •  The (Social) Web tells many stories. •  The (Social) Web moves fast. •  The (Social) Web includes people. The Web is the real laboratory for UMAP systems. Next on the agenda: •  Share and compare solutions, components, and systems. •  Support more uniformity in methods and practices.
  • 184. 184 UMAP & Web On the (Social) Web, systems are being made: •  Take positions or prepare to take positions about bad intentions. •  Take responsibility and recommend about future architectures. On the (Social) Web, many systems are small: •  Do (also) consider the specific problems of small and medium sized stakeholders: bring UMAP into practice.
  • 185. 185 UMAP & Social In SWUMAP, human intelligence is arranged differently: •  From careful manual analysis a priori, to machine analysis on the fly. •  Critical and context-specific approach to data is part of the ‘in vivo’ system. •  Human interpretation of data is inside the hybrid system. It makes for a new type of system, and one of great value. And plenty of fun and diverse challenges for UMAP.
  • 186. 186 APPLICATION HUMANS FOR AUGMENTATION USERSDOMAIN DOMAIN Augmented with Web Semantics USERS Augmented with Web Semantics REAL DOMAIN REAL USERS
  • 187. 187 APPLICATION HUMANS FOR AUGMENTATION USERSDOMAIN DOMAIN Augmented with Web Semantics USERS Augmented with Web Semantics SWUMAP
  • 188. 188 Thanks Slides made with input from many, including Alessandro, Claudia, Fabian, Ilknur, Jan, Jasper, Ke, Qi, and Richard from WIS in Delft, and friends from ImREAL, Net2, SEALINCMedia, and Twitcident.