Citizen Sensor Data Mining, Social Media Analytics and Applications

Citizen sensor data mining,
social media analytics and applications
Singapore Symposium
on Sentiment Analysis (S3A), Feb 6, 2015
Amit Sheth
Kno.e.sis: Ohio Center of Excellence
in Knowledge-enabled Computing
@ Wright State University

Acknowledgements
Significant components of this talk are from the tutorial I gave at WWW2011:
“Citizen Sensor Data Mining, Social Media Analytics and Development
Centric Web Applications,” with Meena Nagarajan and Selvam Velmurugan.
Contributors to Twitris and/or Semantic Social Web Research @ Kno.e.sis:
L. Chen, H. Purohit, W. Wang,
with P. Anantharam, A. Jadhav, P. Kapanipathi, Dr. T.K. Prasad,
and alumni K. Gomadam, M. Nagarajan, A. Ranabahu
Funding: NSF, AFRL, NIH; Collaborations: IBM, Microsoft

Ohio Center of Excellence in Knowledge-enabled Computing
• Among the top 10 universities in the world in World Wide Web research (cf.
10-yr impact, Microsoft Academic Search)
• Largest academic group in the US in Semantic Web + Social/Sensor
Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical &
Biomedicine Applications
• Exceptional student success: internships and jobs at top salaries (IBM
Research, MSR, Amazon, Cisco, Oracle, Yahoo!, Samsung, research
universities, NLM, startups)
• 80+ researchers, including 15 world-class faculty (>3K citations/faculty)
and 45+ PhD students, practically all funded
• $2M+/yr research for largely multidisciplinary projects; world-class
resources; industry sponsorships/collaborations (Google, IBM, …)

Data for mid-2012
http://www.mediabistro.com/alltwitter/social-media-stats-2014_b54243
Never before has humanity been so connected

Citizen Sensors in Action
• Mumbai Terror Attack
• Iran Election 2009
• Haiti Earthquake 2010
• Occupy Wall Street
• Kashmir Floods 2014
Image: http://huff.to/hp0OhA

• Ghonim, who has been a figurehead for the movement
against the Egyptian government, told Blitzer “If you
want to liberate a government, give them the internet.”
• Egyptian anti-government
demonstrator sleeps on the pavement
under spray paint that reads 'Al-
Jazeera' and 'Facebook' at Cairo's
Tahrir square on February 7, 2011.
http://www.cbsnews.com/stories/2011/02/15/eveningnews/main20032118.shtml
Revolution 2.0
Political/Social Activism
• When Blitzer asked “Tunisia, then Egypt, what’s next?,”
Ghonim replied succinctly “Ask Facebook.”
http://cnn.com/video/?/video/world/2011/02/13/nr.social.media.revolution.cnn
http://cnn.com/video/?/video/tech/2011/02/11/barnett.egypt.social.media.cnn

Citizen Journalism
Twitter Journalism
Images: http://bit.ly/9GVfPQ,
http://bit.ly/hmrTYV

• Social News
• Social Media and
Global Media are
inter-twined.
News is increasingly Social

Some of the significant human, social & economic
development applications we work on at Kno.e.sis
• Coordination during disasters (Qatar Computing Research
Institute, Microsoft Research NYC)
• Harassment on social media (WSU cognitive scientists)
• Prescription drug abuse, Cannabis & Synthetic
Cannabinoid epidemiology (Center for Interventions, Treatment
and Addictions Research, ….)
• Depressive disorders (Mayo Clinic)
• Gender-based violence (United Nations)
Highly multidisciplinary team efforts, often with significant
partners, with real-world data, intended to achieve real-world impact

Sample of Real-World Impact & Media Coverage
• Twitter Data Mining Reveals America's Religious Fault Lines,
MIT Technology Review, Oct 6, 2014
• Digital soldiers emerge heroes in Kashmir flood rescue,
Hindustan Times, September 25, 2014
• India's social media election battle, BBC News, Mar 30, 2014
• #Cursing Study: 10 Lessons About How We Use Swear Words on
Twitter, Time.com, Feb 19, 2014
• Twitris: Taking Crisis Mapping to the Next Level, Tech President,
June 24, 2013
• Picking the President: Twindex, Twitris Track Social Media
Electorate, Semanticweb.com, Aug 3, 2012
• Web App Analyzes Tweets in Real Time for a Record of Historic
Events, Mashable.com, Feb 17, 2012

TWITRIS’ Technical Approach to
Understand & Analyze Social Content
Social Data is
incredibly rich

Some of the topics on Online Social Media
we research at Kno.e.sis
1. Named Entity Recognition
2. Language usage in Social Media
3. Exploration of People, Content and Network dynamics
4. Sentiment, Emotion and Opinion mining
5. Trust
6. Integrated exploitation of Sensor (Physical), Web (Cyber)
and Social data for PCS applications
7. TWITRIS: A System for Mining Collective Intelligence
from Citizen-Sensor Data

Social Information Processing
• "Who says what, to whom, why, to what extent and with what effect?" [Lasswell]
• Network: social structure emerges from the aggregate of relationships (ties)
• People: poster identities, the active effort of accomplishing interaction
• Content: studying the content of communication

Why People-Content-Network +
Spatial-Temporal-Thematic metadata?
(Example of Understanding Crisis Data)

• Explicit information from user profiles
– User Names, Pictures, Videos, Links, Demographic Information,
Group memberships...
• Implicit information from user attention metadata
– Page views, Facebook 'Likes', Comments; Twitter 'Follows',
Retweets, Replies..
People Metadata:
Variety of Self-expression Modes
on Multiple Social Media Platforms

People Metadata: Various Types
Identification
Structural Network
Activity
Interests

People Metadata: Continued
User Identification Metadata
• User-id
• Screen/Display-name of user
• Real name of user
• Location
• Profile Creation Date
• User description
- Biodata of the user
- Link to webpage of the user
Interest Metadata
• Author type
- Trustee/donor, journalist, blogger,
scientist etc.
• Favorite tweets
• Types of lists subscribed
• Style of Writing (personality
indicator)
• No. of Followees
• Majority of author type of
Followees

People Metadata: Continued
Web Presence:
- User affiliations
- Influence Metric – e.g., KLOUT (www.klout.com)
Activity Metadata
• Age of the profile
• Frequency of posts
• Timestamp of last status
• No. of Posts
• No. of Lists/groups created
• No. of Lists/groups subscribed
Influence Metadata
(Inferring People Metadata from Network level Information)
• No. of Followers – normal, influential
• No. of Mentions
• No. of Retweets/Forwards
• No. of Replies
• No. of Lists/groups following
• No. of people following back
• Authority & Hub Scores

Content Metadata:
Content Dependent (Tweet)
Direct Content-based Metadata
Indirect content-based metadata (External metadata)

Direct Content-based Metadata
Content Metadata:
Content Dependent (SMS)

Connections/Relationships matter! (foundation for the network)
Network Metadata
Structure Metadata
• Community Size
• Community growth rate
• Largest Strongly Connected
Component size
• Weakly Connected Components &
Max(WCC) size
• Average Degree of Separation
• Clustering Coefficient
Relationship Metadata
• Type of Relationship
• Relationship strength
• User Homophily (based on certain
characteristic such as location,
interest etc.)
• Reciprocity: mutual relationship
• Active Community/ Ties
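Two of the network metadata items listed on this slide, reciprocity and the clustering coefficient, can be computed directly from a follower edge list. The sketch below is illustrative only: the edge list, the function names, and the choice to treat edges as undirected for clustering are assumptions, not taken from the talk.

```python
from collections import defaultdict

def reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse edge (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)

def clustering_coefficient(edges, node):
    """Local clustering coefficient of `node`, treating edges as undirected."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    nbrs = sorted(neighbors[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Count pairs of neighbors that are themselves connected.
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in neighbors[nbrs[i]]
    )
    return 2 * links / (k * (k - 1))

# Hypothetical follower edges: (follower, followee)
follows = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "b"), ("d", "a")]
print(reciprocity(follows))
print(clustering_coefficient(follows, "a"))
```

In practice a graph library would likely be used for community-scale data; the pure-Python version is only meant to make the two definitions concrete.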

Metadata Creation & Extraction
Length: 109 characters
General topic: Egypt protest
This poor {sentiment_expression: {target: “Lara Logan”,
polarity: “negative”}} woman! RT @THR CBS News’
{entity: {type: “News Agency”}} Lara Logan
{entity: {type: “Person”}} Released From Hospital
{entity: {type: “Hospital”}} After Egypt
{entity: {type: “Country”}} Assault {topic} http://bit.ly/dKWTY0
{external_URL}
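One way to hold the annotations from this example is as standoff spans over the raw tweet text, so the text itself stays untouched. This is a minimal sketch, not the Twitris representation: the `Annotation` class, its field names, and the choice of which words carry each annotation are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    start: int                 # character offset where the span begins
    end: int                   # character offset just past the span
    kind: str                  # e.g. "entity", "sentiment_expression"
    attrs: dict = field(default_factory=dict)

text = ("This poor woman! RT @THR CBS News' Lara Logan Released From "
        "Hospital After Egypt Assault http://bit.ly/dKWTY0")

def annotate(text, phrase, kind, **attrs):
    """Build a span annotation for the first occurrence of `phrase`."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), kind, attrs)

annotations = [
    annotate(text, "poor", "sentiment_expression",
             target="Lara Logan", polarity="negative"),
    annotate(text, "CBS News", "entity", type="News Agency"),
    annotate(text, "Lara Logan", "entity", type="Person"),
    annotate(text, "Hospital", "entity", type="Hospital"),
    annotate(text, "Egypt", "entity", type="Country"),
    annotate(text, "http://bit.ly/dKWTY0", "external_URL"),
]

print(len(text))  # content-independent metadata: tweet length
for a in annotations:
    print(text[a.start:a.end], a.kind, a.attrs)
```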

Metadata Extraction from
Informal Text
Meena Nagarajan, ‘Understanding User-Generated Content on Social Media,’ Ph.D. Dissertation, Wright State University, 2010

Content Analysis: Typical Sub-tasks
• Recognize key entities mentioned in content
– Information Extraction (entity recognition, anaphora resolution, entity
classification..)
– Discovery of Semantic Associations between entities
• Topic Classification, Aboutness of content
– What is the content about?
• Intention Analysis
– Why did they share this content?
• Sentiment Analysis
– What opinions are people conveying via the content?
• Author Profiling
– What can we infer about the author from the content he posts?
• Context (external to content) extraction
– URL extraction, analyzing external content

• Named Entity Recognition
– I loved <movie> the hangover </movie>!
• Key Phrase Extraction
NER, Key Phrase Extraction

Named Entity Recognition
“I loved your music Yesterday!”
Yesterday is an album
“It was THE HANGOVER of the year..lasted forever..”
The Hangover is not a movie
“So I went to the movies..badchoice picking “GI Jane” worse now”
GI Jane is a movie
Task of NER: identifying and classifying tokens

Analysing the Content can be Hard…
Using a domain model (E.g., MusicBrainz)
Using context cues from the content
• e.g. new Merry Christmas tune
Reduce potential entity spot size (with restrictions)
• e.g. new albums/songs
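The two ideas on this slide, spotting against a domain model and requiring context cues, can be sketched as a dictionary spotter that only fires when a music-related cue word is present. The title list below is a tiny hypothetical stand-in for a real domain model such as MusicBrainz, and the cue list and function name are also assumptions.

```python
import re

# Hypothetical stand-in for titles drawn from a music domain model.
ALBUM_TITLES = {"yesterday", "merry christmas"}
# Hypothetical context cues suggesting the post is about music.
MUSIC_CUES = {"music", "song", "songs", "album", "albums", "tune", "track"}

def spot_music_entities(post):
    """Return known titles mentioned in `post`, but only if a music cue is present."""
    lowered = post.lower()
    tokens = set(re.findall(r"[a-z']+", lowered))
    if not tokens & MUSIC_CUES:
        return []  # no supporting context, so treat matches as ordinary words
    return sorted(
        title for title in ALBUM_TITLES
        if re.search(r"\b" + re.escape(title) + r"\b", lowered)
    )

print(spot_music_entities("I loved your music Yesterday!"))
print(spot_music_entities("I saw her yesterday"))
print(spot_music_entities("new Merry Christmas tune"))
```

Requiring a cue trades recall for precision: a post that names an album without any music vocabulary is missed, which is one reason the slide also suggests restricting the candidate set.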

Music NER application : BBC SoundIndex
(IBM Almaden)
Pulse of the Online Music Populace
Daniel Gruhl, Meenakshi Nagarajan, Jan Pieper, Christine Robson, Amit Sheth: ‘Multimodal Social Intelligence in a Real-Time Dashboard System,’
special issue of the VLDB Journal on "Data Management and Mining for Social Networks and Social Media", 2010
Project: http://www.almaden.ibm.com/cs/projects/iis/sound/

The Vision
http://www.almaden.ibm.com/cs/projects/iis/sound/

Several Insights
• Only 4% negative sentiment: perhaps ignore the Sentiment
Annotator on this data source?
• Ignoring spam can change the ordering of popular artists
• Trending popularity of artists
• Trending topics in artist pages

Predictive Power of Data
• Billboard's Top 50 Singles chart during the week of Sept 22-28, 2007
vs. MySpace popularity charts.
• User study indicated a 2:1 and up to 7:1 (younger age groups)
preference for the MySpace list.
• Challenging traditional polling methods!

Key Phrase Extraction - Example
• Key phrases extracted from prominent discussions on
Twitter around the 2009 Health Care Reform debate and
2008 Mumbai Terror Attack on one day

M. Nagarajan et al., Spatio-Temporal-Thematic Analysis of Citizen-Sensor Data - Challenges and Experiences, Tenth International Conference on Web
Information Systems Engineering, Oct 5-7, 2009: 539-553
TF-IDF vs. Spatio-temporal-thematic scores rank phrases differently
Foreign relations
surfaces up
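For reference, the TF-IDF baseline named on this slide can be sketched in a few lines. The phrase lists are hypothetical, and the spatio-temporal-thematic scoring from the cited paper is not reproduced here; the sketch only shows why a phrase that appears in every document gets no weight under TF-IDF.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Score each term per document: (term count / doc length) * log(N / doc frequency)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return scores

# Hypothetical key phrases extracted from three batches of tweets.
docs = [
    ["mumbai", "attack", "taj", "attack"],
    ["mumbai", "foreign relations"],
    ["mumbai", "attack"],
]
for doc_scores in tf_idf(docs):
    print(sorted(doc_scores.items(), key=lambda kv: -kv[1]))
```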

Why do people share?
• Outside of the psychological incentives, broadly, people
share to Seek Information OR Share Information
• If we understand the intent behind a post, we can build
systems that respond to it better
• An application: Understand intent to deliver targeted
content
– Use case: Online Content-Targeted Advertisements on Social Media
Platforms

Circa 2009 - Content-based Ads

Today – Content-based Ads on Profiles

What is going on here..
• Ads are targeted on profile interests, demographic data
• But Interests on profiles do not translate to purchase
intents
– Interests are often outdated..
– Intents are rarely stated on a profile..
• Some profile data does seem to work
– Example: New store openings, sales targeted at location
information in a profile

But Monetizable Intents are Elsewhere,
away from their profiles..

Showing clear intents on MySpace
posts but no relevant ads..

Targeted Content-based Advertizing
– Non-trivial
– Non-policed content
• Brand image, unfavorable sentiments
– People are there to network
• User attention to ads is not guaranteed
– Informal, casual nature of content
• People are sharing experiences and events
– Main message overloaded with off-topic content
I NEED HELP WITH SONY VEGAS PRO 8!! Ugh and i have a
video project due tomorrow for merrill lynch :(( all i need
to do is simple: Extract several scenes from a clip, insert
captions, transitions and thats it. really. omgg i cant figure
out anything!! help!! and i got food poisoning from eggs.
its not fun. Pleasssse, help? :(
[1] Learning from Multi-topic Web Documents for Contextual Advertisement, Zhang, Y., Surendran, A. C., Platt, J. C., and Narasimhan, M., KDD 2008

Focus: Discuss Methodology,
Preliminary Results in…
• Identifying intents behind user posts on social networks
– Identify Content with monetization potential
• Identifying keywords for advertizing in user-generated
content
– Considering interpersonal communication & off-topic chatter
M. Nagarajan et al., ‘Monetizing User Activity on Social Networks - Challenges and Experiences,’ 2009 IEEE/WIC/ACM International Conference on Web
Intelligence, Sep 15-18 2009: 92-99

Result - 8X more interest for non-profile
ads..
• Using profile ads
– Total of 56 ad impressions
– 7% of ads generated interest
• Using authored posts
• Using topical keywords from authored posts

Sentiment Analysis: Motivation
• Which movie should I see?
• What do customers complain about?
• Why do people oppose health care reform?
Image: http://bit.ly/eZtKBF

Content Analysis:
Sentiment Analysis/Opinion Mining
• Two main types of information we can learn from user-
generated content: fact vs. opinion
• Much of social media text (e.g., blogs, Twitter, Facebook)
is a mix of facts and opinions.
• Extracting structured sentiment information from
unstructured content
• Allowing computation to be done on “what people think”
and “how people feel”

• From coarse-grained to fine-grained
– Document level -> sentence level -> expression level
– General sentiment -> domain-dependent sentiment -> target-
dependent sentiment
• From static to dynamic
– Our attitude can be changed during social communication.
• Modeling, detecting, and tracking the change of attitude
• What leads to the change of attitude? E.g., persuasion
campaign
Sentiment Analysis: Challenges

Sentiment Analysis:
Target-specific Opinion Identification
Observations:
• The opinion clues may not be toward the given target
(1,2,3,6)
• The opinion clues are domain and context dependent
(5,7)
• Single words are not enough (4,7,8)
Simple lexicon-based method doesn't work well.
Target of “sexy” is “Helena”
Target of “terrific” is “reviews”
“free” is not opinionated in
movie domain.
Target of “loving” is “telling”
“well” in “as well” is not
opinionated
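To illustrate why a simple lexicon-based method falls short, here is a minimal word-counting scorer. The mini-lexicon and the example sentence are hypothetical, chosen to mirror the observations on this slide: "free" is not opinionated in the movie domain, and "well" inside "as well" carries no opinion.

```python
# Hypothetical mini-lexicon; real lexicons are far larger.
POSITIVE = {"terrific", "loving", "well", "free", "sexy"}
NEGATIVE = {"poor", "boring"}

def lexicon_polarity(text):
    """Label text by counting positive and negative lexicon words, ignoring targets."""
    tokens = [t.strip(".,!?\"'").lower() for t in text.split()]
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# The only real opinion about the movie is "boring", yet "free" and
# "well" (in "as well") each add a positive count.
example = "Got free tickets to the movie, it was boring as well"
print(lexicon_polarity(example))
```

The scorer has no notion of the opinion target or of multi-word expressions, which is the gap that target-dependent approaches aim to close.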

Extracting a diverse and richer
set of sentiment-bearing
expressions, including formal
and slang words/phrases
Assessing the
target-dependent polarity
of each sentiment
expression
A novel formulation of assigning
polarity to a sentiment expression
as a constrained optimization
problem over the tweet corpus
Extracting Diverse Sentiment Expressions
With Target-dependent Polarity from Twitter [Chen et al. ICWSM 2012]

The Usage of Background Knowledge

Sentiment Analysis:
Feature and Aspect Extraction
Motivation
• To understand a user’s opinions about a product at a fine-grained
level, support opinion summarization for products, and
automatically extract pros and cons from reviews, it is essential to
identify product features and aspects.
Impact
• Existing methods tend to require seed terms and focus on
identifying explicit features or a few high-level aspects.
• Our approach is capable of identifying both explicit and implicit
aspects and does not require any labeling efforts.
Approach
• We use a combination of corpus-based association measures and
semantic similarity measures to identify product aspects in an
efficient clustering-based approach.

Clustering for Aspect Discovery in Opinion Mining [Chen et al.
in submission]

It is actually about tracking public opinion.
Polling or Social Media Analysis?
1. Sample size
2. Representative of the target population
3. Accurate measure of opinions
4. Timeliness

• We study different groups of social media users who
engaged in discussions of the 2012 U.S. Republican
Presidential Primaries, and compare the predictive
power among these user groups.
• Existing studies on predicting election results assume
that all users should be treated equally.
• How do different groups of users differ in
predicting election results?
Harnessing the Power of Social Data
to Predict Election Results [Chen et al., SocInfo 2012]

User Categorization
1. Engagement Degree
2. Tweet Mode
3. Content Type
4. Political Preference

Predicting a User's Vote
• Basic idea: for which candidate the user shows the most
support
– Frequent mentions
– Positive sentiment
Nm(c): the number of tweets mentioning candidate c
Npos(c): the number of positive tweets about candidate c
Nneg(c): the number of negative tweets about candidate c
Smoothing parameter: a value strictly between 0 and 1
Discounting parameter: a value strictly between 0 and 1 that discounts the score
when the user does not express any opinion towards c
The score distinguishes two cases:
• The user posted an opinion about c
• The user mentioned c but did not post an opinion about c
Properties of the score:
• More mentions, higher score
• More positive/fewer negative opinions, higher score
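The exact scoring formula is not recoverable from this slide, so the sketch below is a hypothetical stand-in, not the formula from the cited paper. It is only built to respect the stated properties: more mentions raise the score, more positive and fewer negative tweets raise the score, and a discount applies when the user mentions a candidate without expressing an opinion. The function names, parameter names (`smoothing`, `discount`), and all counts are assumptions.

```python
def support_score(n_mentions, n_pos, n_neg, total_mentions,
                  smoothing=0.5, discount=0.5):
    """Hypothetical support score of one user for one candidate."""
    if total_mentions == 0 or n_mentions == 0:
        return 0.0
    mention_share = n_mentions / total_mentions
    if n_pos + n_neg == 0:
        # Mentioned the candidate but posted no opinion: discounted score.
        return discount * mention_share
    # Smoothed fraction of opinion tweets that are positive.
    positivity = (n_pos + smoothing) / (n_pos + n_neg + 2 * smoothing)
    return mention_share * positivity

def predict_vote(counts, smoothing=0.5, discount=0.5):
    """Pick the candidate with the highest support score for this user."""
    total = sum(c["mentions"] for c in counts.values())
    return max(
        counts,
        key=lambda name: support_score(
            counts[name]["mentions"], counts[name]["pos"], counts[name]["neg"],
            total, smoothing, discount,
        ),
    )

# Hypothetical per-user tweet counts for three placeholder candidates.
user_counts = {
    "candidate_a": {"mentions": 10, "pos": 6, "neg": 1},
    "candidate_b": {"mentions": 12, "pos": 1, "neg": 5},
    "candidate_c": {"mentions": 3, "pos": 0, "neg": 0},
}
print(predict_vote(user_counts))
```

Here candidate_b is mentioned most often but mostly negatively, so the prediction goes to candidate_a; this is the behavior the two stated properties are meant to produce.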

Twitter users are not “equal” in predicting elections!
• Revealing the challenge of identifying the vote intent of the “silent majority”
• Retweets may not necessarily reflect users' attitudes.
• Prediction of a user’s vote based on more opinion tweets is not
necessarily more accurate than prediction using more information tweets.
• The right-leaning user group provides the most accurate
prediction result. It correctly predicts the winners in 8 out of 10 states,
with an average prediction error of 0.1.
• To some extent, this demonstrates the importance of identifying likely
voters in electoral prediction.

Emotion Mining: Motivation
• Emotion is essential to all aspects of our lives.
– Influences our decision-making
– Affects our social relationships
– Shapes our daily behavior
• Emotional mental health
– New mothers may suffer from post-partum depression
– Veterans may constantly suffer from negative emotions because
of post-traumatic stress disorder

Emotion Mining: What Have We Studied?
• Can we automatically create a large emotion dataset
with high quality labels from Twitter? How?
• What features can effectively improve the performance
of supervised machine learning algorithms?
• Can the system developed on Twitter data be directly
applied to identify emotions from other datasets?
• What can we learn about emotion from social media
data?

• Collect self-annotated emotion tweets [Wang et al., SocialCom 2012]
– Seven emotions: joy, sadness, anger, love, fear, surprise, thankfulness
“When I see a cop, no matter where I am or what I’m doing, I
always feel like every law I’ve ever broken is stamped all over
my body #fear”
“I hate when my mom compares me to my friends. #anger”
“I hate when I get the hiccups in class. #embarrassing”
Harnessing Twitter “Big Data” for
Automatic Emotion Identification [Wang et al., SocialCom 2012]
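The self-annotation idea, using a trailing emotion hashtag as the label and the remaining text as the training example, can be sketched as follows. This is a simplified illustration under stated assumptions: it accepts only hashtags that literally match the seven emotion names, whereas the cited work may use a broader hashtag list and additional filtering; `self_label` is a hypothetical helper.

```python
import re

# Assumption: one hashtag per emotion, spelled exactly like the emotion name.
EMOTIONS = {"joy", "sadness", "anger", "love", "fear", "surprise", "thankfulness"}
TRAILING_TAG = re.compile(r"#(\w+)\s*$")

def self_label(tweet):
    """Return (text_without_tag, emotion) if the tweet ends in an emotion hashtag."""
    stripped = tweet.strip()
    match = TRAILING_TAG.search(stripped)
    if not match:
        return None
    tag = match.group(1).lower()
    if tag not in EMOTIONS:
        return None  # e.g. #embarrassing is not one of the seven labels
    return stripped[:match.start()].strip(), tag

print(self_label("I hate when my mom compares me to my friends. #anger"))
print(self_label("I hate when I get the hiccups in class. #embarrassing"))
```

Removing the hashtag from the text matters: if it stayed in, a classifier could learn the label from the tag itself rather than from the wording.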

The more data, the merrier
[Figure: accuracy (0.4 to 0.65) vs. number of tweets in training data (1,000 to 1,991,184) for LIBLINEAR and MNB]
Results of performing seven emotion classifications

Discovering Fine-grained Emotion
in Suicide Notes [Wang et al. BII12]
• Automatically classify sentences in suicide notes into 15
categories
• Emotion categories
– Positive
• Hopefulness, thankfulness, forgiveness, love, pride, happiness
– Negative
• Sorrow, abuse, anger, hopelessness, guilt, blame, fear
• Other categories
– Information, instructions

Sentence: “Found out today that // I passed my math STAAR test.”
• N-gram features
• Unigram, e.g., found, today, passed, etc.
• Bigram, e.g., found_out, out_today, etc.
• N-gram position
– Unigram: found-1, out-1, today-1, …, I-2, passed-2, my-2, …
• Knowledge-based features:
– LIWC (Pennebaker et al., 2014a)
– WordNet-Affect (Strapparava and Valitutti, 2004)
– MPQA (Wilson et al., 2005)
• Syntactic features:
– Part-of-speech tags, e.g., Found/VBN out/RP today/NN that/IN I/PRP
passed/VBD…
– Dependency relations, e.g., root(ROOT-0, Found-1); ccomp(Found-1, passed-6);
dobj(passed-6, test-10) …
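The n-gram and n-gram position features for the example sentence can be produced with a short function. This is a sketch under assumptions: the `//` marker is read as a segment boundary, the position feature is the segment number, and all tokens are lower-cased (so the slide's `I-2` appears as `i-2`). The knowledge-based and syntactic features need external resources and are not covered here.

```python
import re

def extract_features(sentence, sep="//"):
    """Return unigram, bigram, and unigram-position features for one sentence."""
    unigrams, positions = [], []
    for seg_no, segment in enumerate(sentence.split(sep), start=1):
        tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9']+", segment)]
        unigrams.extend(tokens)
        # Position feature: token plus the number of the segment it occurs in.
        positions.extend(f"{t}-{seg_no}" for t in tokens)
    # Bigrams run over the whole sentence, ignoring the segment marker.
    bigrams = [f"{a}_{b}" for a, b in zip(unigrams, unigrams[1:])]
    return {"unigram": unigrams, "bigram": bigrams, "position": positions}

feats = extract_features("Found out today that // I passed my math STAAR test.")
print(feats["unigram"])
print(feats["bigram"][:2])
print(feats["position"])
```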

Winner: N-gram(1,2), knowledge-based and syntactic features

Cursing in English on Twitter [Wang et al. CSCW14]
• The main reason people use curse words is to express
strong emotions, especially anger and frustration. [Jay 1992, 2000;
McEnery 2006; Nasution and Rosa 2012]

Normalized Emotion Distributions over Time in Eastern Standard Time
Normalized Emotion Distributions over Days (EST)
“I am so thankful for my family && close friends. They hold me together
when everything else around me is falling apart. #SoBlessed #Thankful”

Normalized Emotion Distributions over Time (EST)
“I thank God everytime I see another day :*) #thankful .”

What are the top Emotions Associated with Moms and Dads?
Rank  Mom                    Dad
1     Irritation (7,562)     Irritation (3,034)
2     Sadness (2,315)        Sadness (1,363)
3     Affection (2,225)      Embarrassment (1,158)
4     Zest (2,213)           Zest (1,035)
5     Embarrassment (1,849)  Affection (1,030)
6     Thankfulness (1,537)   Cheerfulness (911)
7     Cheerfulness (1,332)   Envy (902)
“I hate when my dad uses my laptop. Its mine. Not yours. You have your own computer.
I have shit to do, get off now please. #annoyed”
“ugh my mom gets so nervous when i drive #annoying”
“My mom just told me I can't open any presents early cause I'm too old for that #depressing”

PEOPLE ANALYSIS
- Deriving People Metadata
- from Content Analysis
- from Network Analysis
- Merge of two approaches
- People-Content-Network Analysis to leverage the metadata
- Finding Influential Users
- Finding User Types & Affiliation
- Measuring Social Engagement
- Leverage communities to assist coordination

People Analysis:
Social Engagement & Coordination
Imagine a crisis scenario such as the Haiti earthquake (2010) or
Hurricane Sandy (2012)
- emergency teams are looking for ways to help the victims
• What are the best possible ways to communicate:
identify and engage people
• Between resource providers (supply) and people in
need of resources (demand)
• Topical community influencers
• How can response teams coordinate social media
communities among volunteers, managers in the
organizational structure, and resource seekers?

People Analysis: Who is asking for help, Who is offering to help?
Smart Data in the context of Disaster Management
ACTIONABLE: Timely delivery of the
right resources and information
to the right people at the right
location!
Because everyone wants to help, but they DON’T KNOW HOW!

Disaster Response Coordination:
Finding Actionable Nuggets for Responders to act
Really sparse Signal to Noise:
• 2M tweets during the first 48 hrs. of #Oklahoma-tornado-2013
- 1.3% are precise requests for resource donations
- 0.02% are precise offers of resource donations
Example requests:
• Text REDCROSS to 909-99 to donate to
those impacted by the Moore tornado!
http://t.co/oQMljkicPs (REQUEST)
• Please donate to Oklahoma disaster
relief efforts: http://t.co/crRvLAaHtk
(REQUEST)
Example offers:
• Anyone know how to get involved to
help the tornado victims in
Oklahoma?? #tornado #oklahomacity
(OFFER)
• I want to donate to the Oklahoma cause
shoes clothes even food if I can (OFFER)
For responders, the most important information for managing
coordination dependencies is
the scarcity and availability of resources
Blog by our colleague Patrick Meier on this analysis: http://irevolution.net/2013/05/29/analyzing-tweets-tornado/
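To put the sparsity figures quoted for the #Oklahoma-tornado-2013 data in absolute terms, the percentages can be converted to counts. The sketch treats "2M" as exactly 2,000,000 tweets, which is an assumption; integer arithmetic is used to avoid floating-point rounding.

```python
total_tweets = 2_000_000  # "2M tweets", taken as an exact figure for illustration

# 1.3% = 13 per 1,000; 0.02% = 2 per 10,000
requests = total_tweets * 13 // 1_000
offers = total_tweets * 2 // 10_000

print(requests)            # 26000 donation requests
print(offers)              # 400 donation offers
print(requests // offers)  # 65 requests for every offer
```

Roughly 26,000 requests against 400 offers means that, at this scale, the offers are the scarcer side of any request-offer matching.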

People Analysis: Match demander-suppliers
for coordination during crisis
Purohit, H., Castillo, C., Diaz, F., Sheth, A., & Meier, P. (2013). Emergency-relief coordination on social media: Automatically
matching resource requests and offers. First Monday, 19(1).

Demand-Supply identification and
representation: core & facets
• Extract the core of the phrase: “what”
– Other facets include “who”, “where”, “when”, etc.
• Supervised learning to classify items for demands, supplies, and
resource type facets
Example tweet: Rotary collecting clothing and other donations in New Jersey <URL>
Corresponding data item in the semi-structured knowledge inventory:
{ source: “Twitter”, author: “@NN”, text: “Rotary collecting clothing and
other donations in New Jersey <URL>”, donation-info: { donation-type:
“Request”, donation-type-confidence: 0.8, donation-organization: “Rotary”,
donation-item: “clothing and other donations”, donation-location: “New
Jersey” }, … }
• IR model approach to match demand (request) with supply (offer)
items in this semantically annotated knowledge inventory
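The matching step can be illustrated with a very simple similarity measure. The slide does not specify the IR model, so the sketch below swaps in plain token-overlap (Jaccard) similarity over the item and location facets as a stand-in; the offer records, their `id` field, and the function names are hypothetical.

```python
import re

def tokens(text):
    """Lower-cased word set of a facet value."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def facet_tokens(record):
    return tokens(record["item"]) | tokens(record["location"])

def best_offer(request, offers):
    """Return the offer whose item and location overlap most with the request."""
    req = facet_tokens(request)
    return max(offers, key=lambda offer: jaccard(req, facet_tokens(offer)))

# Request facets taken from the slide's example; offers are hypothetical.
request = {"item": "clothing and other donations", "location": "New Jersey"}
offers = [
    {"id": "o1", "item": "bottled water", "location": "New York"},
    {"id": "o2", "item": "winter clothing", "location": "New Jersey"},
]
print(best_offer(request, offers)["id"])
```

A production matcher would likely weight facets differently and normalize resource types; token overlap is only the smallest thing that shows request and offer records being compared facet by facet.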

Leveraging Communities for Whom
to Engage With, Why and How
Purohit et al., User Taglines: Alternative Presentations of Expertise and Interest in Social Media. ASE Social Informatics, 2012

Network Analysis
Interesting questions to ask:
• How communities form around topics: growth & evolution
• What are the effects of influential participants in the communities
• What are the effects of the nature of content (or sentiment, opinions)
flowing in the network on community structures and growth
• What is the community structure: degree of separation and
sub-communities that contribute to macro-level effects, e.g.,
coordination, engagement
“To discover how A, who is in touch with B and C,
is affected by the relation between B and C”
- John Barnes
Foundation of network:
•Nodes
•Connections/Relationships
Image: http://www.onasurveys.com/

Graphs showing sparse (A) and dense (B) RT networks and their
corresponding follower graphs for 'call for action' and
'information sharing' tweet content types
M. Nagarajan, H. Purohit, and A. Sheth, ‘A Qualitative Examination of Topical Tweet and Retweet Practices,’ 4th Int'l AAAI Conference on Weblogs
and Social Media, ICWSM 2010

Understanding Evolving Community
Structures for Coordination
User interaction networks of two topical communities– Occupy LA and Chicago,
of emerging influencers during Occupy Wall Street (OWS) event 2011
Application of evolving communities:
H. Purohit, J. Ajmera, S. Joshi, A. Verma, A. Sheth. Finding Influential Authors in Brand-Page Communities. 6th Int'l AAAI Conference on Weblogs and
Social Media (ICWSM), Dublin, Ireland, June 5-7, 2012

Understanding Community Evolution for
Real-World Actions
Evolution of influencer interaction networks for Romney vs. Obama topical
communities, during U.S. Presidential Election 2012 debates
[Figure: Romney and Obama networks at four stages: before 1st debate, after 1st debate, after Hurricane Sandy, after 3rd debate]
Social Media analysis for US elections 2012, powered by Twitris: http://analysis.knoesis.org/uselection/insights/

On Understanding the Divergence of
Online Social Group Discussion
• Change of group discussion divergence over time, and different
phases of real world events
• Relation between discussion divergence and existing theories of
social cohesion and social identity in Psychology
• Prediction of future change in the group discussion divergence
Research Questions on Social Dynamics in Communities
Acknowledgement:
NSF SoCS grant for ‘Leveraging Social Media during Emergency
Response’
Purohit, H., Ruan, Y., Fuhry, D., Parthasarathy, S., & Sheth, A. (2014, May). On Understanding Divergence of Online Social Group
Discussion. In 8th Intl AAAI Conference on Weblogs and Social Media.

• Prior work:
– Focus on structural metrics to understand group evolution
dynamics, but may not be sufficient to answer ‘WHY a group
diverges over time’
• Our approach:
– Content driven measure: collective divergence of group
members for topics of discussion
– Features assessing role of socio-psychological theories:
cohesion & identity
• Data:
– Tweets during evolving events of natural disasters, and social
activism
Contrasting Prior Work and Approach
Evolution of groups in online
social communities
surrounding events

• During #sandy, predicted low-diverging (focused) groups to
engage with on flight updates: first delays and
cancellations, then resumption
• Natural disaster (D) events
(Hurricane Irene and Sandy)
have stronger correlations
with identity-driven features
than with cohesion features
• We predicted group discussion divergence
across phases with 0.83 AUC

Continuous Semantics for Evolving Events to Extract Smart Data

Dynamic Model Creation
Continuous Semantics

Live Demo of Powerful Social
Media Analysis: Twitris

Twitris - Motivation
1. Information Overload
• Multiple events around us
• WHAT to be aware of
• Multiple storylines about the same
event!
Image: http://bit.ly/etFezl

2. Evolution of Citizen Observation
• with location and time

3. Semantics of Social perceptions
• What is being said about an event (theme)
• Where (spatial)
• When (temporal)
Twitris lets you browse citizen reports using social
perceptions as the fulcrum

Twitris: Semantic Social Web Mash-up
Facilitates understanding of multi-dimensional social perceptions over
SMS, Tweets, multimedia Web content, electronic news media

Twitris: Architecture
Meenakshi Nagarajan, Karthik Gomadam, Amit Sheth, Ajith Ranabahu, Raghava Mutharaju and Ashutosh Jadhav, ‘Spatio-Temporal-Thematic
Analysis of Citizen-Sensor Data - Challenges and Experiences,’ Tenth International Conference on Web Information Systems Engineering, 539 - 553,
Oct 5-7, 2009.

Twitris: Functional Overview

Twitris: Event Summarization

Twitris: Real-time information
• Incoming tweets with need types give a quick idea of what
is needed and where, currently (#OKC)
• Legends for different needs (#OKC)
• Clicking on a tag brings contextual
information: relevant tweets,
news/blogs, and Wikipedia articles

Twitris: Analysis by location for contrast in
social perceptions
• How people from different parts of the world talked
about the US Election
• Images and videos related to the US Election

Twitris: Sentiment Analysis
• Sentiment Analysis
– using statistical and machine learning techniques

Twitris: Sentiment Analysis - Smart
Answers with reasoning!
How was Obama doing in the first debate?

The Dead People mentioned
in the event OWC
Twitris: Impact of Background
Knowledge

Twitris: Demo, Quick Show
http://twitris2.knoesis.org/
• Many other interesting efforts, e.g.: Vivek K. Singh, Mingyan Gao, and Ramesh
Jain. 2010. From microblogs to social images: event analytics for situation
assessment. In Proceedings of the international conference on Multimedia
information retrieval (MIR '10). ACM, New York, NY, USA, 433-436.

Conclusions
• Do you have a sense of the immense opportunity in analyzing
citizen sensing for useful social signals?
• Do you appreciate the broad range of issues and challenges?
Did we present examples and a few insights into how to
address some unique challenges?
• Did the spatio-temporal-thematic, people-content-network, and
emotion-sentiment-intent dimensions present a reasonable way
to organize the vast number of relevant research challenges and
techniques?

Thank you, and please visit us at
http://knoesis.org
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
