Mining Citizen Sensor Communities to Improve
Cooperation with Organizational Actors
June 23 2015
PhD Defense
Hemant Purohit (Advisor: Prof. Amit Sheth)
Kno.e.sis, Dept. of CSE, Wright State University, USA
@hemant_pt
Outline
—  Citizen Sensor Communities & Organizations
—  Cooperative System Design Challenges
—  Contributions
—  Problem 1. Conversation Classification using Offline Theories
—  Problem 2. Intent Classification
—  Problem 3. Engagement Modeling
—  Applications
—  Limitations & Future Work
Citizen Sensors: Access to Human
Observations & Interactions
Uni-directional communication
(TO people)
Unstructured, Unconstrained Language Data
•  Ambiguity
•  Sparsity
•  Diversity
•  Scalability
Bi-directional
(BY people, TO people)
Web 2.0
media
Goal: Data to Decision Making
Organizational Decision Making
Noisy Citizen Sensor data
SOCIAL SCIENCE
•  Experts on Organizations
•  Small-scale Data
COMPUTER SCIENCE
•  Experts on Mining
•  Large-scale data
Scope of My
Research
ORGANIZATIONS
1.  Structured Roles
2.  Defined Tasks
✓  COLLECT Data
✓  Process, & Make Decisions
CITIZEN SENSOR COMMUNITIES
1.  No Structured Roles
2.  No Defined Tasks
✓  But “GENERATE” Massive Data
COOPERATIVE SYSTEM
“Can you help us?”
“Sure! How to help?”
Computer-Supported Cooperative
Work (CSCW) Matrix
[Johansen 1988, Baecker 1995]
[Figure: CSCW matrix with TIME and PLACE axes]
Challenges: Articulation & Awareness
(Malone & Crowston 1990; Schmidt & Bannon 1992)
[Figure: COOPERATIVE SYSTEM between ORGANIZATIONS and CITIZEN SENSOR COMMUNITIES, with a DATA PROBLEM and a DESIGN PROBLEM, supported by ENGAGEMENT MODELING and INTENT MINING]
—  Awareness. Q1 (Org. Actor): Who to engage first?
—  Articulation. Q2 (Org. Actor): What are resource needs & availabilities?
Research Questions
—  Can general theories of offline conversation be
applied in the online context?
—  Can we model intentions to inform organizational
tasks using knowledge-guided features?
—  Can we find reliable groups to engage by modeling
collective group divergence using a content-based
measure?
Thesis: Statement
Prior knowledge, and
interplay of features of users, their content, and network
efficiently model
Intent & Engagement
for cooperation of citizen sensor communities.
Scope of Concepts
•  Intent: aim of action, e.g., offering help
•  Engagement: involvement in activity, e.g., participating in discussion
Contributions
1.  Operationalized computing in cooperative system design
—  by accommodating articulation in Intent Mining, and
—  enriching awareness by Engagement Modeling
2.  Improved computation of online social data
—  by incorporating features from offline social theoretical knowledge
3.  Improved performance of intent classification
—  by fusing top-down & bottom-up data representations
4.  Improved explanation of group engagement
—  by modeling content divergence to complement existing structural measures
Data: Scope
—  Social Platform: Twitter
—  Important bridge between citizens & organizations
—  Characteristics
—  Users: follow/subscribe
—  Content: status updates (140 chars max)
—  Network: directed
—  Platform conversation functions
—  Reply
—  Retweet
—  Mention
Outline
—  Citizen Sensor Communities & Organizations
—  Cooperative System Design Challenges
—  Awareness: tackle via Engagement Modeling
—  Articulation: tackle via Intent Mining
—  Contributions
—  Problem 1. Conversation Classification using Offline Theories
—  Problem 2. Intent Classification
—  Problem 3. Engagement Modeling
—  Applications
—  Limitations & Future Work
User1. Analyzing #Conversations on Twitter. Using platform provided
functions #REPLY, #RT, and #Mention.
..
…
……..
User2. I kinda feel one might need more than just the platform fn -- @User1 u
can think #Psycholinguistics, dude!
Problem 1. Conversation Classification
R1. Can general theories of conversation be applied in the online context?
—  The Reply, Retweet, and Mention functions reflect conversation
—  Task: Given a set S of messages mi, classify a sample {mi}
for {RP, None}, {RT, None}, {MN, None}, where
—  Ground-truth corpora
—  RP = { mi | has_Reply_function (mi) = True }
—  RT = { mi | has_Retweet_function (mi) = True }
—  MN = { mi | has_Mention_function (mi) = True }
—  None = S – (RP ∪ RT ∪ MN)
—  Sample {mi} size = 3, based on average Reply conversation size
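The ground-truth sets above can be sketched in a few lines of Python. This is only an illustration: the slides do not say how the platform functions are detected, so the three `has_*_function` helpers below use hypothetical surface heuristics (a leading `@user` for Reply, `RT @user` for Retweet, any `@user` for Mention).

```python
import re

# Hypothetical detectors: the slides do not specify how each platform
# function is recognized, so these surface patterns are assumptions.
def has_reply_function(text: str) -> bool:
    return re.match(r"@\w+", text) is not None

def has_retweet_function(text: str) -> bool:
    return re.search(r"(^|\s)RT\s+@\w+", text) is not None

def has_mention_function(text: str) -> bool:
    return re.search(r"@\w+", text) is not None

def build_corpora(messages):
    """Return (RP, RT, MN, None) as sets of messages drawn from S."""
    s = set(messages)
    rp = {m for m in s if has_reply_function(m)}
    rt = {m for m in s if has_retweet_function(m)}
    mn = {m for m in s if has_mention_function(m)}
    none = s - (rp | rt | mn)  # messages that use no conversation function
    return rp, rt, mn, none
```

Under these heuristics a Reply or Retweet also counts as a Mention; whether the original corpora overlap in that way is not stated in the slides.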
Conversation Classification: Offline
Theories
—  Psycholinguistics Indicators [Clark & Gibbs, 1986, Chafe 1987, etc.]
—  Determiners (‘the’ vs. ‘a/an’)
—  Dialogue Management (e.g., ‘thanks’, ’anyway’), etc.
—  Drawback
—  Offline analysis focused on positive conversation instances
—  Hypotheses
—  Offline theoretic features are discriminative
—  Such features correlate with information density
Conversation Classification: Feature
Examples
CATEGORY Hj | Hj SET
H1 - Determiners | the
H3 - Subject pronouns | she, he, we, they
H9 - Dialogue management indicators | thanks, yes, ok, sorry, hi, hello, bye, anyway, how about, so, what do you mean, please, {could, would, should, can, will} followed by pronoun
H11 - Hedge words | kinda, sorta
•  Feature_Hj (mi) = term-frequency ( Hj-set, mi )
•  Normalized
•  Total 14 feature categories
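A minimal sketch of Feature_Hj as a normalized term frequency. The slide says only "Normalized", so dividing by the message's token count is an assumption, as are the `tokenize` helper and the restriction to single-word indicators (multi-word entries such as "how about" would need phrase matching).

```python
import re

# A few of the indicator sets listed on the slide (single-word entries only).
H_SETS = {
    "H1_determiners": {"the"},
    "H3_subject_pronouns": {"she", "he", "we", "they"},
    "H11_hedge_words": {"kinda", "sorta"},
}

def tokenize(message: str):
    """Lowercase word tokenizer (illustrative, not the authors' tokenizer)."""
    return re.findall(r"[a-z']+", message.lower())

def feature_h(h_set, message: str) -> float:
    """Term frequency of the indicator set in the message, divided by
    the message length in tokens (assumed normalization)."""
    tokens = tokenize(message)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in h_set)
    return hits / len(tokens)
```

For example, `feature_h(H_SETS["H1_determiners"], "the cat and the dog")` counts two determiners among five tokens.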
Conversation Classification: Results
—  Dataset
—  Tweets from 3 Disasters, and 3 Non-Disaster events
—  Varying set size (3.8K – 609K), time periods
—  Classifier:
—  Decision Tree
—  Evaluation: 10-fold Cross Validation
—  Accuracy: 62% - 78% [Lowest for {Mention,None} ]
—  AUC range: 0.63 - 0.84
Purohit, Hampton, Shalin, Sheth & Flach. In Journal of Computers in Human Behavior, 2013
Conversation Classification:
Discriminative Features
—  Consistent top features across classifiers
—  Pronouns (e.g., you, he)
—  Dialogue management (e.g., thanks)
—  Determiners (e.g., the)
—  Word counts
—  Positively correlated with RP, RT, MN
—  Correlation Coefficient up to 0.69
Conversation Classification:
Psycholinguistic Analysis
—  LIWC: Tool for deeper content analysis [Pennebaker, 2001]
—  Gives a measure per psychological category
—  Categories of interest
—  Social Interaction
—  Sensed Experience
—  Communication
—  Analyzed output sets in confusion matrices
➢  Higher values for positively classified conversations
➢  Suggests higher information content for cooperative intent
[Figure: output sets compared across confusion-matrix cells: True Positive, False Negative, False Positive, True Negative]
Purohit, Hampton, Shalin, Sheth & Flach. In Journal of Computers in Human Behavior, 2013
Conversation Classification:
Lessons
1.  Offline theoretic features of conversations exist in the
online environment
Ø  Can be applied for computing social data
2.  Such features correlate with information density in content
- Reflection of conversation for an intent
Short-text Document Intent
—  Intent: Aim of action
DOCUMENT | INTENT
Text REDCROSS to 90999 to donate 10$ to help the victims of hurricane sandy | SEEKING HELP
Anyone know where the nearest #RedCross is? I wanna give blood today to help the victims of hurricane Sandy | OFFERING HELP
Would like to urge all citizens to make the proper preparations for Hurricane #Sandy - prep is key - http://t.co/LyCSprbk has valuable info! | ADVISING
How to identify relevant intent from ambiguous, unconstrained
natural language text?
Relevant intent → Articulation of organizational tasks
(e.g., Seeking vs. Offering resources)
Intent Classification: Problem
Formulation
—  Given a set of user-generated text documents, identify
existing intents
—  Variety of interpretations
—  Problem statement: a multi-class classification task
approximate f: S → C, where
C = {c1, c2 … cK}
is a set of K predefined intent classes, and
S = {m1, m2 … mN}
is a set of N short text documents
Focus - Cooperation-assistive intent classes, C = {Seeking, Offering, None}
Intent Classification: Related Work
TEXT CLASSIFICATION TYPE | FOCUS | EXAMPLE
Topic | predominant subject matter | sports or entertainment
Sentiment/Emotion/Opinion | present state of emotional affairs | negative or positive; happy emotion
Intent | action, hence future state of affairs | offer to help after floods
e.g., I am going to watch the awesome Fast and Furious movie!! #Excited
Intent Classification: Related Work
DATA TYPE | APPROACH FOCUS | LIMITED APPLICABILITY
Formal text on Webpages/blogs (Kröll and Strohmaier 2009, -15; Raslan et al. 2013, -14) | Knowledge Acquisition: via Rules, Clustering | Lack of large corpora with proper grammatical structure; poor-quality text is hard to parse for dependencies
Commercial Reviews, marketplace (Hollerit et al. 2013, Wu et al. 2011, Ramanand et al. 2010, Carlos & Yalamanchi 2012, Nagarajan et al. 2009) | Classification: via Rules, Lexical template based, Pattern | More generalized intents (e.g., ‘help’ is broader than ‘sell’); patterns are more implicit, so harder to capture, than for buying/selling
Search Queries (Broder 2002, Downey et al. 2008, Case 2012, Wu et al. 2010, Strohmaier & Kröll 2012) | User Profiling: Query Classification | Lack of large query logs, click graphs; existence of social conversation
Intent Classification: Challenges
—  Unconstrained Natural Language in small space
—  Ambiguity in interpretation
—  Sparsity: low ‘signal-to-noise’ ratio, imbalanced classes
—  1% signals (Seeking/Offering) in 4.9 million tweets #Sandy
—  Hard-to-predict problem:
—  commercial intent, F-1 score 65% on Twitter [Hollerit et al. 2013]
@Zuora wants to help @Network4Good with Hurricane Relief. Text SANDY to
80888 & donate $10 to @redcross @AmeriCares & @SalvationArmyUS #help
*Blue: offering intent, *Red: seeking intent
Intent Classification: Types & Features
Binary intent
—  Crisis Domain: [Varga et al. 2013] Problem vs. Aid (Japanese)
—  Features: Syntactic, Noun-Verb templates, etc.
—  Commercial Domain: [Hollerit et al. 2013] Buy vs. Sell intent
—  Features: N-grams, Part-of-Speech
Multiclass intent
—  Commercial Domain: Not on Twitter
TOP-DOWN
Pattern Rules: Declarative Knowledge
(patterns defined for intent association)
BOTTOM-UP
Bag of N-grams Tokens: Independent Tokens
(patterns derived from the data)
Our Hybrid Approach
[Figure: arrows labeled "Learning Improves" and "Expressivity Increases"]
Intent Classification Top-Down:
Binary Classifier - Prior Knowledge
—  Conceptual Dependency Theory [Schank, 1972]
—  Make meaning independent from the actual words in input
—  e.g., Class in an Ontology abstracts similar instances
—  Verb Lexicon [Hollerit et al. 2013]
—  Relevant Levin’s Verb categories [Levin, 1993]
—  e.g., give, send, etc.
—  Syntactic Pattern
—  Auxiliary & modals: e.g., ‘be’, ‘do’, ‘could’, etc. [Ramanand et al. 2010]
—  Word order: Verb-Subject positions, etc.
Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014
Intent Classification Top-Down:
Binary Classifier – Psycholinguistic Rules
—  Transform knowledge into rules
—  Examples:
(Pronouns except 'you' = yes) ^ (need/want = yes) ^ (Adjective = yes/no) ^ (Things = yes) → Seeking
(Pronoun except 'you' | Proper Noun = yes) ^ (can/could/would/should = yes) ^ (Levin Verb = yes) ^ (Determiner = yes/no) ^ (Adjective = yes/no) ^ (Things = yes) → Offering
[Figure label: Domain ontology]
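A rough sketch of how the two example rules could be turned into code. It is a simplification under stated assumptions: the lexicons are small hypothetical stand-ins (the `THINGS` set substitutes for the domain ontology), word order is ignored, optional slots (Adjective, Determiner) are skipped, and the Proper Noun alternative is omitted because it would need a part-of-speech tagger.

```python
import re

# Hypothetical mini-lexicons; the real rule set and ontology are not
# given on the slide.
PRONOUNS = {"i", "we", "he", "she", "they"}        # pronouns except 'you'
NEED_WANT = {"need", "needs", "want", "wants"}
MODALS = {"can", "could", "would", "should"}
LEVIN_VERBS = {"give", "send", "donate", "bring"}  # illustrative subset
THINGS = {"clothes", "food", "water", "blood", "money", "shelter"}

def classify(message: str) -> str:
    """Apply the Seeking rule, then the Offering rule, else 'None'."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    has_pronoun = bool(tokens & PRONOUNS)
    has_thing = bool(tokens & THINGS)
    if has_pronoun and tokens & NEED_WANT and has_thing:
        return "Seeking"
    if has_pronoun and tokens & MODALS and tokens & LEVIN_VERBS and has_thing:
        return "Offering"
    return "None"
```

With these toy lexicons, "We need food and water" matches the Seeking rule and "I can donate clothes today" matches the Offering rule.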
Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014
Intent Classification Top-Down:
Binary Classifier - Lessons
—  Preliminary Study
—  2,000 tweets, classified first as conversation and then by the rule-based classifier:
labeled by two native speakers
—  Labels: Seeking, Offering, None
—  Results
—  Avg. F-1 score: 78% (Baseline F-1 score: 57% [Varga et al. 2013] )
—  Lessons
—  Role of prior knowledge: Domain Independent & Dependent
—  Limitations: needs an exhaustive rule set, low Recall; Ambiguity is
addressed, but Sparsity is not
Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014
Intent Classification Hybrid:
Binary Classifier - Design
—  AMBIGUITY: addressed via rich feature space
1. Top-Down: Declarative Knowledge Patterns [Ramanand et al. 2010]
DK(mi, P) → {0,1}
e.g., P = \b(like|want)\b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b
(acquired via Red Cross expert searches)
2. Abstraction: due to importance in info sharing [Nagarajan et al. 2010]
-  Numeric (e.g., $10) → _NUM_
-  Interactions (e.g., RT & @user) → _RT_ , _MENTION_
-  Links (e.g., http://bit.ly) → _URL_
3. Bottom-Up: N-grams after stemming and abstraction [Hollerit et al. 2013]
TOKENIZER ( mi ) → { bi-, tri-gram }
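The three steps above can be sketched with the standard library alone. This is an illustration, not the authors' pipeline: the declarative pattern assumes that the `b` markers on the slide are regex word boundaries (`\b`), the abstraction regexes are hypothetical, and stemming is left out because the standard library has no stemmer.

```python
import re

# Declarative knowledge pattern from the slide, read with \b word boundaries.
DK_PATTERN = re.compile(
    r"\b(like|want)\b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b",
    re.IGNORECASE,
)

def abstract(text: str) -> str:
    """Replace links, interactions, and numbers with placeholder tokens."""
    text = re.sub(r"https?://\S+", "_URL_", text)         # links first
    text = re.sub(r"\bRT\b", "_RT_", text)                # retweet marker
    text = re.sub(r"@\w+", "_MENTION_", text)             # user mentions
    text = re.sub(r"\$?\d+(?:\.\d+)?\$?", "_NUM_", text)  # amounts, numbers
    return text

def dk_feature(text: str) -> int:
    """DK(mi, P): 1 if the declarative pattern occurs in the message, else 0."""
    return 1 if DK_PATTERN.search(text) else 0

def ngrams(tokens, n_min=2, n_max=3):
    """Bi- and tri-grams (by default) over a token list."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]

def bottom_up_features(text: str):
    """N-grams over the abstracted, lowercased message (no stemming here)."""
    tokens = re.findall(r"\w+", abstract(text).lower())
    return ngrams(tokens)
```

Replacing links before numbers keeps digits inside a URL from being abstracted twice.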
Intent Classification Hybrid:
Binary Classifier - Design
—  SPARSITY: addressed via algorithmic choices
1.  Feature Selection
2.  Ensemble Learning
3.  Classifier Chain
[Figure: classifier chain. DATASET → knowledge-driven features (X^T, y) → models m_1 on (X1^T, y1) and m_2 on (X2^T, y2), with outputs P(c1), 1 - P(c1), and P(c2)]
Intent Classification Hybrid:
Binary Classifier - Experiments
—  Binary classifiers:
—  Seeking vs. not Seeking
—  Offering vs. not Offering
—  Dataset:
—  Candidate set: 4,000 tweets classified as donation-related
—  Labels: min. 3 judges
—  Annotations: Seeking , Offering , None
Purohit, Castillo, Diaz, Sheth, & Meier. First Monday journal, 2014
Intent Classification Hybrid:
Binary Classifier - Results
Experiment | Supervised Learning | Training Samples | Precision (*Baseline) | F-1 score | Class labels
Seeking vs. (None’ + Offering) | RF (CR=50:1) | 3836 | 98% (*79%) | 46% (56%) | 56% requests
Offering vs. (None’) | RF (CR=9:2) | 1763 | 90% (*65%) | 44% (*58%) | 13% offers
RF = Random Forest ensemble
CR = Asymmetric false-alarm Cost Ratios for True:False
Evaluation: 10-fold CV
Notes:
-  Domain requires higher precision than recall
-  Scope for improving low recall
Purohit, Castillo, Diaz, Sheth, & Meier. First Monday journal, 2014
Intent Classification Hybrid:
Multiclass Classifier - Generalization
—  Lessons from binary classification
—  Improvement by fusing top-down & bottom-up
—  Sparsity
—  Ambiguity (Seeking & Offering complementary)
—  addressed via improved data representation
Hypothesis: Knowledge-guided approach improves
multiclass classification accuracy
TOP-DOWN
Knowledge Patterns
(DK) Declarative
(SK) Social Behavior
(CTK, CSK) Contrast Patterns
BOTTOM-UP
Bag of N-grams Tokens:
(T) Independent Tokens
Hybrid Approach
Intent Classification Hybrid:
Multiclass Classifier – Feature Creation
1. (T) Bag of Tokens
TOKENIZER(mi , min, max)
2. (DK) Declarative Knowledge Patterns
—  Domain expert guidance
—  Psycholinguistics syntactic & semantic rules
—  Expand by WordNet and Levin Verbs
e.g., (how = yes) ^ (Modal-Set 'can' = yes) ^ (Pronouns except 'you' = yes) ^ (Levin Verb-Set 'give' = yes)
Pj: Feature_Pj (mi) = 1 if Pj exists in mi , else 0
3. (SK) Social Knowledge Indicators
—  Offline conversation indicators studied in Problem 1
e.g., Hj = Dialogue Management, Hj-set = {Thanks, anyway,..}
Feature_Hj (mi) = term-frequency ( Hj-set, mi )
Intent Classification Hybrid:
Multiclass Classifier - Feature Creation
4. (CTK) Contrast Knowledge Patterns
INPUT: corpus {mi} cleaned and abstracted, min. support, X
For each class Cj
—  Find contrasting pattern using sequential pattern mining
OUTPUT: contrast patterns set {P} for each class Cj
e.g., unique sequential patterns:
SEEKING: help .* victim .* _url_ .*
OFFERING: anyon .* know .* cloth .*
5. (CPK) Contrast Patterns: on Part-of-Speech tags of {mi}
Intent Classification Hybrid:
Multiclass Classifier - Feature Creation
Finding CTK: Contrast Knowledge Patterns
For each class Cj
1.  Tokenize the cleaned, abstracted text of {mi }
2.  Mine Sequential Patterns: SPADE Algorithm
—  Output: sequences of token sets, {P’}
3.  Reduce to minimal sequences {P}
4.  Compute growth rate & contrast strength for P with all other Ck
5.  Rank {P} by contrast strength and keep the Top-K
OUTPUT: contrast patterns set {P} for each class Cj
$$\mathrm{gr}(P, C_j, C_k) = \frac{\mathrm{support}(P, C_j)}{\mathrm{support}(P, C_k)} \quad (1)$$
$$\text{Contrast-Growth}(P, C_j) = \frac{1}{|C| - 1} \sum_{C_k,\, k \neq j} \frac{\mathrm{gr}(P, C_j, C_k)}{1 + \mathrm{gr}(P, C_j, C_k)} \quad (2)$$
$$\text{Contrast-Strength}(P, C_j) = \mathrm{support}(P, C_j) \cdot \text{Contrast-Growth}(P, C_j) \quad (3)$$
where |C| = K is the number of intent classes.
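Equations (1) to (3) can be checked with a short sketch. Two points are assumptions rather than something stated on the slide: the normalizing factor is read as one over the number of other classes, and a pattern with zero support in a competing class contributes the limiting value 1 to the sum (since gr/(1 + gr) tends to 1 as gr grows). The `supports` dictionary maps each class name to support(P, class) for a single pattern P.

```python
import math

def growth_rate(sup_j: float, sup_k: float) -> float:
    """Eq. (1): support in the target class over support in another class."""
    if sup_k == 0:
        # Assumed convention: infinite growth if P occurs only in class j.
        return math.inf if sup_j > 0 else 0.0
    return sup_j / sup_k

def contrast_growth(supports: dict, j: str) -> float:
    """Eq. (2): mean of gr / (1 + gr) over all classes other than j."""
    others = [k for k in supports if k != j]
    total = 0.0
    for k in others:
        gr = growth_rate(supports[j], supports[k])
        total += 1.0 if math.isinf(gr) else gr / (1.0 + gr)
    return total / len(others)

def contrast_strength(supports: dict, j: str) -> float:
    """Eq. (3): support in class j weighted by its contrast growth."""
    return supports[j] * contrast_growth(supports, j)
```

For supports of 0.4 (Seeking), 0.1 (Offering), and 0.0 (None), the growth terms for Seeking are 4/5 and 1, so the contrast growth is 0.9 and the contrast strength is 0.36.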
Binarization Frameworks for Multiclass Classifier: 1 vs. All
[Figure: CORPUS (set of short text documents, S) → FEATURES (knowledge-driven features, X^T, y) → models M_1, M_2, ..., M_K trained on (X1^T, y1), (X2^T, y2), ..., (XK^T, yK), with outputs P(c1), P(c2), ..., P(cK)]
Subset Xj^T ⊂ S such that Xj^T includes all the labeled instances of class Cj for model M_j
(In 1 vs. 1 framework: K*(K-1)/2 classifiers, for each Cj,Ck pair)
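A small sketch of the two binarization frameworks, showing only how the label sets are derived (no learner is trained). The function names are hypothetical; the class names come from the slides.

```python
from itertools import combinations

def one_vs_all(labels, classes):
    """One binary target vector per class: 1 for that class, 0 for the rest."""
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def one_vs_one_pairs(classes):
    """All unordered class pairs; one classifier per pair, K*(K-1)/2 in total."""
    return list(combinations(classes, 2))

classes = ["Seeking", "Offering", "None"]
labels = ["Seeking", "None", "Offering", "None"]
targets = one_vs_all(labels, classes)  # K = 3 binary problems
pairs = one_vs_one_pairs(classes)      # 3 * 2 / 2 = 3 pairwise problems
```

With K = 3 both frameworks happen to need three classifiers; the counts diverge for larger K, where 1 vs. 1 grows quadratically.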
Intent Classification Hybrid:
Multiclass Classifier - Experiments
—  Datasets
—  Dataset-1: Hurricane Sandy, Oct 27 – Nov 7, 2012
—  Dataset-2: Philippines Typhoon, Nov 7 – Nov 17, 2013
—  Parameters
—  Base Learner M_j: Random Forest, 10 trees with 100 features
—  bi-, tri-gram for (T)
—  K=100% & min. support 10% for CTK, 50% for CPK
Intent Classification:
Multiclass Classifier – Results
[Figure: bar chart of Avg. F-1 score (10-fold CV), axis 56% to 70%, on Dataset-1 (Hurricane Sandy, 2012). Feature sets: T (Baseline); T,DK (Declarative); T,SK (Social); T,CTK,CSK (Contrast); T,DK,SK,CTK,CSK. Frameworks: 1-vs-1, 1-vs-All. Gain 7%, p < 0.05]
Intent Classification:
Multiclass Classifier - Results
[Figure: bar chart of Avg. F-1 score (10-fold CV), axis 74% to 86%, on Dataset-2 (Philippines Typhoon, 2013). Feature sets: T (Baseline); T,DK (Declarative); T,SK (Social); T,CTK,CSK (Contrast); T,DK,SK,CTK,CSK. Frameworks: 1-vs-1, 1-vs-All. Gain 6%, p < 0.05]
Lessons
1.  Top-down & Bottom-up hybrid approach improves data
representation for learning (complementary) intent classes
—  50% of the top 1% discriminative features were knowledge-driven
2.  Offline theoretic social conversation (SK) features (the, thanks,
etc.), often removed for text classification, are valuable for
intent.
3.  The effect of knowledge types (SK vs. DK vs. CTK/CPK) varies
across different types of real-world event datasets
➢  Future work: culturally-sensitive psycholinguistic knowledge
Problem 3. Group Engagement Model
How can organizations find reliable groups to engage for action?
—  Engagement: degree of involvement in discussion
—  Reliable groups: stay focused and collectively behave to diverge on topics
—  Why & How do groups collectively evolve over time?
1.  Define a group from interaction network, g
2.  Define Divergence of g: content-based, in contrast to structure-based
3.  Predict change in the divergence between time slices
—  Features of g based on theories of social identity & cohesion
Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014
Group Engagement Model:
Integrated Approach Unlike Prior Work
People (User): Participant of the discussion
AND
Content (Text): Topic of Interest
AND
Network (Community): Group around topic
KEY POINT: capture User Node Diversity
Group Engagement Model: Discussion Divergence
—  Candidate Group: Detect in interaction network
—  Group Discussion Divergence: Jensen-Shannon Divergence of topic
distribution on group members’ tweets
where H(*) = Shannon Entropy,
Bt = latent topic distribution of each tweet t among all members’ tweets Tg (|Tg| tweets in total),
Bg = mean topic distribution of group g
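A minimal sketch of one way to compute this divergence, assuming the standard equal-weight Jensen-Shannon form, JSD = H(Bg) - (1/|Tg|) Σ_t H(Bt), with Bg taken as the element-wise mean of the per-tweet topic distributions Bt. The topic distributions would come from a latent topic model that is not shown here; the function names are hypothetical.

```python
import math

def entropy(p) -> float:
    """Shannon entropy H(p) in bits; zero-probability topics contribute 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def mean_distribution(dists):
    """Bg: element-wise mean of the per-tweet topic distributions."""
    n = len(dists)
    return [sum(d[i] for d in dists) / n for i in range(len(dists[0]))]

def group_divergence(tweet_topic_dists) -> float:
    """Equal-weight Jensen-Shannon divergence: H(Bg) - mean_t H(Bt)."""
    b_g = mean_distribution(tweet_topic_dists)
    mean_h = sum(entropy(b) for b in tweet_topic_dists) / len(tweet_topic_dists)
    return entropy(b_g) - mean_h
```

A group whose tweets all share one topic distribution has divergence 0; two tweets on entirely different topics give the two-topic maximum of 1 bit.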
Lessons
1.  A content-divergence-based measure helps explain
why groups collectively diverge
—  Less divergent groups write more social & future-action-related
content
2.  Emerging events such as disasters have higher correlation
with social identity-driven features
➢  Role of social context
Application-1: Filter Content for Disaster Response
[Figure: DISASTER Event → CITIZEN Sensors → Intent-Classifiers as a Service → RESPONSE Organizations]
[SEEKING] Me and @CeceVancePR are coordinating a clothing/food drive for families affected by Hurricane Sandy. If you would like to donate, DM us
[OFFERING] Does anyone know how to donate clothes to hurricane #Sandy victims?
Broader Impact: Classifier Model
integrated by Crisis Mapping Pioneer
Application-2: “We TRUST people!” User engagement tool
[Figure: DISASTER Event → CITIZEN Sensors → Tool to mine Important users → RESPONSE Organizations]
Broader Impact: Winner of Int’l Challenge: UN
ITU Young Innovators 2014
Limitations & Future Work
—  Cooperative System
—  CSCW Application specific to domain of crisis
➢  How to create a full What-Where-When-Who knowledge base
—  Intent Mining
—  Intent classes that are not cooperation-assistive, and the temporal
drift of intent, are not considered
➢  How to mine actor-level intent beyond document level
—  Group Engagement
—  Reliable prioritized groups are based on Correlation, not Causality
—  Interplay of Offline and Online interactions is beyond the scope
➢  How to incorporate intent in the group divergence
—  Bipartite Intent Graph Matching
—  Reducing time complexity of Seeking vs. Offering matching
Conclusion
Prior knowledge, and
interplay of features of users, their content, and network
efficiently model
Intent & Engagement
for cooperation between citizen sensors and organizations in
the online social communities.
Thanks to the Committee Members
[Left to Right] Prof. Amit Sheth, (advisor, WSU), Prof. Guozhu Dong (WSU), Prof. Srinivasan
Parthasarathy (OSU), Prof. TK Prasad (WSU), Dr. Patrick Meier (QCRI), Prof. Valerie Shalin (WSU)
Computer Science Social Science
Acknowledgement, Thanks and Questions
—  NSF SoCS grant IIS-1111182 to support this work
—  Interdisciplinary Mentors especially Prof. John Flach (WSU), Drs. Carlos
Castillo (QCRI), Fernando Diaz (Microsoft), Meena Nagarajan (IBM)
—  Kno.e.sis team especially Andrew Hampton from Psychology dept. and
Shreyansh and Tanvi from CSE at Wright State, as well as Yiye Ruan (now
Google) & David Fuhry at the Data Mining Lab, Ohio State University
—  Colleagues: Digital Volunteers from the CrisisMappers network, StandBy Task
Force, InCrisisRelief.org, info4Disasters, Humanity Road, Ushahidi, etc. and
the subject matter experts at UN FPA
Other works
Ambiguity, Sparsity, Diversity, Scalability
(Interpretation) (users) (behaviors)
•  Mutual Influence in Sparse Friendship Network [AAAI ICWSM’12]
•  User Summarization with Sparse Profile Metadata [ASE SocialInfo’12]
•  Matching intent as task of Information Retrieval [FM’14]
•  Knowledge-aware Bi-partite Matching [In preparation]
•  Short-Text Document Intent Mining [FM’14, JCSCW’14]
•  Actor-Intent Mining Complexity [In preparation]
•  Modeling Group Using Diverse Social Identity & Cohesion [AAAI ICWSM’14]
•  Modeling Diverse User-Engagement [SOME WWW’11, ACM WebSci’12]

Weitere ähnliche Inhalte

Was ist angesagt?

Knowledge Will Propel Machine Understanding of Big Data
Knowledge Will Propel Machine Understanding of Big DataKnowledge Will Propel Machine Understanding of Big Data
Knowledge Will Propel Machine Understanding of Big DataAmit Sheth
 
Extracting City Traffic Events from Social Streams
 Extracting City Traffic Events from Social Streams Extracting City Traffic Events from Social Streams
Extracting City Traffic Events from Social StreamsPramod Anantharam
 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Amit Sheth
 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Amit Sheth
 
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Amit Sheth
 
Smart IoT for Connected Manufacturing
Smart IoT for Connected ManufacturingSmart IoT for Connected Manufacturing
Smart IoT for Connected ManufacturingAmit Sheth
 
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...Amit Sheth
 
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...Artificial Intelligence Institute at UofSC
 
Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Jan Sifra
 
Physical Cyber Social Computing
Physical Cyber Social ComputingPhysical Cyber Social Computing
Physical Cyber Social ComputingAmit Sheth
 
SMART Seminar Series: Tweets, Emergencies and Experience - New Theory and Met...
SMART Seminar Series: Tweets, Emergencies and Experience - New Theory and Met...SMART Seminar Series: Tweets, Emergencies and Experience - New Theory and Met...
SMART Seminar Series: Tweets, Emergencies and Experience - New Theory and Met...SMART Infrastructure Facility
 
Looking for Commonsense in the Semantic Web
Looking for Commonsense in the Semantic WebLooking for Commonsense in the Semantic Web
Looking for Commonsense in the Semantic WebValentina Presutti
 
The human face of AI: how collective and augmented intelligence can help sol...
The human face of AI:  how collective and augmented intelligence can help sol...The human face of AI:  how collective and augmented intelligence can help sol...
The human face of AI: how collective and augmented intelligence can help sol...Elena Simperl
 
Tutorial Cognition - Irene
Tutorial Cognition - IreneTutorial Cognition - Irene
Tutorial Cognition - IreneSSSW
 
International Collaboration Networks in the Emerging (Big) Data Science
International Collaboration Networks in the Emerging (Big) Data ScienceInternational Collaboration Networks in the Emerging (Big) Data Science
International Collaboration Networks in the Emerging (Big) Data Sciencedatasciencekorea
 
Causal networks, learning and inference - Introduction
Causal networks, learning and inference - IntroductionCausal networks, learning and inference - Introduction
Causal networks, learning and inference - IntroductionFabio Stella
 

Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation with Organizations

  • 1. Mining Citizen Sensor Communities to Improve Cooperation with Organizational Actors. PhD Defense, June 23, 2015. Hemant Purohit (Advisor: Prof. Amit Sheth), Kno.e.sis, Dept. of CSE, Wright State University, USA
  • 2. Outline (@hemant_pt)
      - Citizen Sensor Communities & Organizations
      - Cooperative System Design Challenges
      - Contributions
      - Problem 1. Conversation Classification using Offline Theories
      - Problem 2. Intent Classification
      - Problem 3. Engagement Modeling
      - Applications
      - Limitations & Future Work
  • 3. Citizen Sensors: Access to Human Observations & Interactions
      - Uni-directional communication (TO people)
      - Bi-directional communication (BY people, TO people): Web 2.0 media
      - Unstructured, unconstrained language data: ambiguity, sparsity, diversity, scalability
  • 4. Goal: Data to Decision Making
      - From noisy citizen sensor data to organizational decision making
      - SOCIAL SCIENCE: experts on organizations; small-scale data
      - COMPUTER SCIENCE: experts on mining; large-scale data
      - Diagram label: Scope of My Research
  • 5. CITIZEN SENSOR COMMUNITIES and ORGANIZATIONS, linked by a COOPERATIVE SYSTEM
      - CITIZEN SENSOR COMMUNITIES: (1) no structured roles; (2) no defined tasks; ✓ but “GENERATE” massive data
      - ORGANIZATIONS: (1) structured roles; (2) defined tasks; ✓ COLLECT data; ✓ process and make decisions
      - Exchange through the cooperative system: "Can you help us?" / "Sure! How to help?"
  • 6. Computer-Supported Cooperative Work (CSCW) Matrix [Johansen 1988, Baecker 1995]. Axes: TIME and PLACE
  • 7. Challenges (Malone & Crowston 1990; Schmidt & Bannon 1992) in a COOPERATIVE SYSTEM between ORGANIZATIONS and CITIZEN SENSOR COMMUNITIES
      - Awareness (ENGAGEMENT MODELING). Q1 from an org. actor: Who to engage first?
      - Articulation (INTENT MINING). Q2 from an org. actor: What are resource needs & availabilities?
      - Diagram labels: DATA PROBLEM, DESIGN PROBLEM
  • 8. Research Questions
      - Can general theories of offline conversation be applied in the online context?
      - Can we model intentions to inform organizational tasks using knowledge-guided features?
      - Can we find reliable groups to engage by modeling collective group divergence using a content-based measure?
  • 9. Thesis Statement: Prior knowledge, and interplay of features of users, their content, and network efficiently model Intent & Engagement for cooperation of citizen sensor communities.
      Scope of concepts:
      - Intent: aim of action, e.g., offering help
      - Engagement: involvement in activity, e.g., participating in discussion
  • 10. Contributions
      1. Operationalized computing in cooperative system design
          - by accommodating articulation in Intent Mining, and
          - enriching awareness by Engagement Modeling
      2. Improved computation of online social data
          - by incorporating features from offline social theoretical knowledge
      3. Improved performance of intent classification
          - by fusing top-down & bottom-up data representations
      4. Improved explanation of group engagement
          - by modeling content divergence to complement existing structural measures
  • 11. Data: Scope
      - Social platform: Twitter
          - Important bridge between citizens & organizations
      - Characteristics
          - Users: follow/subscribe
          - Content: status updates (140 chars max)
          - Network: directed
      - Platform conversation functions
          - Reply
          - Retweet
          - Mention
  • 12. @hemant_pt Outline —  Citizen Sensor Communities & Organizations —  Cooperative System Design Challenges —  Awareness: tackle via Engagement Modeling —  Articulation: tackle via Intent Mining —  Contributions —  Problem 1. Conversation Classification using Offline Theories —  Problem 2. Intent Classification —  Problem 3. Engagement Modeling —  Applications —  Limitations & Future Work 12
  • 13. @hemant_pt User1. Analyzing #Conversations on Twitter. Using platform provided functions #REPLY, #RT, and #Mention. .. … …….. User2. I kinda feel one might need more than just the platform fn -- @User1 u can think #Psycholinguistics, dude! Problem 1. Conversation Classification —  Function of Reply, Retweet, Mention reflect conversation 13 R1. Can general theories of conversation be applied in the online context?
  • 14. @hemant_pt Problem 1. Conversation Classification — Function of Reply, Retweet, Mention reflect conversation — Task: Given a set S of messages mi, classify a sample {mi} for {RP, None}, {RT, None}, {MN, None}, where — Ground-truth corpora — RP = { mi | has_Reply_function (mi) = True } — RT = { mi | has_Retweet_function (mi) = True } — MN = { mi | has_Mention_function (mi) = True } — None = S − (RP ∪ RT ∪ MN) — Sample {mi} size = 3, based on average Reply conversation size
  • 15. @hemant_pt Conversation Classification: Offline Theories —  Psycholinguistics Indicators [Clark & Gibbs, 1986, Chafe 1987, etc.] —  Determiners (‘the’ vs. ‘a/an’) —  Dialogue Management (e.g., ‘thanks’, ’anyway’), etc. —  Drawback —  Offline analysis focused on positive conversation instances —  Hypotheses —  Offline theoretic features are discriminative —  Such features correlate with information density 15
  • 16. @hemant_pt Conversation Classification: Feature Examples 16 CATEGORY Hj Hj SET H1 - Determiners (the) H3 - Subject pronouns (she, he, we, they) H9 - Dialogue management indicators (thanks, yes, ok, sorry, hi, hello, bye, anyway, how about, so, what do you mean, please, {could, would, should, can, will} followed by pronoun) H11 - Hedge words (kinda, sorta) •  Feature_Hj (mi) = term-frequency ( Hj-set, mi ) •  Normalized •  Total 14 feature categories
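The feature definition on this slide, Feature_Hj(mi) = term-frequency(Hj-set, mi), can be illustrated with a minimal sketch. The indicator sets below are small single-word subsets of the examples listed on the slide, and dividing by the token count is only an assumption about what "Normalized" means; multi-word indicators such as "how about" are left out.

```python
import re

# Illustrative subsets of the indicator sets named on the slide (not the full lexicons).
H_SETS = {
    "H1_determiners": {"the"},
    "H3_subject_pronouns": {"she", "he", "we", "they"},
    "H9_dialogue_management": {"thanks", "yes", "ok", "sorry", "hi", "hello", "bye", "anyway", "please"},
    "H11_hedges": {"kinda", "sorta"},
}


def tokenize(message):
    """Lowercase the message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def feature_vector(message):
    """Term frequency of each indicator set, divided by message length (assumed normalization)."""
    tokens = tokenize(message)
    n = len(tokens) or 1  # avoid division by zero for empty messages
    return {name: sum(t in terms for t in tokens) / n for name, terms in H_SETS.items()}


if __name__ == "__main__":
    print(feature_vector("I kinda feel one might need more than just the platform fn"))
```

Each message then becomes a short numeric vector (14 categories in the talk) that a classifier such as a decision tree can consume.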
  • 17. @hemant_pt Conversation Classification: Results —  Dataset —  Tweets from 3 Disasters, and 3 Non-Disaster events —  Varying set size (3.8K – 609K), time periods —  Classifier: —  Decision Tree —  Evaluation: 10-fold Cross Validation —  Accuracy: 62% - 78% [Lowest for {Mention,None} ] —  AUC range: 0.63 - 0.84 17  Purohit,  Hampton,  Shalin,  Sheth  &  Flach.  In  Journal  of  Computers  in  Human  Behavior,  2013
  • 18. @hemant_pt Conversation Classification: Discriminative Features —  Consistent top features across classifiers —  Pronouns (e.g., you, he) —  Dialogue management (e.g., thanks) —  Determiners (e.g., the) —  Word counts —  Positively correlated with RP, RT, MN —  Correlation Coefficient up to 0.69 18
  • 19. @hemant_pt Conversation Classification: Psycholinguistic Analysis — LIWC: Tool for deeper content analysis [Pennebaker, 2001] — Gives a measure per psychological category — Categories of interest — Social Interaction — Sensed Experience — Communication — Analyzed output sets in confusion matrices (True Positive, False Negative, False Positive, True Negative) → Higher values for positively classified conversations, suggesting higher information content for cooperative intent. Purohit, Hampton, Shalin, Sheth & Flach. In Journal of Computers in Human Behavior, 2013
  • 20. @hemant_pt Conversation Classification: Lessons 1.  Offline theoretic features of conversations exist in the online environment Ø  Can be applied for computing social data 2.  Such features correlate with information density in content - Reflection of conversation for an intent 20
  • 24. @hemant_pt Short-text Document Intent — Intent: Aim of action. DOCUMENT → INTENT: (1) “Text REDCROSS to 90999 to donate 10$ to help the victims of hurricane sandy” → SEEKING HELP; (2) “Anyone know where the nearest #RedCross is? I wanna give blood today to help the victims of hurricane Sandy” → OFFERING HELP; (3) “Would like to urge all citizens to make the proper preparations for Hurricane #Sandy - prep is key - http://t.co/LyCSprbk has valuable info!” → ADVISING. How to identify relevant intent from ambiguous, unconstrained natural language text? Relevant intent → Articulation of organizational tasks (e.g., Seeking vs. Offering resources)
  • 25. @hemant_pt Intent Classification: Problem Formulation — Given a set of user-generated text documents, identify existing intents — Variety of interpretations — Problem statement: a multi-class classification task: approximate f: S → C, where C = {c1, c2 … cK} is a set of K predefined intent classes, and S = {m1, m2 … mN} is a set of N short text documents. Focus - Cooperation-assistive intent classes, C = {Seeking, Offering, None}
  • 26. @hemant_pt Intent Classification: Related Work TEXT CLASSIFICATION TYPE FOCUS EXAMPLE Topic predominant subject matter sports or entertainment Sentiment/Emotion/ Opinion focus on present state of emotional affairs negative or positive; happy emotion Intent Focus on action, hence, future state of affairs offer to help after floods e.g., I am going to watch the awesome Fast and Furious movie!! #Excited 26
  • 27. @hemant_pt Intent Classification: Related Work DATA TYPE APPROACH FOCUS LIMITED APPLICABILITY 27 Formal text on Webpages/blogs (Kröll and Strohmaier 2009, -15; Raslan et al. 2013, -14) Knowledge Acquisition: via Rules, Clustering •  Lack of large corpora with proper grammatical structure •  Poor quality text hard to parse for dependencies Commercial Reviews, marketplace (Hollerit et al. 2013, Wu et al. 2011, Ramanand et al. 2010, Carlos & Yalamanchi 2012, Nagarajan et al. 2009) Classification: via Rules, Lexical template based, Pattern •  More generalized intents (e.g., ‘help’ broader than ‘sell’) •  Patterns implicit to capture than for buying/selling Search Queries (Broder 2002, Downey et al. 2008,, Case 2012, Wu et al. 2010, Strohmaier & Kröll 2012) User Profiling: Query Classification •  Lack of large query logs, click graphs •  Existence of social conversation
  • 28. @hemant_pt Intent Classification: Challenges —  Unconstrained Natural Language in small space —  Ambiguity in interpretation —  Sparsity of low ‘signal-to-noise’: Imbalanced classes —  1% signals (Seeking/Offering) in 4.9 million tweets #Sandy —  Hard-to-predict problem: —  commercial intent, F-1 score 65% on Twitter [Hollerit et al. 2013] @Zuora wants to help @Network4Good with Hurricane Relief. Text SANDY to 80888 & donate $10 to @redcross @AmeriCares & @SalvationArmyUS #help *Blue: offering intent, *Red: seeking intent 28
  • 29. @hemant_pt Intent Classification: Types & Features 29 Intent Binary Crisis Domain: - [Varga et al. 2013] Problem vs. Aid (Japanese) - Features: Syntactic, Noun-Verb templates, etc. Commercial Domain: - [Hollerit et al. 2013] Buy vs. Sell intent - Features: N-grams, Part-of-Speech Multiclass Commercial Domain: -  Not on Twitter
  • 30. @hemant_pt TOP-DOWN Pattern Rules: Declarative Knowledge (patterns defined for intent association) BOTTOM-UP Bag of N-grams Tokens: Independent Tokens (patterns derived from the data) Our Hybrid Approach Learning Improves Expressivity Increases 30
  • 31. @hemant_pt Intent Classification Top-Down: Binary Classifier - Prior Knowledge —  Conceptual Dependency Theory [Schank, 1972] —  Make meaning independent from the actual words in input —  e.g., Class in an Ontology abstracts similar instances —  Verb Lexicon [Hollerit et al. 2013] —  Relevant Levin’s Verb categories [Levin, 1993] —  e.g., give, send, etc. —  Syntactic Pattern —  Auxiliary & modals: e.g., ‘be’, ‘do’, ‘could’, etc. [Ramanand et al. 2010] —  Word order: Verb-Subject positions, etc. Purohit,  Hampton,  Bhatt,  Shalin,  Sheth  &  Flach.  In  Journal  of  CSCW,  2014   31
  • 32. @hemant_pt Intent Classification Top-Down: Binary Classifier – Psycholinguistic Rules —  Transform knowledge into rules —  Examples: (Pronouns except 'you' = yes) ^ (need/want = yes) ^ (Adjective = yes/no) ^ (Things=yes) → Seeking (Pronoun except 'you' | Proper Noun = yes) ^ (can/could/would/should = yes) ^ (Levin Verb = yes) ^ (Determiner = yes/no) ^ (Adjective = yes/no) ^ (Things = yes) -> Offering Domain ontology 32 Purohit,  Hampton,  Bhatt,  Shalin,  Sheth  &  Flach.  In  Journal  of  CSCW,  2014  
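The two example rules on this slide can be approximated with a small sketch. It is a simplification under stated assumptions: the word lists below are hypothetical stand-ins for the part-of-speech tests, the Levin verb lexicon, and the domain ontology of "Things"; the proper-noun alternative and the optional adjective and determiner slots are ignored, as is word order.

```python
import re

# Hypothetical stand-in lexicons; the talk uses POS tags, Levin verb classes and a domain ontology.
PRONOUNS_EXCEPT_YOU = {"i", "we", "he", "she", "they", "anyone", "someone"}
NEED_VERBS = {"need", "want"}
MODALS = {"can", "could", "would", "should"}
GIVE_VERBS = {"give", "send", "donate", "bring"}
THINGS = {"clothes", "food", "water", "blood", "shelter"}


def classify(message):
    """Apply the Seeking rule, then the Offering rule; fall back to 'None'."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    has_pronoun = bool(tokens & PRONOUNS_EXCEPT_YOU)
    has_thing = bool(tokens & THINGS)
    if has_pronoun and tokens & NEED_VERBS and has_thing:
        return "Seeking"
    if has_pronoun and tokens & MODALS and tokens & GIVE_VERBS and has_thing:
        return "Offering"
    return "None"


if __name__ == "__main__":
    for text in ["We need water and food", "I can send clothes", "Stay safe everyone"]:
        print(text, "->", classify(text))
```

A conjunction of hand-written conditions like this tends to be precise but misses many phrasings, which is consistent with the low-recall limitation reported for the rule-based classifier.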
  • 33. @hemant_pt Intent Classification Top-Down: Binary Classifier - Lessons — Preliminary Study — 2000 tweets, classified first as conversation and then by the rule-based classifier; labeled by two native speakers — Labels: Seeking, Offering, None — Results — Avg. F-1 score: 78% (Baseline F-1 score: 57% [Varga et al. 2013]) — Lessons — Role of prior knowledge: Domain Independent & Dependent — Limitation: requires an exhaustive rule-set; low recall; ambiguity addressed, but sparsity remains. Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014
  • 34. @hemant_pt TOP-DOWN Pattern Rules: Declarative Knowledge BOTTOM-UP Bag of N-grams Tokens: Independent Tokens Hybrid Approach 34
  • 35. @hemant_pt Intent Classification Hybrid: Binary Classifier - Design — AMBIGUITY: addressed via rich feature space 1. Top-Down: Declarative Knowledge Patterns [Ramanand et al. 2010] DK(mi, P) → {0,1} e.g., P = \b(like|want)\b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b (acquired via Red Cross expert searches) 2. Abstraction: due to importance in info sharing [Nagarajan et al. 2010] - Numeric (e.g., $10) → _NUM_ - Interactions (e.g., RT & @user) → _RT_, _MENTION_ - Links (e.g., http://bit.ly) → _URL_ 3. Bottom-Up: N-grams after stemming and abstraction [Hollerit et al. 2013] TOKENIZER(mi) → { bi-, tri-gram }
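The three feature sources on this slide (declarative pattern, abstraction, n-grams) can be sketched as follows. The regular expression is the slide's example pattern with word boundaries written as `\b`; the abstraction regexes and the whitespace tokenizer are assumptions made for illustration, and stemming is omitted.

```python
import re

# Declarative knowledge pattern P from the slide, compiled case-insensitively.
DK_PATTERN = re.compile(
    r"\b(like|want)\b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b", re.IGNORECASE
)


def abstract(text):
    """Replace links, retweet markers, mentions and numbers with placeholder tokens."""
    text = re.sub(r"https?://\S+", "_URL_", text)
    text = re.sub(r"\bRT\b", "_RT_", text)
    text = re.sub(r"@\w+", "_MENTION_", text)
    text = re.sub(r"\$?\d+(?:\.\d+)?", "_NUM_", text)
    return text


def dk_feature(text):
    """DK(mi, P): 1 if the pattern occurs in the message, else 0."""
    return 1 if DK_PATTERN.search(text) else 0


def ngrams(tokens, n_min=2, n_max=3):
    """Bottom-up features: all bi-grams and tri-grams of the token sequence."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


if __name__ == "__main__":
    msg = "RT @redcross: I would like to donate $10 http://bit.ly/x"
    cleaned = abstract(msg)
    print(cleaned, dk_feature(cleaned), ngrams(cleaned.lower().split())[:3])
```

Applying the abstraction before n-gram extraction lets different amounts, users and links collapse onto the same tokens, which reduces the sparsity of the bottom-up features.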
  • 36. @hemant_pt Intent Classification Hybrid: Binary Classifier - Design —  SPARSITY: addressed via algorithmic choices 1.  Feature Selection 2.  Ensemble Learning 3.  Classifier Chain 36 DATASET Knowledge-driven features XT , y m_1 m_2 P(c2) P(c1) X1 T, y1 X2 T, y2 1 - P(c1)
  • 37. @hemant_pt Intent Classification Hybrid: Binary Classifier - Experiments — Binary classifiers: — Seeking vs. not Seeking — Offering vs. not Offering — Dataset: — Candidate set: 4000 donation classified tweets — Labels: min. 3 judges — Annotations: Seeking, Offering, None. Purohit, Castillo, Diaz, Sheth, & Meier. First Monday journal, 2014
  • 38. @hemant_pt Intent Classification Hybrid: Binary Classifier - Results. Columns: Experiment | Supervised Learning | Training Samples | Precision (*Baseline) | F-1 score | Class labels. Row 1: Seeking vs. (None’ + Offering) | RF (CR=50:1) | 3836 | 98% (*79%) | 46% (56%) | 56% requests. Row 2: Offering vs. (None’) | RF (CR=9:2) | 1763 | 90% (*65%) | 44% (*58%) | 13% offers. RF = Random Forest ensemble; CR = asymmetric false-alarm cost ratio for True:False; Evaluation: 10-fold CV. Notes: - Domain requires higher precision than recall - Scope for improving low recall. Purohit, Castillo, Diaz, Sheth, & Meier. First Monday journal, 2014
  • 39. @hemant_pt Intent Classification Hybrid: Multiclass Classifier - Generalization —  Lessons from binary classification —  Improvement by fusing top-down & bottom-up —  Sparsity —  Ambiguity (Seeking & Offering complementary) —  addressed via improved data representation Hypothesis: Knowledge-guided approach improves multiclass classification accuracy 39
  • 40. @hemant_pt TOP-DOWN Knowledge Patterns (DK) Declarative (SK) Social Behavior (CTK, CSK) Contrast Patterns BOTTOM-UP Bag of N-grams Tokens: (T) Independent Tokens Hybrid Approach 40
  • 41. @hemant_pt Intent Classification Hybrid: Multiclass Classifier – Feature Creation 1. (T) Bag of Tokens - 2. (DK) Declarative Knowledge Patterns —  Domain expert guidance —  Psycholinguistics syntactic & semantic rules —  Expand by WordNet and Levin Verbs e.g., 3. (SK) Social Knowledge Indicators —  Offline conversation indicators studied in Problem 1 e.g., Hj = Dialogue Management, Hj-set = {Thanks, anyway,..} 41 (how = yes) ^ (Modal-Set 'can' = yes) ^ (Pronouns except 'you' = yes) ^ (Levin Verb-Set 'give' = yes) Feature_Hj (mi) = term-frequency ( Hj-set, mi ) Pj = Feature_Pj (mi) = 1 if Pj exists in mi , else 0 TOKENIZER(mi , min, max)
  • 42. @hemant_pt Intent Classification Hybrid: Multiclass Classifier - Feature Creation 4. (CTK) Contrast Knowledge Patterns INPUT: corpus {mi} cleaned and abstracted, min. support, X For each class Cj —  Find contrasting pattern using sequential pattern mining OUTPUT: contrast patterns set {P} for each class Cj 5. (CPK) Contrast Patterns: on Part-of-Speech tags of {mi} 42 e.g., unique sequential patterns: SEEKING: help .* victim .* _url_ .* OFFERING: anyon .* know .* cloth .*
  • 43. @hemant_pt Intent Classification Hybrid: Multiclass Classifier - Feature Creation. Finding CTK: Contrast Knowledge Patterns. For each class Cj: 1. Tokenize the cleaned, abstracted text of {mi} 2. Mine sequential patterns with the SPADE algorithm (output: sequences of token sets, {P’}) 3. Reduce to minimal sequences {P} 4. Compute growth rate & contrast strength for P against all other classes Ck 5. Keep the top-K {P} ranked by contrast strength. OUTPUT: contrast patterns set {P} for each class Cj. Equations: (1) gr(P, Cj, Ck) = support(P, Cj) / support(P, Ck); (2) Contrast-Growth(P, Cj, Ck) = 1/(|Cj| − 1) · Σ_{Ck, k≠j} [ gr(P, Cj, Ck) / (1 + gr(P, Cj, Ck)) ]; (3) Contrast-Strength(P, Cj) = support(P, Cj) · Contrast-Growth(P, Cj, Ck)
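Steps 4 and 5 on this slide (growth rate, contrast growth, contrast strength) can be sketched in a few lines. This is an illustration under assumptions: support is taken as the fraction of a class's token sequences that contain the pattern as an ordered subsequence, the normalizer in equation (2) is read as the number of other classes, and an unbounded growth rate (zero support in the other class) contributes 1 to the sum. The SPADE mining step itself is not shown.

```python
def contains_subsequence(tokens, pattern):
    """True if the pattern's tokens appear in order (gaps allowed) in the token list."""
    it = iter(tokens)
    return all(p in it for p in pattern)


def support(pattern, docs):
    """Fraction of documents (token lists) that contain the sequential pattern."""
    return sum(contains_subsequence(d, pattern) for d in docs) / len(docs)


def contrast_growth(supports, j):
    """Equation (2): mean of gr / (1 + gr) over all classes other than j."""
    others = [k for k in supports if k != j]
    total = 0.0
    for k in others:
        if supports[k] == 0:
            # gr is unbounded when the other class never shows the pattern: limit of gr/(1+gr) is 1.
            total += 1.0 if supports[j] > 0 else 0.0
        else:
            gr = supports[j] / supports[k]  # equation (1)
            total += gr / (1.0 + gr)
    return total / len(others)


def contrast_strength(supports, j):
    """Equation (3): support in class j weighted by its contrast growth."""
    return supports[j] * contrast_growth(supports, j)


if __name__ == "__main__":
    supports = {"Seeking": 0.4, "Offering": 0.1, "None": 0.0}
    print(contrast_strength(supports, "Seeking"))
```

Ranking patterns by this score favors sequences that are both frequent within one class and rare in the others, which matches the top-K selection in step 5.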
  • 44. @hemant_pt Binarization Frameworks for Multiclass Classifier: 1 vs. All. CORPUS: set of short text documents, S. FEATURES: knowledge-driven features (XT, y). Models M_1, M_2, …, M_K are trained on (X1T, y1), (X2T, y2), …, (XKT, yK) and output P(c1), P(c2), …, P(cK). Subset XjT ⊂ S such that XjT includes all the labeled instances of class Cj for model M_j. (In the 1 vs. 1 framework: K*(K-1)/2 classifiers, one for each Cj, Ck pair)
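The two binarization frameworks on this slide differ only in how the multiclass labels are turned into binary training tasks. The sketch below shows that bookkeeping in plain Python; the function names are hypothetical, and for simplicity the 1-vs-All targets cover every document rather than a per-model subset.

```python
from itertools import combinations


def one_vs_all(labels, classes):
    """One binary target vector per class: 1 where the document has that class, else 0."""
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def one_vs_one(labels, classes):
    """One task per unordered class pair, restricted to documents of those two classes."""
    tasks = {}
    for a, b in combinations(classes, 2):
        idx = [i for i, y in enumerate(labels) if y in (a, b)]
        tasks[(a, b)] = (idx, [1 if labels[i] == a else 0 for i in idx])
    return tasks


def predict_one_vs_all(probabilities):
    """Pick the class whose binary model reports the highest probability P(c)."""
    return max(probabilities, key=probabilities.get)


if __name__ == "__main__":
    classes = ["Seeking", "Offering", "None"]
    labels = ["Seeking", "None", "Offering", "None"]
    print(one_vs_all(labels, classes))
    print(len(one_vs_one(labels, classes)))  # K*(K-1)/2 pairwise tasks
```

With K = 3 intent classes both frameworks train three binary models, so the difference in the reported results comes from how the training data is partitioned rather than from the model count.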
  • 45. @hemant_pt Intent Classification Hybrid: Multiclass Classifier - Experiments —  Datasets —  Dataset-1: Hurricane Sandy, Oct 27 – Nov 7, 2012 —  Dataset-2: Philippines Typhoon, Nov 7 – Nov 17, 2013 —  Parameters —  Base Learner M_j: Random Forest, 10 trees with 100 features —  bi-, tri-gram for (T) —  K=100% & min. support 10% for CTK, 50% for CPK 45
  • 46. @hemant_pt Intent Classification: Multiclass Classifier – Results 46 56% 58% 60% 62% 64% 66% 68% 70% T (Baseline) T,DK T,SK T,CTK,CSK T,DK,SK,CTK,CSK 1-vs-1 1-vs-All Avg. F-1 Score (10-fold CV) Frameworks: Gain 7%, p < 0.05 Dataset-1 (Hurricane Sandy, 2012) (Declarative) (Social) (Contrast)
  • 47. @hemant_pt 74% 76% 78% 80% 82% 84% 86% T (Baseline) T,DK T,SK T,CTK,CSK T,DK,SK,CTK,CSK 1-vs-1 1-vs-All Intent Classification: Multiclass Classifier - Results 47 Frameworks: Gain 6%, p < 0.05 Dataset-2 (Philippines Typhoon, 2013) (Declarative) (Social) (Contrast) Avg. F-1 Score (10-fold CV)
  • 48. @hemant_pt Lessons 1. Top-down & bottom-up hybrid approach improves data representation for learning (complementary) intent classes — 50% of the top 1% discriminative features were knowledge-driven 2. Offline theoretic social conversation (SK) features (the, thanks, etc.), often removed for text classification, are valuable for intent 3. The effect of knowledge types (SK vs. DK vs. CTK/CPK) varies across different types of real-world event datasets → Culturally-sensitive psycholinguistics knowledge in future
  • 51. @hemant_pt Problem 3. Group Engagement Model — Engagement: degree of involvement in discussion — Reliable groups: stay focused and collectively behave to diverge on topics. How can organizations find reliable groups to engage for action? Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014
  • 52. @hemant_pt Problem 3. Group Engagement Model — Engagement: degree of involvement in discussion — Reliable groups: stay focused and collectively behave to diverge on topics — Why & How do groups collectively evolve over time? 1. Define a group from interaction network, g 2. Define Divergence of g: content based in contrast to structure 3. Predict change in the divergence between time slices — Features of g based on theories of social identity, & cohesion. Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014
  • 53. @hemant_pt Group Engagement Model: Integrated Approach Unlike Prior Work People (User): Participant of the discussion Content (Text): Topic of Interest Network (Community): Group around topic AND AND Sources: tupper-lake.com/.../uploads/Community.jpg http://www.iconarchive.com/show/people-icons-by-aha-soft/user-icon.html KEY POINT: capture User Node Diversity 53
  • 54. @hemant_pt Group Engagement Model: Discussion Divergence — Candidate Group: Detect in interaction network — Group Discussion Divergence: Jensen-Shannon Divergence of topic distribution on group members’ tweets, where H(*) = Shannon Entropy, Bt = latent topic distribution of each tweet t among all members’ tweets Tg, and Bg = mean topic distribution of group g (equation not captured in the transcript)
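The divergence formula itself is not preserved in this transcript, so the sketch below uses the standard generalized Jensen-Shannon divergence with uniform weights, which matches the symbols defined on the slide: the entropy H of the mean topic distribution Bg minus the mean entropy of the per-tweet distributions Bt. Whether the talk's measure is exactly this form is an assumption, and the topic distributions are taken as already inferred (for example by a topic model).

```python
import math


def entropy(p):
    """Shannon entropy H(p) in bits, skipping zero-probability topics."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def group_divergence(tweet_topic_dists):
    """Generalized JS divergence: H(mean of Bt) minus the mean of H(Bt) over a group's tweets."""
    n = len(tweet_topic_dists)
    k = len(tweet_topic_dists[0])
    mean_dist = [sum(d[i] for d in tweet_topic_dists) / n for i in range(k)]  # Bg
    return entropy(mean_dist) - sum(entropy(d) for d in tweet_topic_dists) / n


if __name__ == "__main__":
    focused = [[0.9, 0.1], [0.9, 0.1]]
    scattered = [[1.0, 0.0], [0.0, 1.0]]
    print(group_divergence(focused), group_divergence(scattered))
```

A group whose tweets all share one topic mix scores near zero, while a group whose tweets each cover a different topic scores high, which is the sense in which the measure captures how far a group's discussion diverges.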
  • 55. @hemant_pt Lessons 1. A content-divergence-based measure helps explain why groups collectively diverge — Less divergent groups write more social & future-action-related content 2. Emerging events such as disasters have higher correlation with social identity-driven features → Role of social context
  • 57. @hemant_pt DISASTER Event Application-1: Filter Content for Disaster Response CITIZEN Sensors RESPONSE Organizations Me  and  @CeceVancePR  are  coordinating  a  clothing/ food  drive  for  families  affected  by  Hurricane  Sandy.   If  you  would  like  to  donate,  DM  us       Does  anyone  know  how  to  donate  clothes  to   hurricane  #Sandy  victims?   [SEEKING   [OFFERING   Intent-Classifiers as a Service 57
  • 58. @hemant_pt Broader Impact: Classifier Model integrated by Crisis Mapping Pioneer 58
  • 59. @hemant_pt DISASTER Event Application-2: “We TRUST people!” User engagement tool CITIZEN Sensors RESPONSE Organizations Tool to mine Important users 59
  • 60. @hemant_pt Broader Impact: Winner of Int’l Challenge: UN ITU Young Innovators 2014 60
  • 61. @hemant_pt Articulation ENGAGEMENT MODELING INTENT MINING COOPERATIVE SYSTEM 61 ORGANIZATIONS   CITIZEN  SENSOR  COMMUNITIES   Awareness Q1. Who to engage first? Org. Actor Q2. What are Resource needs & availabilities? Org. Actor
  • 62. @hemant_pt Limitations & Future Work — Cooperative System — CSCW application specific to the crisis domain → How to create a full What-Where-When-Who knowledge base — Intent Mining — Non-cooperation-assistive intent classes and the temporal drift of intent not considered → How to mine actor-level intent beyond document level — Group Engagement — Reliable prioritized groups based on correlation, not causality — Interplay of offline and online interactions beyond the scope → How to incorporate intent in the group divergence — Bipartite Intent Graph Matching — Reducing time complexity of Seeking vs. Offering matching
  • 63. @hemant_pt Conclusion Prior knowledge, and interplay of features of users, their content, and network efficiently model Intent & Engagement for cooperation between citizen sensors and organizations in the online social communities. 63
  • 64. @hemant_pt Thanks to the Committee Members 64 [Left to Right] Prof. Amit Sheth, (advisor, WSU), Prof. Guozhu Dong (WSU), Prof. Srinivasan Parthasarathy (OSU), Prof. TK Prasad (WSU), Dr. Patrick Meier (QCRI), Prof. Valerie Shalin (WSU) Computer Science Social Science
  • 65. @hemant_pt Acknowledgement, Thanks and Questions — NSF SoCS grant IIS-1111182 to support this work — Interdisciplinary Mentors especially Prof. John Flach (WSU), Drs. Carlos Castillo (QCRI), Fernando Diaz (Microsoft), Meena Nagarajan (IBM) — Kno.e.sis team especially Andrew Hampton from Psychology dept. and Shreyansh and Tanvi from CSE at Wright State, as well as Yiye Ruan (now Google) & David Fuhry at the Data Mining Lab, Ohio State University — Colleagues: Digital Volunteers from the CrisisMappers network, StandBy Task Force, InCrisisRelief.org, info4Disasters, Humanity Road, Ushahidi, etc. and the subject matter experts at UN FPA
  • 66. @hemant_pt Ambiguity Sparsity Diversity Scalability •  Mutual Influence in Sparse Friendship Network [AAAI ICWSM’12] •  User Summarization with Sparse Profile Metadata [ASE SocialInfo’12] •  Matching intent as task of Information Retrieval [FM’14] •  Knowledge-aware Bi-partite Matching [In preparation] •  Short-Text Document Intent Mining [FM’14, JCSCW’14] •  Actor-Intent Mining Complexity [In preparation] •  Modeling Group Using Diverse Social Identity & Cohesion [AAAI ICWSM’14] •  Modeling Diverse User- Engagement [SOME WWW’11, ACM WebSci’12] (Interpretation) (users) (behaviors) 66 Other works