Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation with Organizations

Mining Citizen Sensor Communities to Improve
Cooperation with Organizational Actors
June 23 2015
PhD Defense
Hemant Purohit (Advisor: Prof. Amit Sheth)

Kno.e.sis, Dept. of CSE, Wright State University, USA

@hemant_pt
Outline
—  Citizen Sensor Communities & Organizations
—  Cooperative System Design Challenges
—  Contributions
—  Problem 1. Conversation Classification using Offline Theories
—  Problem 2. Intent Classification
—  Problem 3. Engagement Modeling
—  Applications
—  Limitations & Future Work
2

@hemant_pt
Citizen Sensors: Access to Human
Observations & Interactions
Uni-directional communication
(TO people)
Unstructured, Unconstrained Language Data
•  Ambiguity
•  Sparsity
•  Diversity
•  Scalability
Bi-directional
(BY people, TO people)
Web 2.0
media
3

@hemant_pt
Goal: Data to Decision Making
Organizational Decision Making
Noisy Citizen Sensor data
4
SOCIAL SCIENCE
•  Experts on Organizations
•  Small-scale Data
COMPUTER SCIENCE
•  Experts on Mining
•  Large-scale data
Scope of My
Research

@hemant_pt
CITIZEN SENSOR COMMUNITIES
1.  No Structured Roles
2.  No Defined Tasks
✓  But “GENERATE” Massive Data

ORGANIZATIONS
1.  Structured Roles
2.  Defined Tasks
✓  COLLECT Data
✓  Process, & Make Decisions

COOPERATIVE SYSTEM
“Can you help us?”
“Sure! How to help?”

5

@hemant_pt
Computer-Supported Cooperative Work (CSCW) Matrix
[Johansen 1988, Baecker 1995]
[Figure: CSCW matrix with axes TIME and PLACE]
6

@hemant_pt
Challenges
[Diagram labels: Articulation (Malone & Crowston 1990; Schmidt & Bannon 1992); Awareness; ENGAGEMENT MODELING; INTENT MINING; DATA PROBLEM; DESIGN PROBLEM; COOPERATIVE SYSTEM between ORGANIZATIONS and CITIZEN SENSOR COMMUNITIES]
Q1. Who to engage first? (Org. Actor)
Q2. What are resource needs & availabilities? (Org. Actor)
7

@hemant_pt
Research Questions
—  Can general theories of offline conversation be
applied in the online context?
—  Can we model intentions to inform organizational
tasks using knowledge-guided features?
—  Can we find reliable groups to engage by modeling
collective group divergence using a content-based
measure?
8

@hemant_pt
Thesis: Statement
Prior knowledge, and the interplay of features of users, their content, and network,
efficiently model Intent & Engagement
for cooperation of citizen sensor communities.
Scope of Concepts
•  Intent: aim of action, e.g., offering help
•  Engagement: involvement in activity, e.g., participating in discussion
9

@hemant_pt
Contributions
1.  Operationalized computing in cooperative system design
—  by accommodating articulation in Intent Mining, and
—  enriching awareness by Engagement Modeling
2.  Improved computation of online social data
—  by incorporating features from offline social theoretical knowledge
3.  Improved performance of intent classification
—  by fusing top-down & bottom-up data representations
4.  Improved explanation of group engagement
—  by modeling content divergence to complement existing structural measures
10

@hemant_pt
Data: Scope
—  Social Platform: Twitter
—  Important bridge between citizens & organizations
—  Characteristics
—  Users: follow/subscribe
—  Content: status updates (140 chars max)
—  Network: directed
—  Platform conversation functions
—  Reply
—  Retweet
—  Mention
11

@hemant_pt
Outline
—  Awareness: tackle via Engagement Modeling
—  Articulation: tackle via Intent Mining
12

@hemant_pt
User1. Analyzing #Conversations on Twitter. Using platform provided
functions #REPLY, #RT, and #Mention.
..
…
……..
User2. I kinda feel one might need more than just the platform fn -- @User1 u
can think #Psycholinguistics, dude!
Problem 1. Conversation Classification
—  Functions of Reply, Retweet, and Mention reflect conversation
13
R1. Can general theories of conversation be applied in the online context?

@hemant_pt
Problem 1. Conversation Classification
—  Functions of Reply, Retweet, and Mention reflect conversation
—  Task: Given a set S of messages mi, classify a sample {mi}
for {RP, None}, {RT, None}, {MN, None}, where
—  Ground-truth corpora
—  RP = { mi | has_Reply_function (mi) = True }
—  RT = { mi | has_Retweet_function (mi) = True }
—  MN = { mi | has_Mention_function (mi) = True }
—  None = S – (RP ∪ RT ∪ MN)
—  Sample {mi} size = 3, based on average Reply conversation size
14

@hemant_pt
Conversation Classification: Offline
Theories
—  Psycholinguistics Indicators [Clark & Gibbs, 1986, Chafe 1987, etc.]
—  Determiners (‘the’ vs. ‘a/an’)
—  Dialogue Management (e.g., ‘thanks’, ’anyway’), etc.
—  Drawback
—  Offline analysis focused on positive conversation instances
—  Hypotheses
—  Offline theoretic features are discriminative
—  Such features correlate with information density
15

@hemant_pt
Conversation Classification: Feature
Examples
16
Hj | CATEGORY | Hj-SET
H1 | Determiners | the
H3 | Subject pronouns | she, he, we, they
H9 | Dialogue management indicators | thanks, yes, ok, sorry, hi, hello, bye, anyway, how about, so, what do you mean, please, {could, would, should, can, will} followed by pronoun
H11 | Hedge words | kinda, sorta
•  Feature_Hj (mi) = term-frequency ( Hj-set, mi )
•  Normalized
•  Total 14 feature categories
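
The term-frequency features above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the lexicons are abbreviated to the single-word entries shown on the slide (multi-word indicators such as "how about" are left out), the dictionary keys are hypothetical names, and normalizing by message length is an assumption, since the slide only says "Normalized".

```python
import re

# Hypothetical, abbreviated lexicons: only single-word entries from the
# slide are included; the full study used 14 feature categories.
H_SETS = {
    "H1_determiners": {"the"},
    "H3_subject_pronouns": {"she", "he", "we", "they"},
    "H9_dialogue_management": {"thanks", "yes", "ok", "sorry", "hi",
                               "hello", "bye", "anyway", "please"},
    "H11_hedges": {"kinda", "sorta"},
}


def tokenize(message):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def feature_vector(message):
    """Term frequency of each H_j set, normalized by message length (assumed)."""
    tokens = tokenize(message)
    total = len(tokens) or 1  # avoid division by zero on empty messages
    return {name: sum(tok in words for tok in tokens) / total
            for name, words in H_SETS.items()}


example = "I kinda feel one might need more than just the platform fn"
features = feature_vector(example)
```

Each message then becomes a short numeric vector that a decision tree or any other classifier can consume.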

@hemant_pt
Conversation Classification: Results
—  Dataset
—  Tweets from 3 Disasters, and 3 Non-Disaster events
—  Varying set size (3.8K – 609K), time periods
—  Classifier:
—  Decision Tree
—  Evaluation: 10-fold Cross Validation
—  Accuracy: 62% - 78% [Lowest for {Mention,None} ]
—  AUC range: 0.63 - 0.84
17
Purohit, Hampton, Shalin, Sheth & Flach. In Journal of Computers in Human Behavior, 2013

@hemant_pt
Conversation Classification:
Discriminative Features
—  Consistent top features across classifiers
—  Pronouns (e.g., you, he)
—  Dialogue management (e.g., thanks)
—  Determiners (e.g., the)
—  Word counts
—  Positively correlated with RP, RT, MN
—  Correlation Coefficient up to 0.69
18

@hemant_pt
Psycholinguistic Analysis
—  LIWC: Tool for deeper content analysis [Pennebaker, 2001]
—  Gives a measure per psychological category
—  Categories of interest
—  Social Interaction
—  Sensed Experience
—  Communication
—  Analyzed output sets in confusion matrices
Ø  Higher values for positive classified conversation
Ø suggests higher information for cooperative intent
19
Purohit, Hampton, Shalin, Sheth & Flach. In Journal of Computers in Human Behavior, 2013
[Figure: LIWC category values per confusion-matrix cell: True Positive, False Negative, False Positive, True Negative]

@hemant_pt
Lessons
1.  Offline theoretic features of conversations exist in the
online environment
➢  Can be applied for computing social data
2.  Such features correlate with information density in content
➢  Reflection of conversation for an intent
20

@hemant_pt
Outline
21

@hemant_pt
Thesis: Statement
efficiently model
Intent & Engagement
22

@hemant_pt
Short-text Document Intent
—  Intent: Aim of action
DOCUMENT | INTENT
Text REDCROSS to 90999 to donate 10$ to help the victims of hurricane sandy | SEEKING HELP
Anyone know where the nearest #RedCross is? I wanna give blood today to help the victims of hurricane Sandy | OFFERING HELP
Would like to urge all citizens to make the proper preparations for Hurricane #Sandy - prep is key - http://t.co/LyCSprbk has valuable info! | ADVISING
23

@hemant_pt
Short-text Document Intent
—  Intent: Aim of action
24
How to identify relevant intent from ambiguous, unconstrained
natural language text?
Relevant intent ⇒ Articulation of organizational tasks
(e.g., Seeking vs. Offering resources)

@hemant_pt
Intent Classification: Problem
Formulation
—  Given a set of user-generated text documents, identify
existing intents
—  Variety of interpretations
—  Problem statement: a multi-class classification task
approximate f: S → C, where
C = {c1, c2, …, cK} is a set of K predefined intent classes, and
S = {m1, m2, …, mN} is a set of N short text documents
Focus - Cooperation-assistive intent classes, C= {Seeking, Offering, None}
25

@hemant_pt
Intent Classification: Related Work
TEXT CLASSIFICATION TYPE | FOCUS | EXAMPLE
Topic | predominant subject matter | sports or entertainment
Sentiment/Emotion/Opinion | focus on present state of emotional affairs | negative or positive; happy emotion
Intent | focus on action, hence, future state of affairs | offer to help after floods
e.g., I am going to watch the awesome Fast and Furious movie!! #Excited
26

@hemant_pt
Intent Classification: Related Work
DATA TYPE | APPROACH FOCUS | LIMITED APPLICABILITY
Formal text on Webpages/blogs (Kröll and Strohmaier 2009, -15; Raslan et al. 2013, -14) | Knowledge Acquisition: via Rules, Clustering | Lack of large corpora with proper grammatical structure; poor quality text hard to parse for dependencies
Commercial Reviews, marketplace (Hollerit et al. 2013, Wu et al. 2011, Ramanand et al. 2010, Carlos & Yalamanchi 2012, Nagarajan et al. 2009) | Classification: via Rules, Lexical template based, Pattern | More generalized intents (e.g., ‘help’ broader than ‘sell’); patterns more implicit to capture than for buying/selling
Search Queries (Broder 2002, Downey et al. 2008, Case 2012, Wu et al. 2010, Strohmaier & Kröll 2012) | User Profiling: Query Classification | Lack of large query logs, click graphs; existence of social conversation
27

@hemant_pt
Intent Classification: Challenges
—  Unconstrained Natural Language in small space
—  Ambiguity in interpretation
—  Sparsity: low ‘signal-to-noise’ ratio, imbalanced classes
—  1% signals (Seeking/Offering) in 4.9 million tweets #Sandy
—  Hard-to-predict problem:
—  commercial intent, F-1 score 65% on Twitter [Hollerit et al. 2013]
@Zuora wants to help @Network4Good with Hurricane Relief. Text SANDY to
80888 & donate $10 to @redcross @AmeriCares & @SalvationArmyUS #help
*Blue: offering intent, *Red: seeking intent
28

@hemant_pt
Intent Classification: Types & Features
29
Intent
Binary
Crisis Domain:
- [Varga et al. 2013] Problem vs. Aid (Japanese)
- Features: Syntactic, Noun-Verb templates, etc.
Commercial Domain:
- [Hollerit et al. 2013] Buy vs. Sell intent
- Features: N-grams, Part-of-Speech
Multiclass
Commercial Domain:
-  Not on Twitter

@hemant_pt
TOP-DOWN
Pattern Rules:
Declarative Knowledge
(patterns defined for intent association)
BOTTOM-UP
Bag of N-grams Tokens:
Independent Tokens
(patterns derived from the data)
Our
Hybrid
Approach
Learning
Improves
Expressivity
Increases
30

@hemant_pt
Intent Classification Top-Down:
Binary Classifier - Prior Knowledge
—  Conceptual Dependency Theory [Schank, 1972]
—  Make meaning independent from the actual words in input
—  e.g., Class in an Ontology abstracts similar instances
—  Verb Lexicon [Hollerit et al. 2013]
—  Relevant Levin’s Verb categories [Levin, 1993]
—  e.g., give, send, etc.
—  Syntactic Pattern
—  Auxiliary & modals: e.g., ‘be’, ‘do’, ‘could’, etc. [Ramanand et al. 2010]
—  Word order: Verb-Subject positions, etc.
Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014

31

@hemant_pt
Binary Classifier – Psycholinguistic Rules
—  Transform knowledge into rules
—  Examples:
(Pronouns except 'you' = yes) ^ (need/want = yes) ^ (Adjective = yes/no) ^ (Things = yes) → Seeking
(Pronoun except 'you' | Proper Noun = yes) ^ (can/could/would/should = yes) ^ (Levin Verb = yes) ^ (Determiner = yes/no) ^ (Adjective = yes/no) ^ (Things = yes) → Offering
Domain ontology
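
As a rough illustration of how rules of this shape could be evaluated, the sketch below checks the mandatory conditions of the two example rules against small word sets. Every lexicon here is a hypothetical stand-in (the slide points to Levin verb classes and a domain ontology for "Things"), optional conditions marked yes/no are skipped, and the proper-noun alternative is omitted because it would need a part-of-speech tagger.

```python
import re

# Hypothetical stand-ins for the resources named on the slide.
PRONOUNS = {"i", "we", "he", "she", "they"}               # pronouns except 'you'
NEED_WORDS = {"need", "want"}
MODALS = {"can", "could", "would", "should"}
LEVIN_VERBS = {"give", "send", "donate", "bring"}         # sample verbs only
THINGS = {"clothes", "food", "water", "blood", "money"}   # would come from a domain ontology


def classify(message):
    """Apply the two example rules; optional (yes/no) conditions are ignored."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    has_pronoun = bool(tokens & PRONOUNS)
    has_thing = bool(tokens & THINGS)
    if has_pronoun and tokens & NEED_WORDS and has_thing:
        return "Seeking"
    if has_pronoun and tokens & MODALS and tokens & LEVIN_VERBS and has_thing:
        return "Offering"
    return "None"
```

A set-membership check like this ignores word order, which the original rules may well rely on; it is meant only to show how the conjunctions combine.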
32
Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014

@hemant_pt
Binary Classifier - Lessons
—  Preliminary Study
—  2000 tweets, classified first as conversation and then by the
rule-based classifier, labeled by two native speakers
—  Labels: Seeking, Offering, None
—  Results
—  Avg. F-1 score: 78% (Baseline F-1 score: 57% [Varga et al. 2013] )
—  Lessons
—  Role of prior knowledge: Domain Independent & Dependent
—  Limitation: needs an exhaustive rule-set, low Recall; Ambiguity
addressed, but not sparsity

Purohit, Hampton, Bhatt, Shalin, Sheth & Flach. In Journal of CSCW, 2014

33

@hemant_pt
TOP-DOWN
Pattern Rules:
Declarative Knowledge
BOTTOM-UP
Independent Tokens
Hybrid
Approach
34

@hemant_pt
Intent Classification Hybrid:
Binary Classifier - Design
—  AMBIGUITY: addressed via rich feature space
1. Top-Down: Declarative Knowledge Patterns [Ramanand et al. 2010]
DK(mi, P) → {0,1}
e.g., P = \b(like|want)\b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b
(acquired via Red Cross expert searches)
2. Abstraction: due to importance in info sharing [Nagarajan et al. 2010]
-  Numeric (e.g., $10) → _NUM_
-  Interactions (e.g., RT & @user) → _RT_ , _MENTION_
-  Links (e.g., http://bit.ly) → _URL_
3. Bottom-Up: N-grams after stemming and abstraction [Hollerit et al. 2013]
TOKENIZER ( mi ) → { bi-, tri-gram }
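
The three feature sources on this slide can be combined as in the following sketch. It is an illustration under stated assumptions rather than the published pipeline: the function names are hypothetical, stemming is skipped (the standard library has no stemmer), and the abstraction regexes are simple approximations of the placeholder substitutions listed above.

```python
import re

# The declarative pattern from the slide, with word boundaries written as \b.
DK_PATTERN = re.compile(
    r"\b(like|want)\b.*\b(to)\b.*\b(bring|give|help|raise|donate)\b")


def abstract(message):
    """Replace links, retweet markers, mentions, and numbers with placeholders."""
    text = message.lower()
    text = re.sub(r"https?://\S+", "_URL_", text)      # links first: they may contain digits
    text = re.sub(r"\brt\b", "_RT_", text)
    text = re.sub(r"@\w+", "_MENTION_", text)
    text = re.sub(r"\$?\d+(?:\.\d+)?", "_NUM_", text)
    return text


def ngrams(tokens, n):
    """Contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hybrid_features(message):
    """One binary top-down feature (DK) plus bottom-up bi- and tri-gram counts."""
    text = abstract(message)
    tokens = re.findall(r"\w+", text)
    feats = {"DK": 1 if DK_PATTERN.search(text) else 0}
    for n in (2, 3):
        for gram in ngrams(tokens, n):
            feats[gram] = feats.get(gram, 0) + 1
    return feats


msg = "RT @user I want to donate $10 http://bit.ly/abc"
feats = hybrid_features(msg)
```

Abstracting before tokenizing means that n-grams such as "donate _NUM_" generalize across messages that mention different amounts.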
35

@hemant_pt
Binary Classifier - Design
—  SPARSITY: addressed via algorithmic choices
1.  Feature Selection
2.  Ensemble Learning
3.  Classifier Chain
36
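[Figure: classifier chain. DATASET with knowledge-driven features (X^T, y); models m_1 and m_2, trained on (X_1^T, y_1) and (X_2^T, y_2), output P(c1) and P(c2); the diagram also labels 1 - P(c1)]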

@hemant_pt
Binary Classifier - Experiments
—  Binary classifiers:
—  Seeking vs. not Seeking
—  Offering vs. not Offering
—  Dataset:
—  Candidate set: 4000 tweets classified as donation-related
—  Labels: min. 3 judges
—  Annotations: Seeking , Offering , None
37
Purohit, Castillo, Diaz, Sheth, & Meier. First Monday journal, 2014

@hemant_pt
Binary Classifier - Results
Experiments | Supervised Learning | Training Samples | Precision (*Baseline) | F-1 score | Class-labels
Seeking vs. (None’ + Offering) | RF (CR=50:1) | 3836 | 98% (*79%) | 46% (56%) | 56% requests
Offering vs. (None’) | RF (CR=9:2) | 1763 | 90% (*65%) | 44% (*58%) | 13% offers
RF = Random Forest ensemble
CR = Asymmetric false–alarm Cost Ratios for True:False
Evaluation : 10-fold CV
Notes:
-  Domain requires high precision more than recall
-  Scope for improving low recall
38
Purohit, Castillo, Diaz, Sheth, & Meier. First Monday journal, 2014

@hemant_pt
Multiclass Classifier - Generalization
—  Lessons from binary classification
—  Improvement by fusing top-down & bottom-up
—  Sparsity
—  Ambiguity (Seeking & Offering complementary)
—  addressed via improved data representation
Hypothesis: Knowledge-guided approach improves
multiclass classification accuracy
39

@hemant_pt
TOP-DOWN
Knowledge Patterns
(DK) Declarative
(SK) Social Behavior
(CTK, CSK) Contrast Patterns
BOTTOM-UP
(T) Independent Tokens
Hybrid
Approach
40

@hemant_pt
Multiclass Classifier – Feature Creation
1. (T) Bag of Tokens - TOKENIZER(mi , min, max)
2. (DK) Declarative Knowledge Patterns
—  Domain expert guidance
—  Psycholinguistics syntactic & semantic rules
—  Expand by WordNet and Levin Verbs
e.g., (how = yes) ^ (Modal-Set 'can' = yes) ^ (Pronouns except 'you' = yes) ^ (Levin Verb-Set 'give' = yes)
Pj = Feature_Pj (mi) = 1 if Pj exists in mi , else 0
3. (SK) Social Knowledge Indicators
—  Offline conversation indicators studied in Problem 1
e.g., Hj = Dialogue Management, Hj-set = {Thanks, anyway,..}
Feature_Hj (mi) = term-frequency ( Hj-set, mi )
41

@hemant_pt
Multiclass Classifier - Feature Creation
4. (CTK) Contrast Knowledge Patterns
INPUT: corpus {mi} cleaned and abstracted, min. support, X
For each class Cj
—  Find contrasting pattern using sequential pattern mining
OUTPUT: contrast patterns set {P} for each class Cj
5. (CPK) Contrast Patterns: on Part-of-Speech tags of {mi}
42
e.g., unique sequential patterns:
SEEKING: help .* victim .* _url_ .*
OFFERING: anyon .* know .* cloth .*

@hemant_pt
Multiclass Classifier - Feature Creation
Finding CTK: Contrast Knowledge Patterns
For each class Cj
1.  Tokenize the cleaned, abstracted text of {mi }
2.  Mine Sequential Patterns: SPADE Algorithm
—  Output: sequences of token sets, {P’}
3.  Reduce to minimal sequences {P}
4.  Compute growth rate & contrast strength for P with all other Ck
5.  Top-K ranked {P} by contrast strength
OUTPUT: contrast patterns set {P} for each class Cj
43
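$$\mathrm{gr}(P, C_j, C_k) = \frac{\mathrm{support}(P, C_j)}{\mathrm{support}(P, C_k)} \quad (1)$$
$$\text{Contrast-Growth}(P, C_j, C_k) = \frac{1}{|C_j| - 1} \sum_{C_k,\, k \neq j} \frac{\mathrm{gr}(P, C_j, C_k)}{1 + \mathrm{gr}(P, C_j, C_k)} \quad (2)$$
$$\text{Contrast-Strength}(P, C_j) = \mathrm{support}(P, C_j) \cdot \text{Contrast-Growth}(P, C_j, C_k) \quad (3)$$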
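
The scoring step, equations (1) to (3), can be sketched as follows. The sketch does not mine patterns (SPADE is not implemented here); it only scores a given candidate pattern. Several choices are assumptions: support is taken as the fraction of a class's documents containing the pattern as an ordered subsequence, the normalizer in equation (2) is read as averaging over the other classes, and an infinite growth rate contributes its limit value of 1. The toy corpus is invented for illustration.

```python
def is_subsequence(pattern, tokens):
    """True if the pattern tokens occur in order (gaps allowed) in tokens."""
    it = iter(tokens)
    return all(tok in it for tok in pattern)


def support(pattern, docs):
    """Fraction of documents containing the pattern as a subsequence."""
    if not docs:
        return 0.0
    return sum(is_subsequence(pattern, d) for d in docs) / len(docs)


def growth_rate(pattern, docs_j, docs_k):
    """Equation (1); infinite when the pattern never occurs in class k."""
    s_j, s_k = support(pattern, docs_j), support(pattern, docs_k)
    if s_k == 0:
        return float("inf") if s_j > 0 else 0.0
    return s_j / s_k


def contrast_strength(pattern, j, corpus):
    """Equations (2)-(3); corpus maps class name -> token lists (>= 2 classes)."""
    others = [k for k in corpus if k != j]
    total = 0.0
    for k in others:
        gr = growth_rate(pattern, corpus[j], corpus[k])
        total += 1.0 if gr == float("inf") else gr / (1.0 + gr)
    return support(pattern, corpus[j]) * (total / len(others))


# Invented toy corpus of stemmed, abstracted token lists.
corpus = {
    "Seeking": [["help", "victim", "_url_"],
                ["pleas", "help", "sandi", "victim", "_url_"]],
    "Offering": [["anyon", "know", "donat", "cloth"],
                 ["help", "find", "shelter"]],
    "None": [["stay", "safe"],
             ["help", "victim", "stori", "_url_"]],
}
pattern = ("help", "victim", "_url_")
score = contrast_strength(pattern, "Seeking", corpus)
```

Ranking the candidate patterns of each class by this score and keeping the top K would correspond to step 5 of the procedure.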

@hemant_pt
Binarization Frameworks for Multiclass Classifier: 1 vs. All
[Figure: CORPUS (set of short text documents, S); FEATURES (knowledge-driven features, (X^T, y)); models M_1, M_2, …, M_K, trained on (X_1^T, y_1), (X_2^T, y_2), …, (X_K^T, y_K), output P(c1), P(c2), …, P(cK)]
Subset X_j^T ⊂ S such that X_j^T includes all the labeled instances of class Cj for model M_j
(In 1 vs. 1 framework: K*(K-1)/2 classifiers, for each Cj,Ck pair)
44
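
A compact way to see the 1 vs. All framework is to relabel the data once per class and train one binary model per relabeling. In the sketch below, `TokenScorer` is a deliberately trivial, hypothetical base learner standing in for the Random Forest used in the experiments, and the six training documents are invented.

```python
from collections import Counter
from itertools import combinations


def binarize(labels, positive):
    """1-vs-All relabeling: 1 for the positive class, 0 for all others."""
    return [1 if y == positive else 0 for y in labels]


class TokenScorer:
    """Toy base learner: the share of a document's tokens that were seen
    more often in positive than in negative training documents."""

    def fit(self, docs, y):
        self.pos, self.neg = Counter(), Counter()
        for tokens, label in zip(docs, y):
            (self.pos if label else self.neg).update(tokens)
        return self

    def score(self, tokens):
        hits = sum(self.pos[t] > self.neg[t] for t in tokens)
        return hits / len(tokens) if tokens else 0.0


def train_one_vs_all(docs, labels):
    """Train K binary models, one per class, each on the full relabeled data."""
    return {c: TokenScorer().fit(docs, binarize(labels, c))
            for c in sorted(set(labels))}


def predict(models, tokens):
    """Pick the class whose model gives the highest score."""
    return max(models, key=lambda c: models[c].score(tokens))


def one_vs_one_pairs(classes):
    """Class pairs a 1-vs-1 framework would need: K*(K-1)/2 of them."""
    return list(combinations(sorted(classes), 2))


docs = [["need", "food"], ["need", "water"], ["can", "give", "food"],
        ["can", "donate", "clothes"], ["stay", "safe"], ["nice", "weather"]]
labels = ["Seeking", "Seeking", "Offering", "Offering", "None", "None"]
models = train_one_vs_all(docs, labels)
```

With K = 3 intent classes both frameworks happen to need three classifiers; the counts diverge as K grows.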

@hemant_pt
Multiclass Classifier - Experiments
—  Datasets
—  Dataset-1: Hurricane Sandy, Oct 27 – Nov 7, 2012
—  Dataset-2: Philippines Typhoon, Nov 7 – Nov 17, 2013
—  Parameters
—  Base Learner M_j: Random Forest, 10 trees with 100 features
—  bi-, tri-gram for (T)
—  K=100% & min. support 10% for CTK, 50% for CPK
45

@hemant_pt
Intent Classification:
Multiclass Classifier – Results
Dataset-1 (Hurricane Sandy, 2012)
[Figure: bar chart of Avg. F-1 Score (10-fold CV), axis 56% to 70%, for frameworks 1-vs-1 and 1-vs-All over feature sets T (Baseline); T,DK (Declarative); T,SK (Social); T,CTK,CSK (Contrast); T,DK,SK,CTK,CSK]
Gain 7%, p < 0.05
46

@hemant_pt
Intent Classification:
Multiclass Classifier - Results
Dataset-2 (Philippines Typhoon, 2013)
[Figure: bar chart of Avg. F-1 Score (10-fold CV), axis 74% to 86%, for frameworks 1-vs-1 and 1-vs-All over feature sets T (Baseline); T,DK (Declarative); T,SK (Social); T,CTK,CSK (Contrast); T,DK,SK,CTK,CSK]
Gain 6%, p < 0.05
47

@hemant_pt
Lessons
1.  Top-down & bottom-up hybrid approach improves data
representation for learning (complementary) intent classes
—  The top 1% of discriminative features were 50% knowledge-driven
2.  Offline theoretic social conversation (SK) features (the, thanks,
etc.), often removed for text classification, are valuable for
intent.
3.  The effect of knowledge types (SK vs. DK vs. CTK/CPK) varies
across different types of real-world event datasets
➢  Culturally-sensitive psycholinguistics knowledge in future
48

@hemant_pt
Outline
49

@hemant_pt
Thesis: Statement
efficiently model
Intent & Engagement
50

@hemant_pt
Problem 3. Group Engagement Model
How can organizations find reliable groups to engage for action?
—  Engagement: degree of involvement in discussion
—  Reliable groups: stay focused and collectively behave to diverge on topics
51
Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014

@hemant_pt
Problem 3. Group Engagement Model
—  Engagement: degree of involvement in discussion
—  Reliable groups: stay focused and collectively behave to diverge on topics
—  Why & How do groups collectively evolve over time?
1.  Define a group from interaction network, g
2.  Define Divergence of g: content based in contrast to structure
3.  Predict change in the divergence between time slices
—  Features of g based on theories of social identity, & cohesion
52
Purohit, Ruan, Fuhry, Parthasarathy, & Sheth. ICWSM 2014

@hemant_pt
Group Engagement Model:
Integrated Approach Unlike Prior Work
People (User): Participant of the discussion
AND
Content (Text): Topic of Interest
AND
Network (Community): Group around topic
Sources: tupper-lake.com/.../uploads/Community.jpg
http://www.iconarchive.com/show/people-icons-by-aha-soft/user-icon.html
KEY POINT: capture
User Node Diversity
53

@hemant_pt
Group Engagement Model: Discussion Divergence
—  Candidate Group: Detect in interaction network
—  Group Discussion Divergence: Jensen-Shannon Divergence of topic
distribution on group members’ tweets
where, H(*) = Shannon Entropy
Bt = Latent topic distribution of each tweet t in all members’ tweets |Tg| ,
Bg = mean topic distribution of group g, such that:
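
The formula itself did not survive on this slide, so the sketch below rests on an assumption: that the measure is the standard generalized Jensen-Shannon divergence with equal weights, i.e. the entropy of the mean topic distribution B_g minus the mean entropy of the per-tweet distributions B_t. The per-tweet topic distributions are assumed to come from some topic model and are passed in as plain lists of probabilities.

```python
import math


def entropy(dist):
    """Shannon entropy (base 2) of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def mean_distribution(dists):
    """Component-wise mean of equally weighted topic distributions (B_g)."""
    n = len(dists)
    return [sum(col) / n for col in zip(*dists)]


def group_divergence(tweet_topics):
    """Assumed form: H(B_g) minus the average of H(B_t) over the group's tweets."""
    b_g = mean_distribution(tweet_topics)
    mean_h = sum(entropy(b) for b in tweet_topics) / len(tweet_topics)
    return entropy(b_g) - mean_h


# Two invented groups over two topics.
focused = [[1.0, 0.0], [1.0, 0.0]]      # every tweet on the same topic
divergent = [[1.0, 0.0], [0.0, 1.0]]    # tweets split across topics
```

Under this reading, a group whose tweets all concentrate on one topic scores 0, and the score grows as members' tweets spread over different topics.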
54

@hemant_pt
Lessons
1.  A content-divergence-based measure helps explain
why groups collectively diverge
—  Less-diverging groups write more social & future-action-related
content
2.  Emerging events such as disasters have higher correlation
with social identity-driven features
➢  Role of social context
55

@hemant_pt
Outline
56

@hemant_pt
Application-1: Filter Content for Disaster Response
[Diagram labels: DISASTER Event; CITIZEN Sensors; Intent-Classifiers as a Service; RESPONSE Organizations]
Me and @CeceVancePR are coordinating a clothing/food drive for families affected by Hurricane Sandy. If you would like to donate, DM us [SEEKING]
Does anyone know how to donate clothes to hurricane #Sandy victims? [OFFERING]
57

@hemant_pt
Broader Impact: Classifier Model
integrated by Crisis Mapping Pioneer
58

@hemant_pt
Application-2: “We TRUST people!” User engagement tool
[Diagram labels: DISASTER Event; CITIZEN Sensors; RESPONSE Organizations; Tool to mine Important users]
59

@hemant_pt
Broader Impact: Winner of Int’l Challenge: UN
ITU Young Innovators 2014
60

@hemant_pt
[Diagram labels: Articulation; Awareness; ENGAGEMENT MODELING; INTENT MINING; COOPERATIVE SYSTEM between ORGANIZATIONS and CITIZEN SENSOR COMMUNITIES]
Q1. Who to engage first? (Org. Actor)
Q2. What are resource needs & availabilities? (Org. Actor)
61

@hemant_pt
Limitations & Future Work
—  Cooperative System
—  CSCW application specific to the crisis domain
➢  How to create a full What-Where-When-Who knowledge base
—  Intent Mining
—  Intent classes that do not assist cooperation, and the temporal
drift of intent, are not considered
➢  How to mine actor-level intent beyond document level
—  Group Engagement
—  Reliable prioritized groups are based on correlation, not causality
—  Interplay of offline and online interactions is beyond the scope
➢  How to incorporate intent in the group divergence
—  Bipartite Intent Graph Matching
—  Reducing time complexity of Seeking vs. Offering matching
62

@hemant_pt
Conclusion
efficiently model
Intent & Engagement
for cooperation between citizen sensors and organizations in
the online social communities.
63

@hemant_pt
Thanks to the Committee Members
64
[Left to Right] Prof. Amit Sheth (advisor, WSU), Prof. Guozhu Dong (WSU), Prof. Srinivasan
Parthasarathy (OSU), Prof. TK Prasad (WSU), Dr. Patrick Meier (QCRI), Prof. Valerie Shalin (WSU)
Computer Science Social Science

@hemant_pt
Acknowledgement,
Thanks and Questions ☺
—  NSF SoCS grant IIS-1111182 to support this work
—  Interdisciplinary Mentors especially Prof. John Flach (WSU), Drs. Carlos
Castillo (QCRI), Fernando Diaz (Microsoft), Meena Nagarajan (IBM)
—  Kno.e.sis team especially Andrew Hampton from Psychology dept. and
Shreyansh and Tanvi from CSE at Wright State, as well as Yiye Ruan (now
Google) & David Fuhry at the Data Mining Lab, Ohio State University
—  Colleagues: Digital Volunteers from the CrisisMappers network, StandBy Task
Force, InCrisisRelief.org, info4Disasters, Humanity Road, Ushahidi, etc. and
the subject matter experts at UN FPA
65

@hemant_pt
Other works
Ambiguity, Sparsity, Diversity, Scalability
•  Mutual Influence in Sparse Friendship Network [AAAI ICWSM’12]
•  User Summarization with Sparse Profile Metadata [ASE SocialInfo’12]
•  Matching intent as task of Information Retrieval [FM’14]
•  Knowledge-aware Bi-partite Matching [In preparation]
•  Short-Text Document Intent Mining [FM’14, JCSCW’14]
•  Actor-Intent Mining Complexity [In preparation]
•  Modeling Group Using Diverse Social Identity & Cohesion [AAAI ICWSM’14]
•  Modeling Diverse User-Engagement [SOME WWW’11, ACM WebSci’12]
(Interpretation) (users) (behaviors)
66
