Video: https://www.youtube.com/watch?v=ZCToaDgxnAs
Abstract:
People's emotions can be gleaned from their text using machine learning techniques to build models that exploit large self-labeled emotion data from social media. Further, the self-labeled emotion data can be effectively adapted to train emotion classifiers in different target domains where training data are sparse.
Emotions are both prevalent in and essential to most aspects of our lives. They influence our decision-making, affect our social relationships and shape our daily behavior. With the rapid growth of emotion-rich textual content, such as microblog posts, blog posts, and forum discussions, there is a growing need to develop algorithms and techniques for identifying people's emotions expressed in text. Such techniques have valuable implications for studies of suicide prevention, employee productivity, well-being, customer relationship management, etc. However, emotion identification is quite challenging, partly for the following reasons: i) It is a multi-class classification problem that usually involves at least six basic emotions. Text describing an event or situation that causes an emotion can be devoid of explicit emotion-bearing words, so the distinction between different emotions can be very subtle, which makes it difficult to glean emotions purely from keywords. ii) Manual annotation of emotion data by human experts is very labor-intensive and error-prone. iii) Existing labeled emotion datasets are relatively small and fail to provide comprehensive coverage of emotion-triggering events and situations.
1. Ohio Center of Excellence in Knowledge-Enabled Computing
Automatic Emotion Identification
from Text
Wenbo Wang
Kno.e.sis Center
Advisor:
Dr. Amit P. Sheth
Committee members:
Dr. Keke Chen
Kevin Haas
Dr. T.K. Prasad
Dr. Ramakanth Kavuluru
Ph.D. Dissertation Defense
2. Ohio Center of Excellence in Knowledge-Enabled Computing
Sadness
Anger
Fear
Joy
Your emotions are the slaves to your thoughts,
and you are the slave to your emotions.
--Elizabeth Gilbert
3. Ohio Center of Excellence in Knowledge-Enabled Computing
S&P 500 dropped 1% …
Jon C. Ogg, credit
Stock Market
4. Ohio Center of Excellence in Knowledge-Enabled Computing
Employee Productivity
5. Ohio Center of Excellence in Knowledge-Enabled Computing
Subjective Well-being
Happiness Index
ECG
Physical State Emotional State
8. Ohio Center of Excellence in Knowledge-Enabled Computing
Emotion Identification
• Emotion
– “a strong feeling (such as love, anger, joy, hate, or fear)” --
Merriam-Webster Online Dictionary
• Emotion Identification
– the task of automatically identifying and extracting the
emotions expressed in a given text.
• Examples
“I hate when my mom compares me to my friends” -> Anger
“When I see a cop, no matter where I am or what I’m doing,
I always feel like every law I’ve broken is stamped all over
my body” -> Fear
9. Ohio Center of Excellence in Knowledge-Enabled Computing
Proposed Questions
• How to glean people’s emotions from their texts using machine
learning techniques?
• How to create large self-labeled emotion data from social media?
• How to improve emotion identification in target domains (e.g.,
blog, diary) by leveraging large self-labeled emotion data from
social media?
10. Ohio Center of Excellence in Knowledge-Enabled Computing
1. EMOTION CLASSIFICATION
Wenbo Wang, Lu Chen, Ming Tan, Shaojun Wang, Amit P. Sheth. Discovering Fine-
grained Sentiment in Suicide Notes. Biomedical Informatics Insights, 2012
Wenbo Wang, Lu Chen, Krishnaprasad Thirunarayan, Amit P. Sheth. Harnessing
Twitter ‘Big Data’ for Automatic Emotion Identification. 2012 ASE International
Conference on Social Computing (SocialCom 2012)
11. Ohio Center of Excellence in Knowledge-Enabled Computing
Background - Classification
Credit: nltk
12. Ohio Center of Excellence in Knowledge-Enabled Computing
Dataset Description
• Suicide notes
– 15 fine-grained emotions
– Training: 4,633 sentences
– Testing: 2,086 sentences
• Twitter data
– 7 emotions
– Training: ~250 K tweets
– Testing: 250 K tweets
13. Ohio Center of Excellence in Knowledge-Enabled Computing
Suicide Notes Dataset
Sentence example:
“I loved you and was proud of
you.”
Unigrams: i, love, you, and,
be, proud, of, you, .
Bigrams: I love, love you, you
and, and be, be proud, proud
of, of you, you .
The combination of unigrams and
bigrams performs the best among n-gram
features.
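As a rough illustration of the n-gram features listed on this slide, the sketch below builds unigrams and bigrams from a token list that is assumed to be already lowercased and lemmatized. The function name `ngram_features` is made up for this example; the slides do not show the actual feature-extraction code.

```python
def ngram_features(tokens, n_values=(1, 2)):
    """Return all n-grams (as space-joined strings) for each n in n_values."""
    features = []
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features

# Lemmatized tokens for the slide's sentence "I loved you and was proud of you."
tokens = ["i", "love", "you", "and", "be", "proud", "of", "you", "."]
features = ngram_features(tokens)

# Split the combined feature list back into unigrams and bigrams for inspection.
unigrams = [f for f in features if " " not in f]
bigrams = [f for f in features if " " in f]
```

Passing `n_values=(1,)` or `n_values=(2,)` gives the unigram-only or bigram-only variants that the combined setting is compared against.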
14. Ohio Center of Excellence in Knowledge-Enabled Computing
Suicide Notes Dataset
Sentence example:
“I loved you and was proud of
you.”
LIWC Knowledge:
Posemo: 2 (love, proud)
Negemo: 0
Anger: 0
Sad: 0
Adding knowledge-based features
further increases the performance.
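The knowledge-based features on this slide are category counts from a lexicon. A minimal sketch of that idea follows; LIWC itself is a licensed resource, so the tiny `TOY_LEXICON` below is an invented stand-in and its word lists are assumptions, not the real LIWC categories.

```python
# Illustrative stand-in lexicon (NOT the real LIWC word lists).
TOY_LEXICON = {
    "posemo": {"love", "proud", "happy"},
    "negemo": {"hate", "sad", "angry"},
    "anger": {"hate", "angry"},
    "sad": {"sad", "lonely"},
}

def lexicon_counts(tokens, lexicon=TOY_LEXICON):
    """Count how many tokens fall into each lexicon category."""
    return {
        category: sum(1 for token in tokens if token in words)
        for category, words in lexicon.items()
    }

# Lemmatized tokens for "I loved you and was proud of you."
tokens = ["i", "love", "you", "and", "be", "proud", "of", "you", "."]
counts = lexicon_counts(tokens)
```

With this toy lexicon the counts match the slide's example: two positive-emotion words and zero for the other categories.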
15. Ohio Center of Excellence in Knowledge-Enabled Computing
Suicide Notes Dataset
Sentence example:
“I loved you and was proud of
you.”
POS count:
Adjective: 1 (proud)
Noun: 0 ()
Pronoun: 3 (i, you)
…
Sentence tense:
Simple past tense: 2 (I loved,
was proud)
Adding sentence tenses and POS counts
further increases the performance
16. Ohio Center of Excellence in Knowledge-Enabled Computing
Twitter Dataset – Supervised Classifier
Using only adjectives as features performs poorly because
emotions can be expressed implicitly in text.
17. Ohio Center of Excellence in Knowledge-Enabled Computing
Twitter Dataset – Supervised Classifier
The combination of unigrams and bigrams
performs the best among n-gram features.
18. Ohio Center of Excellence in Knowledge-Enabled Computing
Twitter Dataset – Supervised Classifier
Knowledge features and syntactic features
become less important on Twitter data.
19. Ohio Center of Excellence in Knowledge-Enabled Computing
Challenge: The Lack of Training Data
• Emotion annotation is typically time-consuming,
expensive and error-prone.
– multiple emotion categories
– subtle and ambiguous emotion expressions
– Human judgement of emotion tends to be subjective and
varied.
• Most existing datasets are small, e.g.,
– Blog: 1,890 sentences (Aman and Szpakowicz 2008)
– Experience: 1,000 sentences (Neviarouskaya et al. 2010)
– Diary: 700 sentences (Neviarouskaya et al. 2011)
20. Ohio Center of Excellence in Knowledge-Enabled Computing
Why do We Need More Training Data? (I)
[Figure 1 from (Banko and Brill 2001): Learning Curves for Confusion Set Disambiguation. Test accuracy (y-axis, 0.70 to 1.00) vs. millions of words of training data (x-axis, 0.1 to 1000, log scale) for Memory-Based, Winnow, Perceptron and Naïve Bayes learners.]
“We may want to reconsider the
trade-off between spending
time and money on algorithm
development versus spending it
on corpus development”
-- (Banko and Brill 2001)
From (Banko and Brill 2001)
21. Ohio Center of Excellence in Knowledge-Enabled Computing
Why do We Need More Training Data? (II)
• Emotions arise in various situations, which leads to very
diverse expressions conveying the emotions.
“I hate when my mom compares me to my friends”
“When I see a cop, no matter where I am or what
I’m doing, I always feel like every law I’ve broken is
stamped all over my body”
“I hate when I get the hiccups in class”
“Omg I finally fit into one pair of my jeans from last
year!!”
“A dog barked at me!”
22. Ohio Center of Excellence in Knowledge-Enabled Computing
The Use of Hashtags on Twitter
“I hate when my mom compares me to my friends
#annoying”
“When I see a cop, no matter where I am or what
I’m doing, I always feel like every law I’ve broken is
stamped all over my body #nervous”
“I hate when I get the hiccups in class
#embarrassing”
“Omg I finally fit into one pair of my jeans from last
year!! #excited”
“A dog barked at me! #scared #weak”
23. Ohio Center of Excellence in Knowledge-Enabled Computing
2. SELF-LABELED DATA
CREATION
Wenbo Wang, Lu Chen, Krishnaprasad Thirunarayan, Amit P. Sheth. Harnessing
Twitter ‘Big Data’ for Automatic Emotion Identification. 2012 ASE International
Conference on Social Computing (SocialCom 2012)
24. Ohio Center of Excellence in Knowledge-Enabled Computing
Emotion Hashtags
• From the existing psychology literature (Shaver et al.
1987), we collected 7 sets of emotion words for 7 different
emotions: joy, sadness, anger, love, fear,
thankfulness, and surprise.
Emotion | Hashtag word examples (number of hashtag words) | Number of tweets
Joy | excited, happy, elated, proud (36) | 706,182
Sadness | sorrow, unhappy, depressing, lonely (36) | 616,471
Anger | irritating, annoyed, frustrate, fury (23) | 574,170
Love | affection, lovin, loving, fondness (7) | 301,759
Fear | fear, panic, fright, worry, scare (22) | 135,154
Thankfulness | thankfulness, thankful (2) | 131,340
Surprise | surprised, astonished, unexpected (5) | 23,906
Total | 131 | 2,488,982
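The class distribution in this table is heavily skewed, which matters when reading per-emotion results. The short check below recomputes the total and each emotion's share from the tweet counts in the table; the variable names are only for this example.

```python
# Tweet counts per emotion, copied from the table above.
tweets_per_emotion = {
    "joy": 706_182,
    "sadness": 616_471,
    "anger": 574_170,
    "love": 301_759,
    "fear": 135_154,
    "thankfulness": 131_340,
    "surprise": 23_906,
}

total = sum(tweets_per_emotion.values())
shares = {emotion: count / total for emotion, count in tweets_per_emotion.items()}

# Combined share of the three most frequent emotions.
top3_share = sum(shares[e] for e in ("joy", "sadness", "anger"))
```

The three most frequent emotions together account for roughly three quarters of the collected tweets, while surprise makes up about one percent.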
25. Ohio Center of Excellence in Knowledge-Enabled Computing
Removing Irrelevant Tweets
Hashtag count > 2
Emotion hashtag is not at the end
Word count < 5
Has URL or quotations
About 5 million tweets -> 2,488,982 tweets
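The four removal rules above can be sketched as a single filter function. This is an illustration only: whitespace tokenization, counting only non-hashtag tokens as words, the URL pattern, and the small `EMOTION_HASHTAGS` set are all assumptions, since the slide does not spell out these details.

```python
import re

URL_PATTERN = re.compile(r"https?://|www\.")
QUOTE_CHARS = ('"', "\u201c", "\u201d")

# Hypothetical subset of emotion hashtags, used only for this example.
EMOTION_HASHTAGS = {"annoying", "excited", "scared"}

def keep_tweet(text, emotion_hashtags=EMOTION_HASHTAGS):
    """Return True if the tweet survives all four removal rules."""
    tokens = text.split()
    if not tokens:
        return False
    hashtags = [t for t in tokens if t.startswith("#")]
    words = [t for t in tokens if not t.startswith("#")]
    last = tokens[-1]
    if len(hashtags) > 2:  # rule 1: more than two hashtags
        return False
    if not (last.startswith("#") and last[1:].lower() in emotion_hashtags):
        return False  # rule 2: emotion hashtag is not at the end
    if len(words) < 5:  # rule 3: fewer than five words
        return False
    if URL_PATTERN.search(text) or any(q in text for q in QUOTE_CHARS):
        return False  # rule 4: contains a URL or a quotation
    return True

kept = keep_tweet("I hate when my mom compares me to my friends #annoying")
```

After filtering, the trailing emotion hashtag would be stripped from the text and used as the tweet's label.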
26. Ohio Center of Excellence in Knowledge-Enabled Computing
Results with Increasing Training Data
[Figure: accuracy (y-axis, 0.4 to 0.65) vs. number of tweets in training data (x-axis, 1,000 to 1,991,184) for LIBLINEAR and MNB.]
Logistic Regression (LR)
Training instances: 1K -> 2M
Accuracy: 0.4341 -> 0.6557
Percentage gain = 51.05%
Other annotations on the plot: 0.5292, 0.6156, LR
27. Ohio Center of Excellence in Knowledge-Enabled Computing
Results with Increasing Training Data
[Figure: accuracy (y-axis, 0.4 to 0.65) vs. number of tweets in training data (x-axis, 1,000 to 1,991,184) for LIBLINEAR and MNB.]
Multinomial Naive Bayes (MNB)
Training instances: 1K -> 2M
Accuracy: 0.4580 -> 0.6350
Percentage gain = 38.65%
Other annotations on the plot: 0.5426, 0.6113, LR
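The percentage gains quoted on the two learning-curve slides are relative improvements in accuracy between the smallest and largest training sets. The arithmetic can be reproduced as follows; the helper name `percentage_gain` is just for this example.

```python
def percentage_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100

# Accuracy with 1K training tweets vs. about 2M training tweets.
lr_gain = percentage_gain(0.4341, 0.6557)   # logistic regression (LIBLINEAR)
mnb_gain = percentage_gain(0.4580, 0.6350)  # multinomial naive Bayes
```

Rounded to two decimals, these reproduce the 51.05% and 38.65% figures shown on the slides.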
28. Ohio Center of Excellence in Knowledge-Enabled Computing
Detailed Results
For three popular emotions (76.2% of the tweets), the classifier
achieves F-measures of over 64%.
29. Ohio Center of Excellence in Knowledge-Enabled Computing
Detailed Results
For three less popular emotions (22.8% of the tweets), precision
is higher than recall, and
the F-measures are over 43%.
30. Ohio Center of Excellence in Knowledge-Enabled Computing
What Have We Learned?
• We can automatically create training datasets for
emotion identification by leveraging emotion hashtags
on Twitter.
– A large amount of labeled data are collected with little effort
and cost
– Covers a variety of situations that elicit emotions
– Performance gain with increasing size of training data
• However, there is still a lack of labeled data in many
other domains/data sources.
31. Ohio Center of Excellence in Knowledge-Enabled Computing
New Challenge
Lots of labeled tweets
Far less labeled data in
many other domains
Can we use emotion-labeled tweets to help emotion
identification in other domains?
32. Ohio Center of Excellence in Knowledge-Enabled Computing
3. DOMAIN ADAPTATION FOR
EMOTION IDENTIFICATION
Wenbo Wang, Lu Chen, Keke Chen, Krishnaprasad Thirunarayan, Amit P. Sheth.
Domain Adaptation for Emotion Identification via Data Selection. Technical paper
(under review) 2015
33. Ohio Center of Excellence in Knowledge-Enabled Computing
Problem Definition
• Input
– Large amount of emotion-labeled tweets
– Small amount of labeled sentences from target
domains (e.g., blogs, fairy tales)
• Objective
– Select informative tweets and add them to target
domain training data, and train an adaptive classifier
for the target domain
34. Ohio Center of Excellence in Knowledge-Enabled Computing
The Bootstrapping Framework
Self-labeled tweets
Target domain labeled data
• Train classifier c
• Apply c to tweets
35. Ohio Center of Excellence in Knowledge-Enabled Computing
The Bootstrapping Framework
Target domain labeled data
Correctly
classified
Misclassified
• Train classifier c
• Apply c to tweets
• Identify informative tweets
from misclassified tweets
• Add them to target domain
training data
Why select from
misclassified tweets?
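The loop on this slide (train c, apply c to tweets, pick informative misclassified tweets, add them to the training data) can be sketched as below. Everything concrete here is an assumption made for illustration: the toy word-count classifier stands in for the real one, and the `informativeness` callable, `per_round`, and `rounds` parameters are hypothetical, since the slide does not state how many tweets are added per iteration or when the loop stops.

```python
from collections import Counter, defaultdict

def train(examples):
    """Toy classifier: per-label word counts (a stand-in for the real model)."""
    model = defaultdict(Counter)
    for tokens, label in examples:
        model[label].update(tokens)
    return model

def predict(model, tokens):
    """Pick the label whose training words overlap most with the tokens."""
    return max(sorted(model), key=lambda label: sum(model[label][t] for t in tokens))

def bootstrap(target_train, tweets, informativeness, per_round=1, rounds=3):
    """Iteratively move informative misclassified tweets into the training set."""
    train_set, pool = list(target_train), list(tweets)
    for _ in range(rounds):
        clf = train(train_set)                                  # train classifier c
        wrong = [tw for tw in pool if predict(clf, tw[0]) != tw[1]]  # apply c to tweets
        if not wrong:
            break
        # Rank misclassified tweets by the supplied informativeness score.
        wrong.sort(key=lambda tw: informativeness(clf, tw), reverse=True)
        chosen = wrong[:per_round]
        train_set.extend(chosen)                                # add to training data
        pool = [tw for tw in pool if tw not in chosen]
    return train_set

target_train = [(["i", "be", "sad"], "sadness"), (["so", "happy"], "joy")]
tweets = [(["so", "scared", "of", "dog"], "fear"), (["happy", "day"], "joy")]
result = bootstrap(target_train, tweets, informativeness=lambda clf, tw: len(tw[0]))
```

In this toy run only the misclassified "fear" tweet is added; the correctly classified "joy" tweet stays in the pool because it adds nothing the classifier does not already handle.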
36. Ohio Center of Excellence in Knowledge-Enabled Computing
Informativeness Overview
Consistency Diversity Similarity
37. Ohio Center of Excellence in Knowledge-Enabled Computing
Consistency
• Fear: “Amazing night with my baby. Hope she liked our
anniversary present. Alil early but whatever. :) hopefully tmmrw
goes as planned.”
– Top supporting features for emotion fear
– Top supporting features for any emotion other than fear
– Use the margin to estimate consistency:
0.5094 – 0.5962 = -0.0868
Consistency measures how consistent a tweet’s label is
with its content.
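The margin on this slide is a plain difference between two support scores. A minimal sketch of that arithmetic follows; how the two scores are derived from the top supporting features is not recoverable from the slide, so the function simply takes them as inputs, and its name is an assumption.

```python
def margin(support_for_label, support_for_others):
    """Positive when the content supports the given label more than the alternatives."""
    return support_for_label - support_for_others

# Scores from the slide's example tweet, which is hashtag-labeled as fear.
fear_margin = margin(0.5094, 0.5962)
```

A negative margin, as in this example, signals that the tweet's content supports some other emotion more than its hashtag label, so the label is likely inconsistent with the text.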
38. Ohio Center of Excellence in Knowledge-Enabled Computing
Diversity
• Sadness: “Searching for vinyl proved to be quite disappointing”
– “disappoint” occurs 2 times
• Sadness: “I'm about to lose everything I've ever wanted, my
whole world, and it's all my fault..”
– “lose” occurs 15 times
[Figure: diversity (y-axis, 0 to 1) as an exponential decay of a feature's term frequency in the target domain training data (x-axis, term_freq, 0 to 100). Marked points: 0.9048 (disappoint), 0.4724 (lose).]
Diversity encourages the selection of source instances containing
discriminative features that are infrequent or underrepresented in
the target domain.
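The slide describes diversity as an exponential decay of term frequency but does not show the formula. The two marked points (term frequency 2 gives 0.9048, term frequency 15 gives 0.4724) are consistent with exp(-tf / 20), so the sketch below uses that form; the decay scale of 20 is an inference from those two points, not a value stated in the source.

```python
import math

DECAY_SCALE = 20.0  # assumption: inferred from the two points marked on the plot

def diversity(term_freq, scale=DECAY_SCALE):
    """Exponential decay of a feature's term frequency in the target training data."""
    return math.exp(-term_freq / scale)

disappoint_score = diversity(2)   # "disappoint" occurs 2 times
lose_score = diversity(15)        # "lose" occurs 15 times
```

Features that are rare in the target training data score close to 1, so tweets containing them are favored during selection.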
39. Ohio Center of Excellence in Knowledge-Enabled Computing
Similarity Intuition
• Inspired by domain adaptation for machine translation
studies that select source instances similar to test
instances (Eck et al., 2004; Lu et al., 2007)
• Given a target test sentence
– Disgust: “im sick of look at a comput screen.”
• Retrieve most similar tweets
– Anger: “im sick and tire of look like a fool”
– Joy: “i have get usb fairi light around my comput screen .”
Content Similarity is not sufficient!
40. Ohio Center of Excellence in Knowledge-Enabled Computing
Similarity Overview
Content similarity   Label similarity   Uncertainty
41. Ohio Center of Excellence in Knowledge-Enabled Computing
Content Similarity
• Upweight important words
– Source instance:
– Target test instance: inverse document frequency
(idf)
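One common way to upweight important words is an idf-weighted cosine similarity, sketched below. This is an assumed formulation, not the one on the slide: the slide's weighting for source instances was not recoverable, so idf is applied to both sides here, and the smoothed idf (`log(N / df) + 1`) is likewise a choice made for this example.

```python
import math
from collections import Counter

def idf_table(documents):
    """Smoothed inverse document frequency for every term in the collection."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}

def content_similarity(source_tokens, target_tokens, idf):
    """Cosine similarity between idf-weighted term-count vectors."""
    src, tgt = Counter(source_tokens), Counter(target_tokens)
    vocab = set(src) | set(tgt)
    src_vec = {t: src[t] * idf.get(t, 0.0) for t in vocab}
    tgt_vec = {t: tgt[t] * idf.get(t, 0.0) for t in vocab}
    dot = sum(src_vec[t] * tgt_vec[t] for t in vocab)
    norm_src = math.sqrt(sum(v * v for v in src_vec.values()))
    norm_tgt = math.sqrt(sum(v * v for v in tgt_vec.values()))
    if norm_src == 0 or norm_tgt == 0:
        return 0.0
    return dot / (norm_src * norm_tgt)

# Stemmed sentences from the Similarity Intuition slide.
target = "im sick of look at a comput screen .".split()
anger_tweet = "im sick and tire of look like a fool".split()
joy_tweet = "i have get usb fairi light around my comput screen .".split()

idf = idf_table([target, anger_tweet, joy_tweet])
sim_anger = content_similarity(anger_tweet, target, idf)
sim_joy = content_similarity(joy_tweet, target, idf)
```

Both tweets overlap with the target sentence on surface words even though neither carries the target's emotion, which is why content similarity alone is not enough.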
42. Ohio Center of Excellence in Knowledge-Enabled Computing
Label Similarity
• Target test sentence
• Disgust: “im sick of look at a comput screen.”
• Source tweet
• Anger: “im sick and tire of look like a fool”
• How likely will the test sentence express anger?
• Apply the same formula used for Consistency factor
• Top supporting features for emotion anger
• Top supporting features for any emotion other than anger
• Use the margin to estimate label similarity: 0.5838 – 0.625 = -0.0412
43. Ohio Center of Excellence in Knowledge-Enabled Computing
Uncertainty
Sentence | Label | Predicted label | Classifier confidence | Uncertainty
the second day i go in and i be so paranoid . | Fear | Sadness | 0.2352 | 0.7648
we are total awesome! | Joy | Joy | 0.8683 | 0.1317
The more confident the classifier is, the more likely the prediction
is correct, and the less focus we should give to the sentence.
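The uncertainty values on this slide are the complement of the classifier's confidence in its predicted label. A minimal check of the two rows, with the function name chosen for this example:

```python
def uncertainty(confidence):
    """Uncertainty as one minus the classifier's confidence in its prediction."""
    return 1.0 - confidence

# (true label, predicted label, classifier confidence) from the slide.
rows = [("Fear", "Sadness", 0.2352), ("Joy", "Joy", 0.8683)]
uncertainties = [round(uncertainty(conf), 4) for _, _, conf in rows]
```

The misclassified, low-confidence sentence gets the higher uncertainty, so source tweets similar to it receive more weight during selection.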
44. Ohio Center of Excellence in Knowledge-Enabled Computing
Similarity Revisit
• Encourage the selection of source instances that share high
content and label similarities with target domain test instances
that classifier c is most uncertain about.
Content similarity   Label similarity   Uncertainty
45. Ohio Center of Excellence in Knowledge-Enabled Computing
Informativeness Revisit
• A tweet is informative when
– 1) its label is consistent with its content
– AND 2) it contains a discriminative feature that is infrequent in
target training data
– AND 3) it is similar to a target domain test instance whose
label cannot be predicted by the classifier c with high
confidence.
Consistency Diversity Similarity
Our proposed approach: CDS
46. Ohio Center of Excellence in Knowledge-Enabled Computing
Baseline approaches
• Source Only (SO): train classifiers using only Twitter data
• Target Only (TO): train classifiers using only target domain
training data
• Feature Injection (FI): first train a source classifier using only
source data (Daume III, 2007)
• Feature Augmentation (FA) (Daume III, 2007)
– Source instances: X -> (X, X, 0) (common, source, target)
– Target instances: X -> (X, 0, X) (common, source, target)
• Balance Weight (BW): assign larger weights to the target
instances so that the weight sum of target instances equals
that of source instances (Jiang and Zhai, 2007)
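Two of the baselines above are easy to make concrete. The sketch below shows the feature augmentation mapping (a common copy, a source-only copy and a target-only copy of each feature vector) and the per-instance target weight used by balance weighting. Plain Python lists and the function names are choices made for this example.

```python
def augment(x, domain):
    """Map a feature vector to its (common, source, target) augmented form."""
    zeros = [0.0] * len(x)
    if domain == "source":
        return x + x + zeros      # source copy filled, target copy zeroed
    if domain == "target":
        return x + zeros + x      # source copy zeroed, target copy filled
    raise ValueError("domain must be 'source' or 'target'")

def balanced_target_weight(n_source, n_target, source_weight=1.0):
    """Per-instance target weight that makes both domains' weight sums equal."""
    return source_weight * n_source / n_target

source_vec = augment([1.0, 2.0], "source")
target_vec = augment([1.0, 2.0], "target")
weight = balanced_target_weight(n_source=1000, n_target=10)
```

With 1,000 source tweets and 10 target sentences, each target instance would get weight 100 so that the two domains contribute equally to training.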
48. Ohio Center of Excellence in Knowledge-Enabled Computing
Experimental settings
• Features
– Experimented with unigrams, bigrams, and unigrams+bigrams
– Used unigrams in the end
• Logistic regression
– Fast; supports probability output (used for uncertainty)
• Five-fold cross validation
– Four folds for training; one fold for testing
• Add-0.5 smoothing
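For readers unfamiliar with the five-fold setup, here is a minimal sketch of how the folds can be formed: each fold serves once as the test set while the other four are used for training. The contiguous, unshuffled split and the helper name are simplifications for this example; the slides do not say how instances were assigned to folds.

```python
def k_fold_indices(n_items, k=5):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size

folds = list(k_fold_indices(10, k=5))
```

Reported scores would then be averaged over the five train/test rounds.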
49. Ohio Center of Excellence in Knowledge-Enabled Computing
Results on four target datasets*
Percentage gain: 8.01%, 24.07%, 36.53%, 3.62%, 16.45%
*: The numbers differ from those in the dissertation defense video because I fixed a bug afterwards; the results
improved slightly as a result.
50. Ohio Center of Excellence in Knowledge-Enabled Computing
Different Instance Selection Strategies
• CDS: select tweets from misclassified tweets
• CD: removed similarity factor from CDS
• CDS-ALL: select tweets from all source tweets
• CDS-CORR: select tweets from source tweets that can be
correctly classified by c
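The strategies above differ mainly in which source tweets are eligible for selection. The sketch below makes that explicit; the dictionary layout of a tweet and the function name are hypothetical, and treating CD as drawing from the same misclassified pool as CDS is an assumption (the slide only says it drops the similarity factor).

```python
def candidate_pool(strategy, tweets, predicted_labels):
    """Return the source tweets a selection strategy may choose from."""
    pairs = list(zip(tweets, predicted_labels))
    if strategy in ("CDS", "CD"):
        # Only tweets that classifier c gets wrong.
        return [tw for tw, pred in pairs if pred != tw["label"]]
    if strategy == "CDS-ALL":
        return list(tweets)
    if strategy == "CDS-CORR":
        # Only tweets that classifier c gets right.
        return [tw for tw, pred in pairs if pred == tw["label"]]
    raise ValueError(f"unknown strategy: {strategy}")

tweets = [
    {"text": "a", "label": "joy"},
    {"text": "b", "label": "fear"},
    {"text": "c", "label": "anger"},
]
predictions = ["joy", "sadness", "anger"]
cds_pool = candidate_pool("CDS", tweets, predictions)
```

The CDS and CDS-CORR pools partition the CDS-ALL pool, which is why CDS-ALL sees a superset of what CDS sees.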
51. Ohio Center of Excellence in Knowledge-Enabled Computing
Comparing instance selection strategies
Among all the strategies, CDS
improves F1 the fastest.
52. Ohio Center of Excellence in Knowledge-Enabled Computing
Comparing instance selection strategies
CDS-ALL achieves performance similar
to that of CDS but takes more iterations,
because the input of CDS-ALL is a
superset of the input of CDS.
53. Ohio Center of Excellence in Knowledge-Enabled Computing
Comparing instance selection strategies
CDS-CORR performs the worst because it
selects tweets from correctly classified tweets,
the knowledge of which might already exist in
target domains.
54. Ohio Center of Excellence in Knowledge-Enabled Computing
Summary
• People’s emotions can be gleaned from their texts using machine learning
techniques.
– The combination of n-grams (n=1,2), knowledge-based and syntactic features
achieves the best performance.
– Knowledge features and syntactic features become less important on large training
data.
• We can automatically create a large training dataset for emotion identification
by leveraging emotion hashtags on Twitter.
– A large amount of labeled data are collected with little effort and cost
– Covers a variety of situations that elicit emotions
– Performance gain with increasing size of training data
• This self-labeled emotion dataset can be used to improve emotion
identification in text from other domains/data sources.
– Domain adaptation via selecting tweets that are informative to the target domain
– Selecting source instances that the classifier cannot correctly classify works better.
– Informativeness of a source instance is measured by three factors: consistency,
diversity and similarity.
55. Ohio Center of Excellence in Knowledge-Enabled Computing
Publications
• Wenbo Wang, Lei Duan, Anirudh Koul, Amit P. Sheth. YouRank: Let User Engagement Rank
Microblog Search Results. In the Eighth International AAAI Conference on Weblogs and Social
Media (ICWSM'14) 2014
• Wenbo Wang, Lu Chen, Krishnaprasad Thirunarayan, Amit P. Sheth. Cursing in English on Twitter.
In ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW'14)
2014
• Amit Sheth, Ashutosh Jadhav, Pavan Kapanipathi, Lu Chen, Hemant Purohit, Gary Alan Smith, and
Wenbo Wang. "Twitris: A system for collective social intelligence." In Encyclopedia of Social
Network Analysis and Mining, pp. 2240-2253. Springer New York, 2014.
• Lu Chen, Wenbo Wang, Amit P. Sheth. Are Twitter Users Equal in Predicting Elections? A Study of
User Groups in Predicting 2012 U.S. Republican Presidential Primaries. In Proceedings of the
Fourth International Conference on Social Informatics (SocInfo'12) 2012
• Wenbo Wang, Lu Chen, Krishnaprasad Thirunarayan, Amit P. Sheth. Harnessing Twitter ‘Big Data’
for Automatic Emotion Identification. 2012 ASE International Conference on Social Computing
(SocialCom 2012), 2012
• Lu Chen, Wenbo Wang, Meenakshi Nagarajan, Shaojun Wang, Amit P. Sheth. Extracting Diverse
Sentiment Expressions with Target-dependent Polarity from Twitter. In Proceedings of the 6th
International AAAI Conference on Weblogs and Social Media (ICWSM), 2012
56. Ohio Center of Excellence in Knowledge-Enabled Computing
Publications
• Wenbo Wang, Lu Chen, Ming Tan, Shaojun Wang, Amit P. Sheth. Discovering Fine-grained
Sentiment in Suicide Notes. Biomedical Informatics Insights, 2012
• Ramakanth Kavuluru, Christopher Thomas, Amit Sheth, Victor Chan, Wenbo Wang, Alan Smith, An
Up-to-date Knowledge-Based Literature Search and Exploration Framework for Focused
Bioscience Domains, IHI 2012 - 2nd ACM SIGHIT Intl Health Informatics Symposium, January 28-
30, 2012.
• Wenbo Wang, Christopher Thomas, Amit Sheth, Victor Chan. Pattern-Based Synonym and
Antonym Extraction. 48th ACM Southeast Conference, ACMSE2010, Oxford Mississippi, April 15-
17, 2010
• Christopher J. Thomas, Wenbo Wang, Pankaj Mehra, Delroy Cameron, Pablo N. Mendes, and Amit
P. Sheth. What Goes Around Comes Around – Improving Linked Open Data through On-Demand
Model Creation. In: Proceedings of the WebSci10: Extending the Frontiers of Society On-Line, April
26-27th, 2010, Raleigh, NC: US.
• Ashutosh Jadhav, Wenbo Wang, Raghava Mutharaju, Pramod Anantharam, Vinh Nguyen, Amit P.
Sheth, Karthik Gomadam, Meenakshi Nagarajan, and Ajith Ranabahu, Twitris: Socially Influenced
Browsing, Semantic Web Challenge 2009, demo at 8th International Semantic Web Conference,
Oct. 25-29 2009, Washington, DC, USA
57. Ohio Center of Excellence in Knowledge-Enabled Computing
Patents & Proposal
• Wenbo Wang, Lei Duan. "Temporal User Engagement Features", U.S. Patent
No. 20,150,120,753. 30 Apr. 2015.
• Lu Chen, Wenbo Wang, Amit Sheth. "Topic-specific Sentiment Extraction", U.S.
Patent No. 20,140,358,523. 4 Dec. 2014.
• Context-Aware Harassment Detection on Social Media. NSF proposal
58. Ohio Center of Excellence in Knowledge-Enabled Computing
Special thanks to AFRL and NSF
*Part of this material is based upon work supported by the National Science Foundation under Grant IIS-1111182
“SoCS: Collaborative Research: Social Media Enhanced Organizational Sensemaking in Emergency Response.”
59. Ohio Center of Excellence in Knowledge-Enabled Computing
Thank You! & Questions?