Trend Detection and Analysis
on Twitter
Benjamin Räthlein
Henning Muszynski
Lukas Masuch
2
Agenda
Motivation
Architecture
Data Preparation
Trend Analysis
Analyzed Trends
Conclusion
3
Motivation
Predict the stock market in real time
Detecting influenza epidemics
Automatic crime prediction
“Successful results of mainly research-based projects
helped to open up new business opportunities”
4
Twitter
5
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Bag of Words
Bag          Count
#newyear     7
#christmas   6
@bigdata     2
@sap         3
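The bag-of-words step above could be sketched as follows. This is a minimal illustration using only the standard library, not the authors' code; the regular expression, the function name `count_bags`, and the sample tweets are assumptions made for the example.

```python
import re
from collections import Counter

# Hashtags and user mentions: '#' or '@' followed by word characters.
TAG_PATTERN = re.compile(r"[#@]\w+")


def count_bags(tweets):
    """Count hashtags and mentions (case-insensitive) across tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in TAG_PATTERN.findall(text))
    return counts


# Made-up sample tweets, for illustration only.
sample_tweets = [
    "Happy #NewYear to everyone at @SAP!",
    "#newyear resolutions: learn more about @bigdata",
    "Still thinking about #christmas dinner",
]
bag_counts = count_bags(sample_tweets)
print(bag_counts.most_common())
```

Lowercasing before counting merges variants such as #NewYear and #newyear into one bag; whether the original system did this is not stated on the slide.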
6
Statistical Measurement
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Statistical Measurement
(growth, average usage, retweets, participating users…)
Report statistics (every 20 minutes):
• Total hashtags & user mentions
• Hashtag/mentions count
• Usage growth per hashtag/mention
• Participating users per hashtag/mention
• Retweet count per hashtag/mention
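One way to picture the per-window report is the sketch below. The tweet dictionary keys (`user`, `tags`, `is_retweet`), the function names, and the sample data are hypothetical; the slides list the statistics but not how they were computed.

```python
from collections import defaultdict


def build_report(tweets):
    """Aggregate one reporting window (for example 20 minutes) of tweets.

    Each tweet is a dict with hypothetical keys:
    'user' (str), 'tags' (list of hashtags/mentions), 'is_retweet' (bool).
    """
    report = defaultdict(lambda: {"count": 0, "users": set(), "retweets": 0})
    for tweet in tweets:
        for tag in tweet["tags"]:
            entry = report[tag]
            entry["count"] += 1
            entry["users"].add(tweet["user"])
            entry["retweets"] += int(tweet["is_retweet"])
    return dict(report)


def usage_growth(previous, current, tag):
    """Relative growth of a tag's count between two windows (None without a baseline)."""
    before = previous.get(tag, {"count": 0})["count"]
    after = current.get(tag, {"count": 0})["count"]
    if before == 0:
        return None
    return (after - before) / before


# Made-up windows, for illustration only.
previous_window = [
    {"user": "a", "tags": ["#newyear"], "is_retweet": False},
    {"user": "b", "tags": ["#newyear", "@sap"], "is_retweet": True},
]
current_window = [
    {"user": "a", "tags": ["#newyear"], "is_retweet": False},
    {"user": "b", "tags": ["#newyear"], "is_retweet": True},
    {"user": "c", "tags": ["#newyear"], "is_retweet": True},
    {"user": "a", "tags": ["#newyear"], "is_retweet": False},
]
previous_report = build_report(previous_window)
current_report = build_report(current_window)
total_tags = sum(entry["count"] for entry in current_report.values())
growth = usage_growth(previous_report, current_report, "#newyear")
print(total_tags, growth, len(current_report["#newyear"]["users"]))
```

Storing users in a set keeps the participating-user count distinct from the raw usage count, so one very active account does not look like broad participation.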
7
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Statistical Measurement
(growth, average usage, retweets, participating users…)
Anomaly Detection
Time Series Analysis
Calculated for every hashtag / user mention
Every 2 / 4 hours based on reports
Anomaly detection using:
• Relative & absolute fluctuation
• Total occurrences (sum)
• Minimum occurrences
• Maximum occurrences
• Average occurrences
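The listed measures can be sketched for a single hashtag as below. The slides do not give the decision rule or any thresholds, so `is_anomaly` and its default values are illustrative assumptions, as are the function names and the sample series.

```python
from statistics import mean


def summarize(counts):
    """Summary statistics over per-report counts for one hashtag or mention."""
    return {
        "total": sum(counts),
        "minimum": min(counts),
        "maximum": max(counts),
        "average": mean(counts),
    }


def fluctuations(counts):
    """Absolute and relative change of the latest report versus the earlier average.

    Expects at least two reports in the series.
    """
    baseline = mean(counts[:-1])
    absolute = counts[-1] - baseline
    relative = absolute / baseline if baseline else float("inf")
    return absolute, relative


def is_anomaly(counts, min_total=50, min_relative=2.0):
    """Flag a series whose latest value jumps well above its baseline.

    Both thresholds are illustrative assumptions, not values from the slides.
    """
    stats = summarize(counts)
    _, relative = fluctuations(counts)
    return stats["total"] >= min_total and relative >= min_relative


# Six 20-minute reports cover a 2-hour window; the numbers are made up.
spiking = [4, 6, 5, 5, 5, 60]
steady = [5, 5, 5, 5, 5, 5]
print(summarize(spiking), fluctuations(spiking), is_anomaly(spiking), is_anomaly(steady))
```

Combining a relative threshold with a minimum total is one way to avoid flagging rare tags that jump from one use to three; the original system may have weighted the measures differently.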
Time Series Analysis
8
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
Lowercasing & tokenizing
URL & stopword removal
Stop Word Removal
This sample text shows which words will
be removed when applying stop word
removal. Mostly words like the, a or and.
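The preprocessing steps named on this slide (lowercasing, tokenizing, URL and stop word removal) might look like the sketch below. The deck names Python NLTK; this example uses only the standard library with a deliberately tiny stop word list, so the patterns, the list, and the function name `preprocess` are assumptions.

```python
import re

# Small illustrative stop word list; NLTK ships a much larger English list.
STOP_WORDS = {"a", "an", "and", "be", "or", "the", "this", "when", "which", "will"}
URL_PATTERN = re.compile(r"https?://\S+")
# Keep a leading '#' or '@' so hashtags and mentions survive tokenizing.
TOKEN_PATTERN = re.compile(r"[#@]?\w+")


def preprocess(text):
    """Lowercase, drop URLs, tokenize, and remove stop words."""
    text = URL_PATTERN.sub(" ", text.lower())
    tokens = TOKEN_PATTERN.findall(text)
    return [token for token in tokens if token not in STOP_WORDS]


cleaned = preprocess("This sample text shows which words will be removed: http://example.com")
print(cleaned)  # ['sample', 'text', 'shows', 'words', 'removed']
```

URLs are removed before tokenizing so that fragments such as "http" or "com" do not end up as tokens.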
9
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Stemming
Amazing, Amazement, Amazed → amaze
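The idea of stemming can be shown with a toy suffix stripper. This is not the stemmer used in the deck (which names Python NLTK) and is far simpler than an established algorithm such as Porter's; the suffix list and the name `crude_stem` are assumptions. Note that this sketch maps the words to the shared stem "amaz", while the slide shows "amaze"; the exact stem depends on the stemmer.

```python
# Suffixes are tried in this order, longest first; a toy rule set for illustration.
SUFFIXES = ("ement", "ing", "ed", "e")


def crude_stem(word, min_stem_length=3):
    """Strip the first matching suffix, keeping at least min_stem_length characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_length:
            return word[: -len(suffix)]
    return word


stems = {word: crude_stem(word) for word in ["Amazing", "Amazement", "Amazed", "amaze"]}
print(stems)
```

What matters for the later counting and topic modeling steps is that all variants collapse to the same token, not that the stem is a dictionary word.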
10
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Sentiment Analysis
I love cookies
I hate cookies
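A very small lexicon-based classifier illustrates the positive/negative distinction in the two example sentences. The slides do not say which sentiment method was used, so the word lists, the scoring rule, and the name `classify_sentiment` are assumptions for illustration only.

```python
import re

# Tiny hand-picked lexicons; a real system would use a much larger resource or a trained model.
POSITIVE_WORDS = {"love", "great", "amazing", "happy"}
NEGATIVE_WORDS = {"hate", "horrible", "terrible", "stupid"}


def classify_sentiment(text):
    """Label text by counting positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(token in POSITIVE_WORDS for token in tokens)
    score -= sum(token in NEGATIVE_WORDS for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in ["I love cookies", "I hate cookies", "I eat cookies"]:
    print(sentence, "->", classify_sentiment(sentence))
```

A word-counting approach like this ignores negation and sarcasm, which is one reason tweet sentiment is usually handled with trained classifiers.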
11
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Topic Modeling (LDA)
Topic Modeling
Topics
• …
• …
• …
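To make the LDA step more concrete, here is a compact collapsed Gibbs sampler written with the standard library only. It is a teaching sketch, not the implementation behind the slides: the hyperparameters `alpha` and `beta`, the iteration count, the function names, and the sample documents are all assumptions, and a production system would more likely rely on an existing library.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns (topic-word counts, vocabulary)."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * len(vocab) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assign.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assign)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                w, k = word_id[word], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * len(vocab))
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most frequent words of every topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Made-up, already preprocessed documents for illustration only.
docs = [
    ["flight", "missing", "plane", "search"],
    ["plane", "search", "missing", "flight"],
    ["prayers", "families", "thoughts", "prayers"],
    ["families", "thoughts", "prayers", "families"],
]
topic_word, vocab = lda_gibbs(docs, num_topics=2)
print(top_words(topic_word, vocab))
```

The top words per topic are what a slide like "Topic Modeling" would display; the topic names themselves have to be assigned by a person reading those word lists.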
Trend Classification
14
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Topic Modeling (LDA)
Wordcloud Visualization
Wordfreq.js
Wordcloud2.js
GeoSpatial Visualization
CartoDB
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Anomaly Detection
Statistical Measurement
(growth, average usage, retweets, participating users…)
Time Series Analysis
Trend Classification
Twitter Streaming API (Twython)
Architecture
15
Analyzed Trends
16
Limitations
Tweets collected: 38 million (70GB)
Only English tweets from the USA
Twitter Streaming API
17
New Year
Time Series
18
New Year
Word Cloud
19
New Year
Geospatial Analysis
Midnight Los Angeles
Midnight New York
20
New Year
Sentiment Analysis
Negative: Home sick on #nye. Horrible timing stupid cold. Ugh. My date is my couch & pillow watching. #HappyNewYear everyone.
Neutral: #HappyNewYear from the Youth for Astronomy and Engineering Program at Space Telescope Science Institute!
Positive: Happy New Year! Last year was amazing, and here’s to another great year of love & happiness! #NYE2015
21
Air Asia Tragedy
22
Air Asia Tragedy
Time Series
23
Air Asia Tragedy
Word Cloud
24
Air Asia Tragedy
Topic Modeling
News: airasia, missing, flight, air, Indonesia, singapore, asia
Search for the Plane: airasia, missing, plane, find, plane, world, technology
Sympathy: Prayers, families, thoughts, airasia, crash, thought, airfrance
Cause: airasia, weather, flight, pilots, fly, bad, path
International Help: raaf, butterworth, china, australia, Russia, trndnl, trending
25
Air Asia Tragedy
Sentiment Analysis
Negative: Prayers are USELESS! Stop repeating meaningless crap, pretending that you care … #PrayForAirAsia #QZ8501 #GrowABrain #ReligousNonsense
Neutral: #BREAKING #AirAsia Flight #8501 likely “at the bottom of the sea” rescue officials says.
Positive: May God’s great love shine on the families and loved ones of all passengers and crew #AirAsia #8501
26
Air Asia Tragedy
Google Trends Comparison
Google Trends
Twitter Sample
27
Air Asia Tragedy
Google Trends Comparison
Google Trends
Twitter Sample
28
Sony Hack
29
Sony Hack
Time Series
30
Sony Hack
Word Cloud
31
Sony Hack
Topic Modeling
Christmas Release: theinterview, christmas, day, theaters, freedom, theater, showing
Reviews: theinterview, jamesfrancotv, sethrogen, movie, interview, funny, hilarious
Suspicions: northkorea, sonyhack, korea, north, internet, sony, amp
News: theinterview, sonypictures, sony, movie, korea, north, interview
Insider Joke: theinterview, aint, hate, cuz, jealous, anus, peanutbutter
32
Sony Hack
Geospatial Analysis
33
Sony Hack
Sentiment Analysis
Negative: #TheInterview SUCKS!!! @sethrogen Like I knew it would #Stupid #NotFunny
Neutral: #Sony says #TheInterview made more than $1 million at the box office on in 1 single day on Dec. 25.
Positive: Happy I joined my fellow Americans in the great #TheInterview Christmas Day Viewing. Plus it was pretty funny, truth be told.
34
Network Outage
35
Network Outage
Time Series
36
Network Outage
Word Cloud
37
Network Outage
Topic Modeling
Network Error: xbox, psn, sign, connect, live, error, account, issues
Connection between Hacks: xbox, playstation, watch, movie, fuckcrucifix, north, korea, interview
Xbox Down: xbox, christmas, play, xboxlivedown, live, xboxlive, xboxsupport, day
Caused Damage: playstation, dollar, psn, company, lizardsquad, sony, billion, multi
Hacker Group: fuckcrucifix, lizardmafia, lizardsquad, fuck, lizard, squad, finestsquad, stop
Restored: psn, back, playstation, online, askplaystation, network, psndown, working
38
Network Outage
Sentiment Analysis
Negative: @XboxSupport f*** your servers, a big ass company like you should handle these teenage kids, terrible
Neutral: @AskPlayStation when will the service be back online because it says there’s maintenance?
Positive: @PlayStation thanks for the great year. I am sure this new year will be amazing. Don’t allow yourselves to be hacked ever again.
39
Conclusion
High-quality insights into the world’s interests
Twitter is very good for detecting and predicting trends
Maintaining a high data quality is important
40
#Questions
Benjamin Räthlein
@B3nRa
Henning Muszynski
@henningmus
Lukas Masuch
@LukasMasuch

Weitere ähnliche Inhalte

Was ist angesagt?

Design Exercise_Mobile celebrity search experience on Bing Mobile
Design Exercise_Mobile celebrity search experience on Bing MobileDesign Exercise_Mobile celebrity search experience on Bing Mobile
Design Exercise_Mobile celebrity search experience on Bing MobileYian Lu
 
Digital Summit Denver: Reusable Content
Digital Summit Denver: Reusable ContentDigital Summit Denver: Reusable Content
Digital Summit Denver: Reusable ContentAshley Segura
 
How to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress MeetupHow to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress MeetupAshley Segura
 
Increase Your Conversions Using Reusable Content - SEMRush - Ashley Ward
Increase Your Conversions Using Reusable Content - SEMRush - Ashley WardIncrease Your Conversions Using Reusable Content - SEMRush - Ashley Ward
Increase Your Conversions Using Reusable Content - SEMRush - Ashley WardState of Search Conference
 
Trust, Elections and Twitter (fscons 2017)
Trust, Elections and Twitter (fscons 2017)Trust, Elections and Twitter (fscons 2017)
Trust, Elections and Twitter (fscons 2017)Patricia Aas
 

Was ist angesagt? (6)

Design Exercise_Mobile celebrity search experience on Bing Mobile
Design Exercise_Mobile celebrity search experience on Bing MobileDesign Exercise_Mobile celebrity search experience on Bing Mobile
Design Exercise_Mobile celebrity search experience on Bing Mobile
 
Digital Summit Denver: Reusable Content
Digital Summit Denver: Reusable ContentDigital Summit Denver: Reusable Content
Digital Summit Denver: Reusable Content
 
#Love isinmyblood twitter report
#Love isinmyblood   twitter report#Love isinmyblood   twitter report
#Love isinmyblood twitter report
 
How to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress MeetupHow to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
 
Increase Your Conversions Using Reusable Content - SEMRush - Ashley Ward
Increase Your Conversions Using Reusable Content - SEMRush - Ashley WardIncrease Your Conversions Using Reusable Content - SEMRush - Ashley Ward
Increase Your Conversions Using Reusable Content - SEMRush - Ashley Ward
 
Trust, Elections and Twitter (fscons 2017)
Trust, Elections and Twitter (fscons 2017)Trust, Elections and Twitter (fscons 2017)
Trust, Elections and Twitter (fscons 2017)
 

Andere mochten auch

Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...
Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...
Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...WhereCampBerlin
 
Complete list of publications december 2015 Sten Rasmussen
Complete list of publications december 2015 Sten RasmussenComplete list of publications december 2015 Sten Rasmussen
Complete list of publications december 2015 Sten RasmussenSten Rasmussen
 
Ф.Достоєвський. Життя і творчість
Ф.Достоєвський. Життя і творчість Ф.Достоєвський. Життя і творчість
Ф.Достоєвський. Життя і творчість dfktynbyf15
 
African relgion, trade, and culture station
African relgion, trade, and culture stationAfrican relgion, trade, and culture station
African relgion, trade, and culture stationClaire James
 
Загальна харатеристика літератури та культури 19 ст.
Загальна харатеристика літератури та культури 19 ст. Загальна харатеристика літератури та культури 19 ст.
Загальна харатеристика літератури та культури 19 ст. dfktynbyf15
 
Оноре де Бальзак
Оноре де БальзакОноре де Бальзак
Оноре де Бальзакdfktynbyf15
 
8 клас Біблія, Веди, Коран як літературні пам'ятки
8 клас Біблія, Веди, Коран як літературні пам'ятки8 клас Біблія, Веди, Коран як літературні пам'ятки
8 клас Біблія, Веди, Коран як літературні пам'яткиdfktynbyf15
 
Scientific Revolution and the Scientists
Scientific Revolution and the ScientistsScientific Revolution and the Scientists
Scientific Revolution and the ScientistsClaire James
 
н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)
н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)
н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)Snezhana Pshenichnaya
 
Migrate BI to APEX 5: Are We There Yet?
Migrate BI to APEX 5: Are We There Yet?Migrate BI to APEX 5: Are We There Yet?
Migrate BI to APEX 5: Are We There Yet?Karen Cannell
 
How Machine Learning Works for Business
How Machine Learning Works for BusinessHow Machine Learning Works for Business
How Machine Learning Works for Business10x Nation
 
О.Грін. Біографія
О.Грін. Біографія О.Грін. Біографія
О.Грін. Біографія Adriana Himinets
 
презентація навчальний проект
презентація навчальний проектпрезентація навчальний проект
презентація навчальний проектanna1691
 
петербург достоевского полная
петербург достоевского полнаяпетербург достоевского полная
петербург достоевского полнаяSnezhanaP10
 

Andere mochten auch (17)

Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...
Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...
Wherecamp Navigation Conference 2015 - DB AG OSM Pilot Railway Station Indoor...
 
Complete list of publications december 2015 Sten Rasmussen
Complete list of publications december 2015 Sten RasmussenComplete list of publications december 2015 Sten Rasmussen
Complete list of publications december 2015 Sten Rasmussen
 
Ф.Достоєвський. Життя і творчість
Ф.Достоєвський. Життя і творчість Ф.Достоєвський. Життя і творчість
Ф.Достоєвський. Життя і творчість
 
African relgion, trade, and culture station
African relgion, trade, and culture stationAfrican relgion, trade, and culture station
African relgion, trade, and culture station
 
Загальна харатеристика літератури та культури 19 ст.
Загальна харатеристика літератури та культури 19 ст. Загальна харатеристика літератури та культури 19 ст.
Загальна харатеристика літератури та культури 19 ст.
 
Оноре де Бальзак
Оноре де БальзакОноре де Бальзак
Оноре де Бальзак
 
8 клас Біблія, Веди, Коран як літературні пам'ятки
8 клас Біблія, Веди, Коран як літературні пам'ятки8 клас Біблія, Веди, Коран як літературні пам'ятки
8 клас Біблія, Веди, Коран як літературні пам'ятки
 
Scientific Revolution and the Scientists
Scientific Revolution and the ScientistsScientific Revolution and the Scientists
Scientific Revolution and the Scientists
 
н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)
н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)
н и нн в суффиксах причастий и отглагольных прилагательных (урок в 10 классе)
 
Migrate BI to APEX 5: Are We There Yet?
Migrate BI to APEX 5: Are We There Yet?Migrate BI to APEX 5: Are We There Yet?
Migrate BI to APEX 5: Are We There Yet?
 
How Machine Learning Works for Business
How Machine Learning Works for BusinessHow Machine Learning Works for Business
How Machine Learning Works for Business
 
О.Грін. Біографія
О.Грін. Біографія О.Грін. Біографія
О.Грін. Біографія
 
Baker street 221 b
Baker street 221 bBaker street 221 b
Baker street 221 b
 
презентація навчальний проект
презентація навчальний проектпрезентація навчальний проект
презентація навчальний проект
 
петербург достоевского полная
петербург достоевского полнаяпетербург достоевского полная
петербург достоевского полная
 
герман тетяна іванівна
герман тетяна іванівнагерман тетяна іванівна
герман тетяна іванівна
 
Cambridge day (1)
Cambridge day (1)Cambridge day (1)
Cambridge day (1)
 

Ähnlich wie Twitter Trend Detection and Analysis

Trend detection and analysis on Twitter
Trend detection and analysis on TwitterTrend detection and analysis on Twitter
Trend detection and analysis on TwitterLukas Masuch
 
Final Presentation
Final PresentationFinal Presentation
Final PresentationLove Tyagi
 
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingData-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingAlex Pinto
 
Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...
Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...
Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...Alexandre Sieira
 
Sentiment analysis of Twitter data using python
Sentiment analysis of Twitter data using pythonSentiment analysis of Twitter data using python
Sentiment analysis of Twitter data using pythonHetu Bhavsar
 
Twitter Intelligent Sensor Agent
Twitter Intelligent Sensor AgentTwitter Intelligent Sensor Agent
Twitter Intelligent Sensor AgentIoannis Katakis
 
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook:  case study of BlogTOPredicting what gets ‘Likes’ on Facebook:  case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOToronto Metropolitan University
 
[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies台灣資料科學年會
 
Semantic Entity extraction from Sports Tweets
Semantic Entity extraction from Sports TweetsSemantic Entity extraction from Sports Tweets
Semantic Entity extraction from Sports Tweetsmitsmit
 
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Artificial Intelligence Institute at UofSC
 
Metodologia para el analisis de redes sociales
Metodologia para el analisis de redes socialesMetodologia para el analisis de redes sociales
Metodologia para el analisis de redes socialesMontse Fernández Crespo
 
Social Media Training at AED: Day 2
Social Media Training at AED: Day 2Social Media Training at AED: Day 2
Social Media Training at AED: Day 2Eric Schwartzman
 
Using Chaos to Disentangle an ISIS-Related Twitter Network
Using Chaos to Disentangle an ISIS-Related Twitter NetworkUsing Chaos to Disentangle an ISIS-Related Twitter Network
Using Chaos to Disentangle an ISIS-Related Twitter NetworkSteve Kramer
 
Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...SAIL_QU
 
Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...
Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...
Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...Felix Victor Münch
 
Floods of Twitter Data - StampedeCon 2016
Floods of Twitter Data - StampedeCon 2016Floods of Twitter Data - StampedeCon 2016
Floods of Twitter Data - StampedeCon 2016StampedeCon
 
Sentiment analysis of twitter using python
Sentiment analysis of twitter using pythonSentiment analysis of twitter using python
Sentiment analysis of twitter using pythonManan Gadhiya
 
Sentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and WhySentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and WhyDavide Feltoni Gurini
 

Ähnlich wie Twitter Trend Detection and Analysis (20)

Trend detection and analysis on Twitter
Trend detection and analysis on TwitterTrend detection and analysis on Twitter
Trend detection and analysis on Twitter
 
Final Presentation
Final PresentationFinal Presentation
Final Presentation
 
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and SharingData-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
 
Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...
Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...
Threat Intelligence Baseada em Dados: Métricas de Disseminação e Compartilham...
 
Sentiment analysis of Twitter data using python
Sentiment analysis of Twitter data using pythonSentiment analysis of Twitter data using python
Sentiment analysis of Twitter data using python
 
Twitter Intelligent Sensor Agent
Twitter Intelligent Sensor AgentTwitter Intelligent Sensor Agent
Twitter Intelligent Sensor Agent
 
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook:  case study of BlogTOPredicting what gets ‘Likes’ on Facebook:  case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
 
[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies[系列活動] 資料探勘速遊 - Session4 case-studies
[系列活動] 資料探勘速遊 - Session4 case-studies
 
Semantic Entity extraction from Sports Tweets
Semantic Entity extraction from Sports TweetsSemantic Entity extraction from Sports Tweets
Semantic Entity extraction from Sports Tweets
 
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
 
Metodologia para el analisis de redes sociales
Metodologia para el analisis de redes socialesMetodologia para el analisis de redes sociales
Metodologia para el analisis de redes sociales
 
Social Media Training at AED: Day 2
Social Media Training at AED: Day 2Social Media Training at AED: Day 2
Social Media Training at AED: Day 2
 
Geekend 1 04 10 m francis
Geekend 1 04 10 m francisGeekend 1 04 10 m francis
Geekend 1 04 10 m francis
 
Using Chaos to Disentangle an ISIS-Related Twitter Network
Using Chaos to Disentangle an ISIS-Related Twitter NetworkUsing Chaos to Disentangle an ISIS-Related Twitter Network
Using Chaos to Disentangle an ISIS-Related Twitter Network
 
Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...
 
Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...
Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...
Mining Influencers in the German Twittersphere – Mapping a Language-Based Fol...
 
Floods of Twitter Data - StampedeCon 2016
Floods of Twitter Data - StampedeCon 2016Floods of Twitter Data - StampedeCon 2016
Floods of Twitter Data - StampedeCon 2016
 
Sentiment analysis of twitter using python
Sentiment analysis of twitter using pythonSentiment analysis of twitter using python
Sentiment analysis of twitter using python
 
Sentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and WhySentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and Why
 
Broker Bots: Analyzing automated activity during High Impact Events on Twitter
Broker Bots: Analyzing automated activity during High Impact Events on TwitterBroker Bots: Analyzing automated activity during High Impact Events on Twitter
Broker Bots: Analyzing automated activity during High Impact Events on Twitter
 

Mehr von Henning Muszynski

Mehr von Henning Muszynski (8)

The ABC of Coded Style Guides
The ABC of Coded Style GuidesThe ABC of Coded Style Guides
The ABC of Coded Style Guides
 
From 0 to 100: How we jump-started our frontend testing
From 0 to 100: How we jump-started our frontend testingFrom 0 to 100: How we jump-started our frontend testing
From 0 to 100: How we jump-started our frontend testing
 
Alphabet from A to Z
Alphabet from A to ZAlphabet from A to Z
Alphabet from A to Z
 
Growth Hacking 101
Growth Hacking 101Growth Hacking 101
Growth Hacking 101
 
Context Discount
Context DiscountContext Discount
Context Discount
 
Roadtrip
RoadtripRoadtrip
Roadtrip
 
Spark X - Enterprise Crowdfunding
Spark X - Enterprise CrowdfundingSpark X - Enterprise Crowdfunding
Spark X - Enterprise Crowdfunding
 
Virtual Meeting Room
Virtual Meeting RoomVirtual Meeting Room
Virtual Meeting Room
 

Kürzlich hochgeladen

Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 

Kürzlich hochgeladen (20)

Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
20230202 - Introduction to tis-py

Twitter Trend Detection and Analysis

  • 1. Trend Detection and Analysis on Twitter. Benjamin Räthlein, Henning Muszynski, Lukas Masuch
  • 3. Motivation: Predict the stock market in real time; detecting influenza epidemics; automatic crime prediction. “Successful results of mainly research-based projects helped to open up new business opportunities”
  • 5. Architecture, Early Trend Detector: Twitter Streaming API (Twython); Bag-of-words (Hashtags, Mentions). Bag of Words example (bag: count): #newyear: 7, #christmas: 6, @bigdata: 2, @sap: 3
  • 6. Architecture, Early Trend Detector: Twitter Streaming API (Twython); Bag-of-words (Hashtags, Mentions); Statistical Measurement (growth, average usage, retweets, participating users…). Report statistics (every 20 minutes): total hashtags & user mentions; hashtag/mentions count; usage growth per hashtag/mention; participating users per hashtag/mention; retweet count per hashtag/mention
  • 7. Architecture, Early Trend Detector: Twitter Streaming API (Twython); Bag-of-words (Hashtags, Mentions); Statistical Measurement (growth, average usage, retweets, participating users…); Time Series Analysis; Anomaly Detection. Time Series Analysis: calculated for every hashtag / user mention, every 2 / 4 hours based on reports. Anomaly detection using: relative & absolute fluctuation; total occurrences (sum); minimum occurrences; maximum occurrences; average occurrences
  • 8. Architecture, Trend Analyzer: Twitter Streaming API (Twython); Text Preprocessing (Python NLTK): lowercasing & tokenizing, URL & stopword removal. Stop Word Removal example: “This sample text shows which words will be removed when applying stop word removal. Mostly words like the, a or and.”
  • 9. Architecture, Trend Analyzer: Twitter Streaming API (Twython); Text Preprocessing (Python NLTK): URL & stopword removal, lowercasing & tokenizing, word stemming. Stemming example: Amazing, Amazement, Amazed → amaze
  • 10. Architecture, Trend Analyzer: Twitter Streaming API (Twython); Text Preprocessing (Python NLTK): URL & stopword removal, lowercasing & tokenizing, word stemming; Sentiment Analysis. Sentiment Analysis examples: “I love cookies”, “I hate cookies”
  • 11. Architecture, Trend Analyzer: Twitter Streaming API (Twython); Text Preprocessing (Python NLTK): URL & stopword removal, lowercasing & tokenizing, word stemming; Sentiment Analysis; Topic Modeling (LDA); Trend Classification. Topic Modeling: Topics • … • … • …
  • 12. Architecture: Twitter Streaming API (Twython). Early Trend Detector: Bag-of-words (Hashtags, Mentions); Statistical Measurement (growth, average usage, retweets, participating users…); Time Series Analysis; Anomaly Detection. Trend Analyzer: Text Preprocessing (Python NLTK): URL & stopword removal, lowercasing & tokenizing, word stemming; Sentiment Analysis; Topic Modeling (LDA); Trend Classification; Wordcloud Visualization (Wordfreq.js, Wordcloud2.js); GeoSpatial Visualization (CartoDB)
  • 14. Limitations: Tweets collected: 38 million (70 GB); only English tweets from the USA; Twitter Streaming API
  • 17. New Year, Geospatial Analysis: Midnight Los Angeles; Midnight New York
  • 18. New Year, Sentiment Analysis. Negative: “Home sick on #nye. Horrible timing stupid cold. Ugh. My date is my couch & pillow watching. #HappyNewYear everyone.” Neutral: “#HappyNewYear from the Youth for Astronomy and Engineering Program at Space Telescope Science Institute!” Positive: “Happy New Year! Last year was amazing, and here’s to another great year of love & happiness! #NYE2015”
  • 22. Air Asia Tragedy, Topic Modeling. News: airasia, missing, flight, air, Indonesia, singapore, asia. Search for the Plane: airasia, missing, plane, find, plane, world, technology. Sympathy: Prayers, families, thoughts, airasia, crash, thought, airfrance. Cause: airasia, weather, flight, pilots, fly, bad, path. International Help: raaf, butterworth, china, australia, Russia, trndnl, trending
  • 23. Air Asia Tragedy, Sentiment Analysis. Negative: “Prayers are USELESS! Stop repeating meaningless crap, pretending that you care … #PrayForAirAsia #QZ8501 #GrowABrain #ReligousNonsense” Neutral: “#BREAKING #AirAsia Flight #8501 likely “at the bottom of the sea” rescue officials says.” Positive: “May God’s great love shine on the families and loved ones of all passengers and crew #AirAsia #8501”
  • 24. Air Asia Tragedy, Google Trends Comparison (panels: Google Trends, Twitter Sample)
  • 25. Air Asia Tragedy, Google Trends Comparison (panels: Google Trends, Twitter Sample)
  • 29. Sony Hack, Topic Modeling. Christmas Release: theinterview, christmas, day, theaters, freedom, theater, showing. Reviews: theinterview, jamesfrancotv, sethrogen, movie, interview, funny, hilarious. Suspicions: northkorea, sonyhack, korea, north, internet, sony, amp. News: theinterview, sonypictures, sony, movie, korea, north, interview. Insider Joke: theinterview, aint, hate, cuz, jealous, anus, peanutbutter
  • 31. Sony Hack, Sentiment Analysis. Negative: “#TheInterview SUCKS!!! @sethrogen Like I knew it would #Stupid #NotFunny” Neutral: “#Sony says #TheInterview made more than $1 million at the box office on in 1 single day on Dec. 25.” Positive: “Happy I joined my fellow Americans in the great #TheInterview Christmas Day Viewing. Plus it was pretty funny, truth be told.”
  • 35. Network Outage, Topic Modeling. Network Error: xbox, psn, sign, connect, live, error, account, issues. Connection between Hacks: xbox, playstation, watch, movie, fuckcrucifix, north, korea, interview. Xbox Down: xbox, christmas, play, xboxlivedown, live, xboxlive, xboxsupport, day. Caused Damage: playstation, dollar, psn, company, lizardsquad, sony, billion, multi. Hacker Group: fuckcrucifix, lizardmafia, lizardsquad, fuck, lizard, squad, finestsquad, stop. Restored: psn, back, playstation, online, askplaystation, network, psndown, working
  • 36. Network Outage, Sentiment Analysis. Negative: “@XboxSupport f*** your servers, a big ass company like you should handle these teenage kids, terrible” Neutral: “@AskPlayStation when will the service be back online because it says there’s maintenance?” Positive: “@PlayStation thanks for the great year. I am sure this new year will be amazing. Don’t allow yourselves to be hacked ever again.”
  • 37. Conclusion: High-quality insights into the world’s interests; Twitter is very good for detecting and predicting trends; maintaining high data quality is important
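
The Early Trend Detector steps in the architecture slides (bag-of-words counting of hashtags and mentions, then flagging unusual growth between reports) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `count_bags` and `flag_growth`, the regular expression, and the `min_count` and `min_growth` thresholds are all assumptions, and only the idea of comparing counts between periodic reports comes from the slides.

```python
import re
from collections import Counter

# Simplified pattern for hashtags and user mentions (an assumption; real
# tweets delivered by the Twitter API already carry parsed entities).
TAG_PATTERN = re.compile(r"[#@]\w+")


def count_bags(tweets):
    """Count lowercased hashtags and user mentions across tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in TAG_PATTERN.findall(text))
    return counts


def flag_growth(previous, current, min_count=5, min_growth=2.0):
    """Return tags whose count grew by at least `min_growth` times
    between two reports. Both thresholds are illustrative values."""
    flagged = {}
    for tag, now in current.items():
        if now < min_count:
            continue  # ignore rarely used tags
        before = previous.get(tag, 0)
        growth = now / before if before else float("inf")
        if growth >= min_growth:
            flagged[tag] = growth
    return flagged


# Two hypothetical reporting windows (the slides report every 20 minutes).
earlier = count_bags(["#newyear party with @sap", "#christmas leftovers"])
latest = count_bags(["Happy #NewYear!"] * 7 + ["#christmas again"])
print(flag_growth(earlier, latest))  # {'#newyear': 7.0}
```

A relative-growth threshold is only one of the measures the slides list; absolute fluctuation or minimum, maximum, and average occurrences per window could be added in the same loop.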