Trend Detection and Analysis
on Twitter
Benjamin Räthlein
Henning Muszynski
Lukas Masuch
2
Agenda
Motivation
Architecture
Data Preparation
Trend Analysis
Analyzed Trends
Conclusion
3
Motivation
Predicting the stock market in real time
Detecting influenza epidemics
Automatic crime prediction
“Successful results of mainly research-based projects
helped to open up new business opportunities”
4
Twitter
5
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Bag of Words
Bags Count
#newyear 7
#christmas 6
@bigdata 2
@sap 3
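A minimal sketch of the bag-of-words count shown above, assuming each tweet is available as a plain text string. The regular expression, the function name `count_bags` and the sample tweets are illustrative assumptions, not taken from the slides.

```python
import re
from collections import Counter

# Assumed pattern: a '#' or '@' followed by word characters.
TAG_PATTERN = re.compile(r"[#@]\w+")


def count_bags(tweets):
    """Count how often each hashtag or mention occurs across tweet texts."""
    counts = Counter()
    for text in tweets:
        # Lowercase so that #NewYear and #newyear fall into the same bag.
        counts.update(tag.lower() for tag in TAG_PATTERN.findall(text))
    return counts


# Made-up sample tweets for illustration only.
sample_tweets = [
    "Happy #NewYear to everyone at @SAP",
    "#newyear resolutions: learn more about @bigdata",
    "Still thinking about #christmas dinner",
]
bags = count_bags(sample_tweets)
for tag, count in bags.most_common():
    print(tag, count)
```

A `Counter` keeps the bags in one dictionary, so hashtags and mentions can be ranked together or filtered by their first character.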
6
Statistical Measurement
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Statistical Measurement
(growth, average usage, retweets, participating users…)
Report statistics (every 20 minutes):
• Total hashtags & user mentions
• Hashtag/mentions count
• Usage growth per hashtag/mention
• Participating users per hashtag/mention
• Retweet count per hashtag/mention
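The report fields listed above could be computed per 20-minute window roughly as follows. This is a sketch under assumptions: the `(user, tags, is_retweet)` record layout, the function name `build_report` and the definition of growth as relative change against the previous window are all hypothetical.

```python
from collections import Counter, defaultdict


def build_report(window, previous_counts=None):
    """Summarize one reporting window.

    `window` is a list of (user, tags, is_retweet) tuples, a simplified
    and hypothetical record for one tweet.
    """
    previous_counts = previous_counts or {}
    counts = Counter()
    retweets = Counter()
    users = defaultdict(set)
    for user, tags, is_retweet in window:
        for tag in tags:
            counts[tag] += 1
            users[tag].add(user)
            if is_retweet:
                retweets[tag] += 1
    growth = {}
    for tag, count in counts.items():
        before = previous_counts.get(tag, 0)
        # Relative growth is undefined for a tag absent from the previous window.
        growth[tag] = (count - before) / before if before else None
    return {
        "total": sum(counts.values()),
        "counts": dict(counts),
        "growth": growth,
        "users": {tag: len(names) for tag, names in users.items()},
        "retweets": dict(retweets),
    }


# Made-up window of three tweets.
window = [
    ("alice", ["#newyear"], False),
    ("bob", ["#newyear", "@sap"], True),
    ("alice", ["#newyear"], False),
]
report = build_report(window, previous_counts={"#newyear": 2})
print(report)
```

Counting users through a set means a single account posting the same hashtag many times raises the usage count but not the number of participating users.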
7
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Twitter Streaming API (Twython)
Architecture
Statistical Measurement
(growth, average usage, retweets, participating users…)
Anomaly Detection
Time Series Analysis
Calculated for every hashtag / user mention
Every 2 / 4 hours based on reports
Anomaly detection using:
• Relative & absolute fluctuation
• Total occurrences (sum)
• Minimum occurrences
• Maximum occurrences
• Average occurrences
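One way to turn these measures into an anomaly check is sketched below. The slides do not define the fluctuation measures or any thresholds, so the definitions here (latest report compared with the average of the earlier reports) and the threshold values are assumptions chosen for illustration.

```python
from statistics import mean


def summarize(series):
    """Summary statistics for one tag's counts across consecutive reports.

    Expects at least two reports.
    """
    baseline = mean(series[:-1])      # average of all but the latest report
    absolute = series[-1] - baseline  # absolute fluctuation (assumed definition)
    relative = absolute / baseline if baseline else float("inf")
    return {
        "sum": sum(series),
        "min": min(series),
        "max": max(series),
        "avg": mean(series),
        "absolute": absolute,
        "relative": relative,
    }


def is_anomaly(series, min_absolute=20, min_relative=2.0):
    """Flag a tag when both fluctuation measures exceed hypothetical thresholds."""
    stats = summarize(series)
    return stats["absolute"] >= min_absolute and stats["relative"] >= min_relative


# Six 20-minute reports cover a 2-hour window; the numbers are made up.
quiet = [10, 12, 11, 9, 10, 12]
spike = [10, 12, 11, 9, 10, 95]
print(is_anomaly(quiet), is_anomaly(spike))
```

Requiring both an absolute and a relative threshold keeps rarely used tags (large relative jumps from tiny counts) and permanently popular tags (large absolute counts with little change) from being flagged.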
Time Series Analysis
8
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
Lowercasing & tokenizing
URL & stopword removal
Stop Word Removal
This sample text shows which words will
be removed when applying stop word
removal. Mostly words like the, a or and.
This sample text shows which words will
be removed when applying stop word
removal. Mostly words like the, a or and.
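The preprocessing steps named on this slide (lowercasing, tokenizing, URL removal, stop word removal) can be sketched with the standard library alone. The deck uses NLTK for this; the tiny stop word set and the regular expressions below are illustrative assumptions, not NLTK's actual list or tokenizer.

```python
import re

# Tiny illustrative stop word set (an assumption); real lists are much longer.
STOP_WORDS = {"a", "an", "and", "be", "or", "the", "this", "when", "which", "will"}
URL_PATTERN = re.compile(r"https?://\S+")
# Keep a leading '#' or '@' so hashtags and mentions survive tokenizing.
TOKEN_PATTERN = re.compile(r"[#@]?\w+")


def preprocess(text):
    """Lowercase, strip URLs, tokenize, and drop stop words."""
    text = URL_PATTERN.sub(" ", text.lower())
    tokens = TOKEN_PATTERN.findall(text)
    return [token for token in tokens if token not in STOP_WORDS]


tokens = preprocess(
    "This sample text shows which words will be removed http://example.com"
)
print(tokens)
```

Removing URLs before tokenizing avoids leaving fragments such as `http` or `com` in the token list.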
9
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Stemming
Amazing
Amazement
Amazed
amaze
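To illustrate the idea of mapping word variants to one shared stem, here is a deliberately tiny suffix-stripping function. It is not the stemmer used in the project (the deck relies on NLTK), and the suffix list is a hypothetical rule set. Note that rule-based stemmers typically yield a truncated stem rather than a dictionary word like the "amaze" shown on the slide.

```python
# Suffixes are tried in order, longest first; a hypothetical, minimal rule set.
SUFFIXES = ("ement", "ing", "ed", "e", "s")


def toy_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least `min_stem` characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


stems = {word: toy_stem(word) for word in ["Amazing", "Amazement", "Amazed", "amaze"]}
print(stems)
```

All four variants collapse to the same stem, which is what allows later counting and topic modeling to treat them as one term.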
10
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Sentiment Analysis
I love cookies
I hate cookies
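The slides do not say which sentiment method was used. As a rough illustration of the positive/neutral/negative split, the following sketch scores a text against two small word lists; the lexicon entries and the function name `classify` are assumptions for demonstration only.

```python
# Hypothetical mini-lexicons; a real system would use a far larger resource
# or a trained classifier.
POSITIVE = {"love", "great", "amazing", "happy"}
NEGATIVE = {"hate", "horrible", "terrible", "stupid"}


def classify(text):
    """Return 'positive', 'negative' or 'neutral' from simple word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in ["I love cookies", "I hate cookies", "I eat cookies"]:
    print(sentence, "->", classify(sentence))
```

A word-list approach like this ignores negation and sarcasm, which is one reason tweet sentiment is usually handled with trained models.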
11
Twitter Streaming API (Twython)
Architecture
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Topic Modeling (LDA)
Topic Modeling
Topics
• …
• …
• …
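Topic modeling with LDA can be sketched with a small collapsed Gibbs sampler. This is a generic textbook-style sampler written with the standard library, not the implementation used in the project; the hyperparameters `alpha` and `beta`, the iteration count and the toy documents are assumptions.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []
    # Start with a random topic for every token.
    for d, doc in enumerate(docs):
        topics = []
        for word in doc:
            k = rng.randrange(num_topics)
            topics.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(topics)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                k = assignments[d][i]
                # Take the token out of the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


# Made-up, already preprocessed toy documents.
docs = [
    ["airasia", "missing", "flight", "indonesia"],
    ["airasia", "flight", "missing", "plane"],
    ["prayers", "families", "thoughts", "airasia"],
    ["prayers", "thoughts", "families", "crash"],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
top_words = [[word for word, _ in counts.most_common(3)] for counts in topic_word]
print(top_words)
```

The most frequent words per topic are what a human then labels by hand, in the way the per-trend topic slides pair a short title with a word list.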
Trend Classification
14
Trend Analyzer
Text Preprocessing (Python NLTK)
URL & stopword removal
Lowercasing & tokenizing
Word stemming
Sentiment Analysis
Topic Modeling (LDA)
Wordcloud Visualization
Wordfreq.js
Wordcloud2.js
GeoSpatial Visualization
CartoDB
Early Trend Detector
Bag-of-words (Hashtags, Mentions)
Anomaly Detection
Statistical Measurement
(growth, average usage, retweets, participating users…)
Time Series Analysis
Trend Classification
Twitter Streaming API (Twython)
Architecture
15
Analyzed Trends
16
Limitations
Tweets collected: 38 million (70GB)
Only English tweets from the USA
Twitter Streaming API
17
New Year
Time Series
18
New Year
Word Cloud
19
New Year
Geospatial Analysis
Midnight Los Angeles Midnight New York
20
New Year
Sentiment Analysis
Negative: Home sick on #nye. Horrible timing stupid cold. Ugh. My date is my couch & pillow watching. #HappyNewYear everyone.
Neutral: #HappyNewYear from the Youth for Astronomy and Engineering Program at Space Telescope Science Institute!
Positive: Happy New Year! Last year was amazing, and here’s to another great year of love & happiness! #NYE2015
21
Air Asia Tragedy
22
Air Asia Tragedy
Time Series
23
Air Asia Tragedy
Word Cloud
24
Air Asia Tragedy
Topic Modeling
News
airasia, missing, flight, air,
Indonesia, singapore, asia
Search for the Plane
airasia, missing, plane, find,
plane, world, technology
Sympathy
Prayers, families, thoughts,
airasia, crash, thought, airfrance
Cause
airasia, weather, flight,
pilots, fly, bad, path
International Help
raaf, butterworth, china, australia,
Russia, trndnl, trending
25
Air Asia Tragedy
Sentiment Analysis
Negative: Prayers are USELESS! Stop repeating meaningless crap, pretending that you care … #PrayForAirAsia #QZ8501 #GrowABrain #ReligousNonsense
Neutral: #BREAKING #AirAsia Flight #8501 likely “at the bottom of the sea” rescue officials says.
Positive: May God’s great love shine on the families and loved ones of all passengers and crew #AirAsia #8501
26
Air Asia Tragedy
Google Trends Comparison
Google Trends Twitter Sample
27
Air Asia Tragedy
Google Trends Comparison
Google Trends Twitter Sample
28
Sony Hack
29
Sony Hack
Time Series
30
Sony Hack
Word Cloud
31
Sony Hack
Topic Modeling
Christmas Release
theinterview, christmas, day,
theaters, freedom, theater, showing
Reviews
theinterview, jamesfrancotv, sethrogen,
movie, interview, funny, hilarious
Suspicions
northkorea, sonyhack, korea,
north, internet, sony, amp
News
theinterview, sonypictures, sony,
movie, korea, north, interview
Insider Joke
theinterview, aint, hate, cuz,
jealous, anus, peanutbutter
32
Sony Hack
Geospatial Analysis
33
Sony Hack
Sentiment Analysis
Negative: #TheInterview SUCKS!!! @sethrogen Like I knew it would #Stupid #NotFunny
Neutral: #Sony says #TheInterview made more than $1 million at the box office on in 1 single day on Dec. 25.
Positive: Happy I joined my fellow Americans in the great #TheInterview Christmas Day Viewing. Plus it was pretty funny, truth be told.
34
Network Outage
35
Network Outage
Time Series
36
Network Outage
Word Cloud
37
Network Outage
Topic Modeling
Network Error
xbox, psn, sign, connect,
live, error, account, issues
Connection between Hacks
xbox, playstation, watch, movie,
fuckcrucifix, north, korea, interview
Xbox Down
xbox, christmas, play, xboxlivedown,
live, xboxlive, xboxsupport, day
Caused Damage
playstation, dollar, psn, company,
lizardsquad, sony, billion, multi
Hacker Group
fuckcrucifix, lizardmafia, lizardsquad,
fuck, lizard, squad, finestsquad, stop
Restored
psn, back, playstation, online,
askplaystation, network, psndown, working
38
Network Outage
Sentiment Analysis
Negative: @XboxSupport f*** your servers, a big ass company like you should handle these teenage kids, terrible
Neutral: @AskPlayStation when will the service be back online because it says there’s maintenance?
Positive: @PlayStation thanks for the great year. I am sure this new year will be amazing. Don’t allow yourselves to be hacked ever again.
39
Conclusion
High-quality insights into the world’s interests
Twitter is well suited to detecting and predicting trends
Maintaining high data quality is important
40
#Questions
Benjamin Räthlein
@B3nRa
Henning Muszynski
@henningmus
Lukas Masuch
@LukasMasuch
