There are billions of human interactions on social media every day. How do you tune in to extract value from disparate and unstructured pieces of social data?
DataSift was built to solve this very problem. The platform unifies social, news, and blog data in a single place, and our unique coding language helps you find meaning in that data faster.
Learn about:
- The basics of the DataSift Platform
- What types of data you can access and query on our platform (hint: there are LOTS!)
- How you can enrich social data with DataSift in terms of context, sentiment, demographics and link content
- Our magic ingredients, CSDL and VEDO, which revolutionize social data analysis
2. HUMAN DATA INTELLIGENCE
FILTER • TAG • ENRICH • STORE
Stream products will be covered today
To see PYLON (our aggregated, anonymized Facebook topic data), join our next live demo:
http://lp.datasift.com/20150701-Live-SE-Demo-Registration
DataSift is of Two Minds:
Indexed Data & Streaming
#DSwebinar
VEDO
3. Launched: 2011
Customers: 1K across 40 countries
Global offices: 4
• San Francisco
• New York
• London
• Reading, UK
2B items processed per day
(These don’t count toward the 5 things)
4. Brave New Data World
• … of all digital data created by consumers
• … emails a day
• … of US adults’ location is known
• … increase in global data by 2020
Thoughts • Emotions • Likes • Dislikes • Intentions • Ideas • Current Events • GEO • Occupation • Age • Topics • Gender
6. The Complexity of Human Data
VOLUME • VARIETY • VELOCITY
• Billions of users
• Noisy
• Generated in real time
• … per second
• Post vs blog vs like
• Terabytes per day
• Ambiguous
• Big spikes
• Unstructured
10. Unifying data from across the web
Facebook • Tencent Weibo • Sina Weibo • Google+ • YouTube • Instagram • LexisNexis • Wikipedia • Wordpress • Tumblr • Intense Debate • Disqus • NewsCred • Reddit • Topix • Jive • Twitter • EDGAR • News • Video • IMDB • Yammer
Human data is a particular challenge: not only is there a lot of it – but it’s complex, highly varied, and comes at you fast. It can also have
We bring in data from a ton of different places. You’ve probably heard of most of them. We’d be happy to go into more detail on any of them later if you’re curious, or you can find more on our website.
A Facebook post looks different from a Disqus comment, but you might want to search for your company or product anywhere. Because we’ve already normalized the data, you can write simple filters without worrying about each source’s format. You can write against both generic targets, like “the main body text contains android”, and more specific, nuanced targets, such as “the author’s account is at least 90 days old”.
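To make the idea of normalization concrete, here is a minimal Python sketch. It is not DataSift's implementation or its CSDL syntax: the raw payload shapes, the field names (`content`, `author_created`), and the helper functions are all hypothetical, invented only to show how one generic filter and one more specific filter can run over items from different sources once they share a schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical raw payloads: each source names its fields differently.
facebook_post = {
    "message": "Loving my new Android phone",
    "from": {"created": "2015-01-02T00:00:00+00:00"},
}
disqus_comment = {
    "raw_message": "iOS beats android any day",
    "author": {"joinedAt": "2015-06-20T00:00:00+00:00"},
}

def normalize_facebook(raw):
    """Map a (hypothetical) Facebook payload onto the common schema."""
    return {
        "source": "facebook",
        "content": raw["message"],
        "author_created": datetime.fromisoformat(raw["from"]["created"]),
    }

def normalize_disqus(raw):
    """Map a (hypothetical) Disqus payload onto the common schema."""
    return {
        "source": "disqus",
        "content": raw["raw_message"],
        "author_created": datetime.fromisoformat(raw["author"]["joinedAt"]),
    }

def content_contains(item, word):
    """Generic target: the main body text contains a word (case-insensitive)."""
    return word.lower() in item["content"].lower()

def author_account_at_least(item, days, now):
    """Specific target: the author's account is at least `days` old."""
    return now - item["author_created"] >= timedelta(days=days)

items = [normalize_facebook(facebook_post), normalize_disqus(disqus_comment)]
now = datetime(2015, 7, 1, tzinfo=timezone.utc)

# One filter, applied identically to every source.
mentions = [i["source"] for i in items if content_contains(i, "android")]
established = [
    i["source"]
    for i in items
    if content_contains(i, "android") and author_account_at_least(i, 90, now)
]
```

In this sketch both items match the generic text filter, while only the item whose author account is older than 90 days passes the combined filter. The filter code never needs to know which network an item came from.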
Once we have the data in a standardized format, we enrich it with additional useful information. Just as you can filter on the raw content and other source fields, you can filter on all the enhanced data we add.
“This is cool! http://bit.ly/AsdFa”
Shortened URLs and tracking URLs are incredibly common in social data. We not only traverse these redirects to their final destination, but also fetch the page header information and metadata and append it to the source object. This means you can filter not only on posts that contain “Android”, but also on posts whose links contain “Android” in the title, description, or keywords. We do this at line speed, across every social post, as it happens. This is an extremely powerful tool: so much of the social landscape is dominated by discussion of a shared link that, without the linked content, you can miss the conversation entirely.
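The link enrichment described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DataSift's pipeline: the network is replaced by two in-memory tables (`REDIRECTS` and `PAGES`), the destination URLs are made up, and the `links` field layout is hypothetical. The sketch follows a redirect chain to its end, reads the title, description, and keywords from the page head, and appends them to the post so a filter can match on them.

```python
import re
from html.parser import HTMLParser

# Stand-ins for the network (both invented): a redirect table and a page store.
REDIRECTS = {
    "http://bit.ly/AsdFa": "http://t.example/track?id=1",
    "http://t.example/track?id=1": "http://blog.example/android-review",
}
PAGES = {
    "http://blog.example/android-review": (
        "<html><head><title>Android phone review</title>"
        '<meta name="description" content="Hands-on with the latest handset">'
        '<meta name="keywords" content="android, review">'
        "</head><body></body></html>"
    ),
}

class HeadParser(HTMLParser):
    """Collects the <title> text and description/keywords meta tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") in ("description", "keywords"):
            self.meta[attrs["name"]] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data

def resolve(url, max_hops=10):
    """Follow the redirect chain to its final destination, with a hop limit."""
    hops = 0
    while url in REDIRECTS and hops < max_hops:
        url = REDIRECTS[url]
        hops += 1
    return url

def enrich_links(post):
    """Return a copy of the post with resolved link metadata appended."""
    links = []
    for url in re.findall(r"https?://\S+", post["content"]):
        final = resolve(url)
        parser = HeadParser()
        parser.feed(PAGES.get(final, ""))
        links.append({"url": final, **parser.meta})
    return {**post, "links": links}

def link_fields_contain(item, word):
    """True if any link's title, description, or keywords contains the word."""
    fields = ("title", "description", "keywords")
    return any(
        word.lower() in link.get(f, "").lower()
        for link in item["links"]
        for f in fields
    )

post = enrich_links({"content": "This is cool! http://bit.ly/AsdFa"})
```

The post text itself never mentions Android, so a filter on the body alone would miss it. After enrichment, `link_fields_contain(post, "android")` matches on the linked page's title and keywords.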