seevl: Data-driven music discovery
Alexandre Passant, co-founder, CEO, MDG Web ltd
http://seevl.net // @seevl // alex@seevl.net // @terraces

LA SemWeb & WebSpeed Meet-up, 2 October 2012
Cross Campus, Santa Monica
a bit of background...
• Knowledge Engineering
• Social Web & Enterprise 2.0
• Sensor Networks & Real-Time
architecture
[Figure: example RDF graph. :alex is linked by foaf:topic_interest to dbpedia:Beastie_Boys. The other nodes are dbpedia:Bad_Brains, dbpedia:Hardcore_Punk, dbpedia:Black_Flag_(band), dbpedia:Adam_Yauch, dbpedia:B._B._King and dbpedia:Category:American_vegetarians, connected by p:associatedActs, p:genre, p:currentMembers and skos:subject edges.]
Our approach: SLADE

• Semantic LAyer for Data Exploration
 • A framework to build data-driven apps
 • ETL from existing sources / APIs
 • Search, discovery, recommendations
 • Data access / API
 • Generic, config-based, domain-agnostic
The pipeline

[Diagram: Web data sources -> data extraction and interlinking -> entity-centric semantic knowledge base (artists, genres, labels, locations...) in storage -> search, discovery and recommendation engine, on top of our graph database -> REST-ful interface -> seevl products]
Challenges
• Some technical challenges faced when building
  SLADE and seevl.net
 • Data models: Choosing the right schemas
 • Data access: SPARQL or API or ... ?
 • Scalability: Caching and optimisation strategies
 • User Experience: User-centric design
data models
RDF since day one
• RDF ?
 • Agile model (ideal when iterating)
 • Intuitive aspect of graph modelling
 • Standard toolkits (SPARQL / HTTP)
• OWL? RDFS?
 • Minor use of inference (type, hierarchies)
Artist data
• Music Ontology
 • Labels, Genres, Influences, Origins ...
 • Collaborations between artists
 • Activity period (add-on)
• Additional models/mappings
 • e.g. Bio Vocabulary (birth/death), FOAF...
Social activities
• SIOC & SIOC-actions
 • Social graph / sub-graph
 • Action-centric activities (like, listen)
• Inferring user’s taste profile
 • Top artist, genres, labels
 • Using latest actions
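The taste-profile idea above can be sketched as a frequency count over the most recent actions. This is only an illustration: the action dictionaries, the `taste_profile` function and its `latest` and `top` parameters are assumptions, not the SIOC-based model used in seevl, and the sample data is made up.

```python
from collections import Counter


def taste_profile(actions, latest=100, top=3):
    """Rank artists and genres by frequency over the most recent actions."""
    # Keep only the newest `latest` actions, newest first.
    recent = sorted(actions, key=lambda a: a["time"], reverse=True)[:latest]
    artists = Counter(a["artist"] for a in recent)
    genres = Counter(g for a in recent for g in a["genres"])
    return {
        "artists": [name for name, _ in artists.most_common(top)],
        "genres": [name for name, _ in genres.most_common(top)],
    }


# Illustrative data only (hypothetical likes and listens).
actions = [
    {"time": 1, "type": "listen", "artist": "Ramones", "genres": ["punk"]},
    {"time": 2, "type": "like", "artist": "The Clash", "genres": ["punk", "reggae"]},
    {"time": 3, "type": "listen", "artist": "The Clash", "genres": ["punk", "reggae"]},
]
profile = taste_profile(actions)
```

Restricting the count to the latest actions lets the profile follow a user's current listening rather than their whole history.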
Similarity / Recsys
• Graph-based similarities
 • Data-driven recommendations
 • Ranking using weight-factors
 • Explanations / tracking
• The Similarity Ontology
 • Domain-agnostic
Provenance
• Keep trace of every statement in the ETL
 • Origin, type and time of extraction
• With a low number of additional triples
 • Introducing “data-slices”
 • Multiple slices (=subgraphs) per resource
 • Quick updates (DELETE / INSERT)
Provenance and graphs
GRAPH svl:seevl_id/wikipedia/facts/extract
{
    svl:seevl_id mo:genre svl:BntvuZAy .
    svl:seevl_id/wikipedia/extract dc:created "2012-10-25" ;
        rdfs:seeAlso wikipedia:Social_Distortion .
}
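A minimal sketch of how one data-slice could be refreshed, assuming each slice lives in its own named graph. The function names and the example.org URI layout are hypothetical, and the sketch uses DROP SILENT GRAPH followed by INSERT DATA (both part of SPARQL 1.1 Update) rather than a DELETE / INSERT pattern.

```python
def slice_graph_uri(seevl_id, source, kind):
    """Build the named-graph URI for one data slice (hypothetical URI layout)."""
    return f"http://example.org/{seevl_id}/{source}/{kind}/extract"


def refresh_slice(graph_uri, triples):
    """Return a SPARQL 1.1 Update request that replaces one slice wholesale.

    `triples` is a list of (subject, predicate, object) terms that are already
    serialised, e.g. full IRIs in angle brackets or quoted literals.
    """
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in triples)
    return (
        f"DROP SILENT GRAPH <{graph_uri}> ;\n"
        f"INSERT DATA {{\n  GRAPH <{graph_uri}> {{\n{body}\n  }}\n}}"
    )


graph = slice_graph_uri("seevl_id", "wikipedia", "facts")
update = refresh_slice(graph, [
    ("<http://example.org/seevl_id>",
     "<http://purl.org/ontology/mo/genre>",
     "<http://example.org/BntvuZAy>"),
])
```

Because a slice is a whole subgraph, re-running the extraction for one source only touches that graph and leaves the other slices of the same resource alone.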
data access
SPARQL
• Pros
 • W3C Standard, Powerful
 • HTTP-based w/ SPARQL Protocol
 • SPARQL Update in 1.1
• Cons
 • Learning curve for non-RDF people
URI patterns + JSON-LD
 • Pre-defined URIs mapped to SPARQL
   query patterns, returning JSON-LD data
  • Search queries or resources description
  • Content-negotiation or ?_format=json
 • GET and POST
  • POST => SPARQL UPDATE
  • GET => SPARQL SELECT / ASK
JSON-LD

• JSON for Linking Data
 • The best of both worlds
 • JSON serialization, works with any parser
 • Additional semantics (URIs, typed links,
    etc.) with JSON-LD parsers
 • Use of context/mappings to avoid URIs
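To illustrate the "best of both worlds" point, here is a small hand-written JSON-LD document read with Python's standard `json` module. The entity URIs under example.org are made up, and `expand_key` is a hypothetical helper that only covers the two context shapes used here; it is not a JSON-LD processor.

```python
import json

doc = json.loads("""
{
  "@context": {
    "prefLabel": "http://www.w3.org/2004/02/skos/core#prefLabel",
    "genre": {"@id": "http://purl.org/ontology/mo/genre", "@type": "@id"}
  },
  "@id": "http://example.org/entity/artist-1",
  "prefLabel": "The Clash",
  "genre": "http://example.org/entity/BntvuZAy"
}
""")

# A plain JSON consumer just reads the short keys and ignores @context.
label = doc["prefLabel"]


def expand_key(document, key):
    """Resolve a short key to its full property URI using the @context."""
    term = document["@context"][key]
    return term["@id"] if isinstance(term, dict) else term


genre_property = expand_key(doc, "genre")
```

The context keeps long URIs out of the payload, so front-end developers see ordinary JSON while RDF-aware clients can still recover the full property identifiers.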
Search

• /entity/?property=value
    • JSON-LD mappings used in URI templates
    • Works with literals, dates, resources
    • Ranking algorithm / alpha-ranking
    • Patterns defined in a single config file
Search (text)
• /entity/?
  prefLabel=clash&type=artist&_sort=count_desc
• Translated into
    SELECT ?x WHERE {
        ?x a mo:artist ; skos:prefLabel ?label .
        ?label bif:contains "clash" .
    }
Search (relations)
• /entity/?genre=BntvuZAy&type=artist
• Translated into
   SELECT ?x WHERE {
       ?x a mo:artist ; mo:genre svl:BntvuZAy .
   }
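The translation from search URL to SPARQL could look roughly like the sketch below. The `MAPPINGS` and `TYPES` tables stand in for the single config file mentioned earlier, and their layout, the `to_sparql` name and the alphanumeric check are assumptions made for illustration, not SLADE's actual code.

```python
from urllib.parse import parse_qs

# Hypothetical mapping config: short parameter name -> (predicate, value kind).
MAPPINGS = {
    "genre": ("mo:genre", "resource"),
    "prefLabel": ("skos:prefLabel", "text"),
}
TYPES = {"artist": "mo:artist"}  # class name copied from the slides


def to_sparql(query_string):
    """Translate 'property=value&type=...' into a SPARQL SELECT string."""
    params = {k: v[0] for k, v in parse_qs(query_string).items()}
    patterns = [f"?x a {TYPES[params.pop('type')]} ."]
    for i, (name, value) in enumerate(sorted(params.items())):
        if name.startswith("_"):  # control parameters such as _sort
            continue
        if not value.isalnum():  # crude guard against query injection
            raise ValueError(f"unsupported value: {value!r}")
        predicate, kind = MAPPINGS[name]
        if kind == "resource":
            patterns.append(f"?x {predicate} svl:{value} .")
        else:
            patterns.append(f"?x {predicate} ?v{i} .")
            patterns.append(f'?v{i} bif:contains "{value}" .')
    return "SELECT ?x WHERE {\n    " + "\n    ".join(patterns) + "\n}"


relation_query = to_sparql("genre=BntvuZAy&type=artist")
text_query = to_sparql("prefLabel=clash&type=artist&_sort=count_desc")
```

Keeping the mapping in data rather than code is what makes the same translator reusable across domains: adding a searchable property means adding one config entry.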
Resource description
• Patterns mapped to resource URI to
  retrieve subset of the resource description
 • /entity/seevl_id/infos
 • /entity/seevl_id/facts
 • /entity/seevl_id/links
 • /entity/seevl_id/related(/related_id)
scalability
Is SPARQL fast enough?
• SPARQL is very powerful, but can be slow
 • Some simple queries may lead to deep
    graph patterns or traversal queries
    depending on the modelling
 • FILTERS (e.g. text and date based queries)
    are expensive
 • Not all triple-stores are equal
Splitting queries
• “List all resources sharing common
  property-values with the current one,
  whatever that property is”
 • Fits in a single SPARQL query
 • Doesn’t properly scale
• Becomes faster when splitting the query
  and recomposing results via internal scripts
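One possible reading of "splitting the query" is sketched below: fetch the current resource's own facts first, then issue one small query per property-value pair and merge the answers in a script. The in-memory `TRIPLES` set and the `select` helper stand in for the triple store and are purely illustrative; the slides do not spell out how the slicing was done.

```python
from collections import defaultdict

# Toy triple set standing in for the triple store (illustrative data only).
TRIPLES = {
    ("ramones", "genre", "punk"),
    ("ramones", "origin", "new_york"),
    ("the_clash", "genre", "punk"),
    ("the_clash", "origin", "london"),
    ("blondie", "origin", "new_york"),
}


def select(subject=None, predicate=None, obj=None):
    """Stand-in for one small SPARQL query: match a single triple pattern."""
    return [
        t for t in TRIPLES
        if subject in (None, t[0])
        and predicate in (None, t[1])
        and obj in (None, t[2])
    ]


def related_by_slicing(resource):
    """One query for the resource's facts, then one per property-value pair."""
    shared = defaultdict(list)
    for _, predicate, value in select(subject=resource):
        for other, _, _ in select(predicate=predicate, obj=value):
            if other != resource:
                shared[other].append((predicate, value))
    return {name: sorted(pairs) for name, pairs in shared.items()}


related = related_by_slicing("ramones")
```

Each sub-query is a simple pattern with bound predicate and object, which a store can typically answer from an index, at the cost of issuing more queries and merging results client-side.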
SPARQL: splitting queries
                 Direct SPARQL     Property-slicing   Complete-slicing
Artist           Queries    Time   Queries    Time    Queries    Time
Ramones                1  139.97        20  109.51         66   37.84
Johnny Cash            1  257.81        30  152.60        135   75.35
U2                     1  155.53        22  122.91         70   44.03
The Clash              1  146.43        20  110.84         79   42.61
Bad Religion           1  104.08        23   86.49         97   47.35
The Aggrolites         1  145.92        13  114.52         28   28.33
Janis Joplin           1  230.88        27  151.00         98   62.81
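As a worked reading of the table, the speed-up of each slicing strategy over the single direct query can be computed as the ratio of the two times. The figures are copied from the table; the slide gives no time unit, and the ratio does not depend on one.

```python
# (direct, property-slicing, complete-slicing) times copied from the table.
TIMES = {
    "Ramones": (139.97, 109.51, 37.84),
    "Johnny Cash": (257.81, 152.60, 75.35),
    "U2": (155.53, 122.91, 44.03),
    "The Clash": (146.43, 110.84, 42.61),
    "Bad Religion": (104.08, 86.49, 47.35),
    "The Aggrolites": (145.92, 114.52, 28.33),
    "Janis Joplin": (230.88, 151.00, 62.81),
}

# Speed-up factor = direct time / sliced time, per artist and per strategy.
speedups = {
    artist: (direct / by_property, direct / complete)
    for artist, (direct, by_property, complete) in TIMES.items()
}

for artist, (by_property, complete) in speedups.items():
    print(f"{artist:15s} property x{by_property:.2f}  complete x{complete:.2f}")
```

In every row the complete-slicing run is the fastest, even though it issues the most queries.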
SPARQL + Redis
• Started by using Memcache to store query
  results (e.g. “?x genre $y”)
  • Good, but costly for the first user
• Then, materialising results in-memory using
  Redis as a key-value cache system
  • Low indexing time (a few minutes on a laptop)
  • Increasing query-performance, real-time
SPARQL + Redis

• Redis
 • HSET to define entities (minimal data)
 • ZADD to store ordered sets of key-
    values, with our own ranking scheme
  • ZRANGE to retrieve w/ correct order
• Everything in memory, instant query results
SPARQL + Redis
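# Entity hash: minimal data needed to render a result
self.redis.hset(entity, 'uri', uri)
self.redis.hset(entity, 'prefLabel', prefLabel)
self.redis.hset(entity, 'description', description)
# Sorted set per property-value pair (redis-py 3.0+ takes a {member: score} mapping)
self.redis.zadd('genre:BntvuZAy', {entity: score})
...
# Read back a slice of the ranking, with scores
self.redis.zrange(pattern, min, max, withscores=True)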
user-experience
User-experience
• Interfaces for graph-based/semantic data
 • Don’t need to be ugly!
 • As long as they’re built for users first
• Focus on vertical-UX, rather than SemWeb-UX
 • Check best practices in the domain
 • Involve HCI / non-SemWeb people
take-away message
Lessons learnt
• Don’t reinvent the wheel, check existing
  stacks and use what fits for the job
• Make it simple for your developers, using
  REST-ful interfaces and design patterns
• Accept compromises, be pragmatic
• Think of users / create personas who are not
  SemWeb-geeks when designing the UX
Questions?
http://seevl.net // @seevl
alex@seevl.net // @terraces

Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 

Data-driven music discovery with seevl

  • 1. seevl: Data-driven music discovery Alexandre Passant, co-founder, CEO, MDG Web ltd http://seevl.net // @seevl // alex@seevl.net // @terraces LA SemWeb & WebSpeed Meet-up, 2 October 2012 Cross Campus, Santa Monica
  • 2. a bit of background...
  • 3. • Knowledge Engineering • Social Web & Enterprise 2.0 • Sensor Networks & Real-Time
  • 8. dbpedia:Bad_Brains dbpedia:Hardcore_Punk p:associatedActs p:genre p:genre :alex foaf:topic_interest dbpedia:Beastie_Boys dbpedia:Black_Flag_(band) p:currentMembers dbpedia:Adam_Yauch dbpedia:B._B._King skos:subject skos:subject dbpedia:Category:American_vegatarians
  • 10. Our approach: SLADE • Semantic LAyer for Data Exploration • A framework to build data-driven apps • ETL from existing sources / APIs • Search, discovery, recommendations • Data access / API • Generic, config-based, domain-agnostic
  • 11. The pipeline • Web data sources (artists, genres, labels, locations...) • Data extraction and interlinking • Entity-centric semantic knowledge base • Storage • Search, discovery and recommendation engine, on top of our graph database • REST-ful interface • seevl products
  • 12. Challenges • Some technical challenges faced when building SLADE and seevl.net • Data models: Choosing the right schemas • Data access: SPARQL or API or ... ? • Scalability: Caching and optimisation strategies • User Experience: User-centric design
  • 14. RDF since day one • RDF ? • Agile model (ideal when iterating) • Intuitive aspect of graph modelling • Standard toolkits (SPARQL / HTTP) • OWL? RDFS? • Minor use of inference (type, hierarchies)
  • 15. Artist data • Music Ontology • Labels, Genres, Influences, Origins... • Collaborations between artists • Activity period (add-on) • Additional models/mappings • e.g. Bio Vocabulary (birth/death), FOAF...
  • 17. Social activities • SIOC & SIOC-actions • Social graph / sub-graph • Action-centric activities (like, listen) • Inferring user’s taste profile • Top artist, genres, labels • Using latest actions
  • 19. Similarity / Recsys • Graph-based similarities • Data-driven recommendations • Ranking using weight-factors • Explanations / tracking • The Similarity Ontology • Domain-agnostic
  • 21. Provenance • Keep trace of every statement in the ETL • Origin, type and time of extraction • With a low number of additional triples • Introducing “data-slices” • Multiple slices (=subgraphs) per resource • Quick updates (DELETE / INSERT)
  • 22. Provenance and graphs
        GRAPH svl:seevl_id/wikipedia/facts/extract {
          svl:seevl_id mo:genre svl:BntvuZAy .
          svl:seevl_id/wikipedia/extract dc:created "2012-10-25" ;
                                         rdfs:seeAlso wikipedia:Social_Distortion .
        }
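The "quick updates (DELETE / INSERT)" point on slides 21 and 22 can be sketched as follows: if each data slice lives in its own named graph, refreshing it amounts to dropping that graph and re-inserting the freshly extracted triples. This is only a minimal sketch; the function name, the example.org URIs and the choice of DROP followed by INSERT DATA are assumptions for illustration, not seevl's actual ETL code.

```python
def build_slice_update(graph_uri, triples):
    """Return a SPARQL 1.1 Update request that replaces one data slice.

    graph_uri: URI of the named graph holding the slice (hypothetical layout).
    triples:   iterable of (subject, predicate, object) terms, already
               serialised as SPARQL terms (e.g. '<http://...>' or '"text"').
    """
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in triples)
    return (
        f"DROP SILENT GRAPH <{graph_uri}> ;\n"
        f"INSERT DATA {{\n"
        f"  GRAPH <{graph_uri}> {{\n"
        f"{body}\n"
        f"  }}\n"
        f"}}"
    )


# Hypothetical URIs, for illustration only.
update = build_slice_update(
    "http://example.org/entity/abc/wikipedia/facts",
    [(
        "<http://example.org/entity/abc>",
        "<http://purl.org/ontology/mo/genre>",
        "<http://example.org/entity/BntvuZAy>",
    )],
)
print(update)
```

Because only one small graph is touched, the rest of the resource description (other slices) stays untouched during a refresh.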
  • 24. SPARQL • Pros • W3C Standard, Powerful • HTTP-based w/ SPARQL Protocol • SPARQL Update in 1.1 • Cons • Learning curve for non-RDF people
  • 25. URI patterns + JSON-LD • Pre-defined URIs mapped to SPARQL query patterns, returning JSON-LD data • Search queries or resources description • Content-negotiation or ?_format=json • GET and POST • POST => SPARQL UPDATE • GET => SPARQL SELECT / ASK
  • 26. JSON-LD • JSON for Linking Data • The best of both worlds • JSON serialization, works with any parser • Additional semantics (URIs, typed links, etc.) with JSON-LD parsers • Use of context/mappings to avoid URIs
  • 27. Search • /entity/?property=value • JSON-LD mappings used in URI templates • Works with literals, dates, resources • Ranking algorithm / alpha-ranking • Patterns defined in a single config file
  • 28. Search (text) • /entity/?prefLabel=clash&type=artist&_sort=count_desc • Translated into
        SELECT ?x WHERE {
          ?x a mo:artist ;
             skos:prefLabel ?label .
          ?label bif:contains "clash" .
        }
  • 30. Search (relations) • /entity/?genre=BntvuZAy&type=artist • Translated into
        SELECT ?x WHERE {
          ?x a mo:artist ;
             mo:genre svl:BntvuZAy .
        }
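The translation described on slides 27 to 30 can be sketched roughly as below: a small mapping plays the role of the "single config file" and turns each query-string parameter into a triple pattern. The mapping contents, the function name and the way the svl: prefix is applied are assumptions made for illustration, and only the relation-style search from slide 30 is covered.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical stand-in for the single config file: parameter -> predicate.
PROPERTY_MAP = {"genre": "mo:genre"}
# Hypothetical mapping from the `type` parameter to a class.
TYPE_MAP = {"artist": "mo:artist"}


def search_url_to_sparql(url):
    """Translate /entity/?property=value&type=... into a SELECT query."""
    patterns = []
    for key, value in parse_qsl(urlsplit(url).query):
        if key.startswith("_"):
            continue  # control parameters such as _sort or _format
        if not value.isalnum():
            raise ValueError(f"unexpected characters in value: {value!r}")
        if key == "type":
            patterns.append(f"?x a {TYPE_MAP[value]} .")
        elif key in PROPERTY_MAP:
            patterns.append(f"?x {PROPERTY_MAP[key]} svl:{value} .")
        else:
            raise ValueError(f"unmapped search parameter: {key}")
    return "SELECT ?x WHERE { " + " ".join(patterns) + " }"


print(search_url_to_sparql("/entity/?genre=BntvuZAy&type=artist"))
```

Keeping the mapping in data rather than in code is what would make such a layer config-based: supporting a new search parameter means adding one entry, not writing a new query.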
  • 33. Resource description • Patterns mapped to resource URI to retrieve subset of the resource description • /entity/seevl_id/infos • /entity/seevl_id/facts • /entity/seevl_id/links • /entity/seevl_id/related(/related_id)
  • 37. Is SPARQL fast enough? • SPARQL is very powerful, but can be slow • Some simple queries may lead to deep graph patterns or traversal queries depending on the modelling • FILTERs (e.g. text- and date-based queries) are expensive • Not all triple-stores are equal
  • 38. Splitting queries • “List all resources sharing common property-values with the current one, whatever that property is” • Fits in a single SPARQL query • Doesn’t scale properly • Becomes faster when splitting the query and recomposing the results via internal scripts
  • 39. SPARQL: splitting queries
        Artist          | Direct SPARQL    | Property-slicing | Complete-slicing
                        | Queries  Time    | Queries  Time    | Queries  Time
        Ramones         | 1        139.97  | 20       109.51  | 66       37.84
        Johnny Cash     | 1        257.81  | 30       152.60  | 135      75.35
        U2              | 1        155.53  | 22       122.91  | 70       44.03
        The Clash       | 1        146.43  | 20       110.84  | 79       42.61
        Bad Religion    | 1        104.08  | 23       86.49   | 97       47.35
        The Aggrolites  | 1        145.92  | 13       114.52  | 28       28.33
        Janis Joplin    | 1        230.88  | 27       151.00  | 98       62.81
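The splitting idea from slides 38 and 39 can be illustrated with a toy in-memory example: instead of one query with an unbound property, run one narrow lookup per property-value pair of the current resource and recompose the results in application code. The triples, the helper names and the in-memory "store" are assumptions; against a real triple store each lookup would be a separate SPARQL query, and the slides do not spell out the exact slicing granularity behind each column of the table.

```python
from collections import defaultdict

# Toy data standing in for a triple store (hypothetical values).
TRIPLES = [
    ("Ramones", "genre", "Punk"),
    ("Ramones", "origin", "New_York"),
    ("The_Clash", "genre", "Punk"),
    ("The_Clash", "origin", "London"),
    ("Blondie", "origin", "New_York"),
]


def properties_of(resource):
    """First step: list the (property, value) pairs of the current resource."""
    return [(p, o) for s, p, o in TRIPLES if s == resource]


def subjects_with(prop, value):
    """One slice: a narrow lookup for a single bound property and value."""
    return [s for s, p, o in TRIPLES if p == prop and o == value]


def related_by_slicing(resource):
    """Run one lookup per property-value pair and merge the results."""
    shared = defaultdict(list)
    for prop, value in properties_of(resource):
        for other in subjects_with(prop, value):
            if other != resource:
                shared[other].append((prop, value))
    return dict(shared)


print(related_by_slicing("Ramones"))
```

Each narrow lookup has fully bound predicates and objects, which is typically the access pattern triple stores index best; the merge step also keeps track of which property-value pairs are shared, which is useful for explaining a recommendation.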
  • 40. SPARQL + Redis • Started by using Memcache to store query results (e.g. “?x genre $y”) • Good, but costly for the first user • Then, materialising results in-memory using Redis as a key-value cache system • Low indexing time (a few minutes on a laptop) • Increasing query performance, real-time
  • 41. SPARQL + Redis • Redis • HSET to define entities (minimal data) • ZADD to store ordered sets of key-values, with our own ranking scheme • ZRANGE to retrieve them in the correct order • Everything in memory, instant query results
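  • 42. SPARQL + Redis
        # Store minimal entity data as hash fields
        self.redis.hset(entity, 'uri', uri)
        self.redis.hset(entity, 'prefLabel', prefLabel)
        self.redis.hset(entity, 'description', description)
        # Add the entity to a ranked set (redis-py >= 3.0 takes a {member: score} mapping)
        self.redis.zadd('genre:BntvuZAy', {entity: score})
        ...
        # Read back a slice of the set in rank order, together with the scores
        self.redis.zrange(pattern, start, end, withscores=True)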
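To show how HSET, ZADD and ZRANGE fit together without needing a running Redis server, the sketch below mimics the three commands with plain Python structures. The class, key names and scores are hypothetical; with redis-py the same calls would go to a redis.Redis client instead, and seevl's own ranking scheme is not reproduced here.

```python
class TinyStore:
    """In-memory stand-in mimicking the three Redis commands used on slide 41."""

    def __init__(self):
        self.hashes = {}   # key -> {field: value}
        self.zsets = {}    # key -> {member: score}

    def hset(self, key, field, value):
        self.hashes.setdefault(key, {})[field] = value

    def zadd(self, key, mapping):
        self.zsets.setdefault(key, {}).update(mapping)

    def zrange(self, key, start, end, withscores=False):
        # Redis orders by score, then by member; `end` is inclusive, -1 = last.
        ranked = sorted(self.zsets.get(key, {}).items(),
                        key=lambda kv: (kv[1], kv[0]))
        stop = None if end == -1 else end + 1
        page = ranked[start:stop]
        return page if withscores else [member for member, _ in page]


store = TinyStore()
store.hset("artist:1", "prefLabel", "The Clash")
store.hset("artist:2", "prefLabel", "Ramones")
# Materialise the result of "?x genre BntvuZAy" once, with a ranking score.
store.zadd("genre:BntvuZAy", {"artist:1": 2.0, "artist:2": 1.0})

for member in store.zrange("genre:BntvuZAy", 0, -1):
    print(member, store.hashes[member]["prefLabel"])
```

The point of the design is that the expensive graph query runs once at indexing time; at request time, answering "artists for this genre, in rank order" is a single range read plus a few hash lookups.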
  • 44. User-experience • Interfaces for graph-based/semantic data • Don’t need to be ugly! • As long as they’re built for users first • Focus on vertical-UX, rather than SemWeb-UX • Check best practices in the domain • Involve HCI / non-SemWeb people
  • 46. Lessons learnt • Don’t reinvent the wheel, check existing stacks and use what fits the job • Make it simple for your developers, using REST-ful interfaces and design patterns • Accept compromises, be pragmatic • Think of users / create personas who are not SemWeb geeks when designing the UX
