From publisher to platform:
    How the Guardian embraced the internet
    using content, search, and Open Source
                           Stephen Dunn, Guardian News and Media
                        stephen.dunn@guardian.co.uk, 25th May, 2011
                               Twitter: @cuica, @openplatform




Thursday, 26 May 2011
The publishing era




We started a long time ago:
[Slide graphic: Keyword page, Live blogs, Apps, Mobile site, Twitter updates, Swine flu, Comment, Content partnerships, Newspapers, Audio, Video, Open platform API]
To secure the financial and editorial independence of the Guardian in perpetuity.
To promote freedom in the press and liberal journalism globally.
To become the world's leading liberal voice.
Open Web Principles




2009




1. Permanent




                                                      http://www.flickr.com/photos/fstorr/




             •      “A cool URI is one that does not change” (Tim Berners-Lee, 1998)
             •      1.5 million resources redirected to new scheme
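
A minimal sketch of how a legacy URL can keep working after a move to a new URL scheme: look the old path up in a redirect table and answer with HTTP 301 (Moved Permanently). The paths and the REDIRECTS table are hypothetical and only illustrate the idea; the slides do not say how the Guardian implemented its redirects.

```python
from http import HTTPStatus

# Hypothetical legacy-to-new path mapping (illustrative values only).
REDIRECTS = {
    "/old/story/12345.html": "/technology/2009/jan/01/example-story",
}


def resolve(path):
    """Return (status, location) for a requested path.

    Legacy paths get a permanent redirect to their new home;
    any other path is returned unchanged with a 200 status.
    """
    target = REDIRECTS.get(path)
    if target is not None:
        return HTTPStatus.MOVED_PERMANENTLY, target
    return HTTPStatus.OK, path
```

A permanent (301) rather than temporary redirect matters here because it tells browsers and search engines that the new address replaces the old one.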
2. Addressable
                        ★ Resources are “about” something - ready for the
                          social web.

                        ★ We live in “the age of point-at-things” (Coates 2005)




3. Discoverable


                 ★ Multiple routes to content

                 ★ Tagging drives discovery




4. Open




Example: The Hackable Guardian


    http://www.guardian.co.uk/....

        /technology/internet/rss

        /technology/all/rss

        /environment/climatechange+business/globaleconomy/rss
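
The URL patterns on this slide can be sketched as a small helper that joins one or more section/keyword paths with "+" and appends "/rss". This is only an illustration of the pattern shown here; the function name is hypothetical, and the URLs reflect the 2011 slides and may no longer resolve.

```python
BASE = "http://www.guardian.co.uk"


def feed_url(*tags):
    """Build an RSS feed URL from one or more section/keyword paths.

    Multiple paths are combined with '+', mirroring the
    /environment/climatechange+business/globaleconomy/rss example.
    """
    if not tags:
        raise ValueError("at least one tag path is required")
    cleaned = [tag.strip("/") for tag in tags]
    return f"{BASE}/{'+'.join(cleaned)}/rss"


if __name__ == "__main__":
    print(feed_url("technology/internet"))
    print(feed_url("environment/climatechange", "business/globaleconomy"))
```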


Results...




Site traffic growth
[Chart: unique users, Sep 2005 to Dec 2008, y-axis 3,750,000 to 30,000,000; annotations: Pre-project, First release, Final release, 40M]
However...


1 Billion+ Internet Users!




...“How I stopped worrying about my website and learned to love the whole internet.”
       Matt McAlister
The Open Strategy

                  OPEN IN: Bring in data and apps from the Internet
                  OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms


"Our most interesting experiments lie in combining
    what we know with the experience, opinions and
    expertise of the people who want to participate
    rather than passively receive."
Jack Shenker
   “The Guardian alongside Al Jazeera was the one news source
   that everybody on the streets in Tahrir - not just in Cairo but in
   surrounding cities and major centers of revolutionary activity -
   that people were talking about.”
The Open Platform



The suite of services enabling partners to build applications with the Guardian


CONTENT API: A service for selecting and collecting content from the Guardian for re-use
DATA STORE: A directory of useful data curated by Guardian editors
POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day
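
As a rough illustration of "selecting and collecting content", the sketch below builds a search request URL for the Content API. The endpoint and the q and api-key parameter names are assumptions based on the public Content API and are not stated in the slides; check them against the current Open Platform documentation. The key value is a placeholder, and no network call is made.

```python
from urllib.parse import urlencode

# Assumed endpoint; not taken from the slides.
SEARCH_ENDPOINT = "https://content.guardianapis.com/search"


def search_url(query, api_key):
    """Return a Content API search URL for a free-text query.

    Only builds the URL; fetching and parsing the response
    is left to the caller.
    """
    params = {"q": query, "api-key": api_key}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR-KEY" is a placeholder for a real developer key.
    print(search_url("swine flu", "YOUR-KEY"))
```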




Mutualised news!




DATA STORE
A directory of useful data curated by Guardian editors
POLITICS API
Open database of candidates, voting records, constituencies, election results, live data on election day




<OBLIGATORY DOGFOOD SLIDE >


Open for Business




3 Tiers of access
      3 Revenue models

      Keyless: Take our headlines. You keep associated revenues.

      Approved: Take our full article content, but with an advert. Guardian keeps ad revenue, you keep rest-of-page revenue.

      Bespoke: Take, reformat, augment our content. Revenue model to be negotiated. Combination of Media, Fees, Downloads.
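
One way to picture the three tiers is as a small lookup table that a partner-facing service might consult. This is purely an illustrative sketch: the AccessTier type, its field names and the tier_by_name helper are hypothetical, and the descriptions simply restate the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessTier:
    name: str      # tier name as used on the slide
    content: str   # what the partner may take
    revenue: str   # who keeps which revenue


TIERS = (
    AccessTier("Keyless", "headlines",
               "partner keeps associated revenues"),
    AccessTier("Approved", "full article content, with an advert",
               "Guardian keeps ad revenue; partner keeps rest-of-page revenue"),
    AccessTier("Bespoke", "content that may be reformatted and augmented",
               "negotiated: combination of media, fees, downloads"),
)


def tier_by_name(name):
    """Look up a tier case-insensitively; raise KeyError if unknown."""
    for tier in TIERS:
        if tier.name.lower() == name.lower():
            return tier
    raise KeyError(name)
```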


What this means
              Open Out: Developers can now access full content APIs on
              demand with keys post-approved

              Platform is positioned as a place to do business

              So rapid scalability, reliability and performance are now core
              requirements




OPEN IN: Bring in data and apps from the internet
OPEN OUT: Allow partners to build applications using Guardian content and services for other platforms
MICROAPPS
A framework for integrating 3rd party applications into guardian.co.uk

Simple REST/HTTP framework allows lightweight development
Applications proxied for performance
Apps generally hosted in the cloud, allows hot deployment into production
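
The slide says only that microapps are "proxied for performance". One plausible reading, sketched below as an assumption rather than a description of the Guardian's actual proxy, is a short-lived cache in front of the externally hosted app, so that most page views do not trigger a remote HTTP call. The CachingProxy class, its parameters and the example URL are all hypothetical; the fetch function is injected so the sketch runs without network access.

```python
import time


class CachingProxy:
    """Fetch microapp fragments via `fetch` and cache them briefly."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable: url -> response body (str)
        self._ttl = ttl_seconds  # how long a cached body stays fresh
        self._clock = clock      # injectable clock, eases testing
        self._cache = {}         # url -> (expires_at, body)

    def get(self, url):
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: skip the remote call
        body = self._fetch(url)
        self._cache[url] = (now + self._ttl, body)
        return body


if __name__ == "__main__":
    # Stand-in for a real HTTP client call to a hosted microapp.
    proxy = CachingProxy(lambda url: "<div>fragment from " + url + "</div>")
    print(proxy.get("http://apps.example/what-could-i-cook"))
```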




• What could I cook?




Bringing it together




App showcase




From publisher to platform
                        Seeking massive growth, but no longer only
                        broadcasting content on the website

                        User/partner engagement & contribution on
                         Journalism
                         data
                         software
                         applications
                         revenue and ads

                        Support developers and partners with data and APIs,
                        need scalability, reliability, speed
Evolving the architecture
[Architecture diagram, top to bottom: Web servers (x3), App servers (x3), Memcached (added later), Oracle, CMS]
Why RDBMS?
5 years ago, fewer alternatives
Understand operations procedures
Can easily recruit DBAs / devs
Developer/ops tools
Business critical system: a safe choice
Scaling traffic
[Chart: unique users, Sep 2005 to Sep 2008, y-axis 3,750,000 to 30,000,000]
We chose Solr/Lucene
                        Can perform complex queries, including full-text search

                        We can change the schema with no downtime

                        Most queries are of similar cost

                        Scales very well horizontally

                        “Just worked” in the cloud

                        No strange control processes/engines

                        Developers just loved working with it!
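
To make "complex queries, including full-text search" concrete, here is a minimal sketch that builds a request URL for Solr's standard select handler, combining a full-text query (q) with filter queries (fq). The base URL and the "section" field are hypothetical; the slides do not show the Guardian's schema or deployment, and no network call is made.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, text, filters=(), rows=10):
    """Build a Solr /select URL.

    text    -- full-text query, sent as q
    filters -- iterable of filter queries, each sent as a separate fq
    rows    -- maximum number of documents to return
    """
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    params.extend(("fq", f) for f in filters)
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and field name, for illustration only.
    print(solr_select_url("http://localhost:8983/solr",
                          "climate change",
                          filters=["section:environment"],
                          rows=5))
```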
[Architecture diagram. Labels: Api; Web servers, App server, Memcached, RDBMS, CMS; six Solr boxes (one on the same row as the RDBMS), marked "Cloud, EC2"]
What about Open In?

                  OPEN IN: Bring in data and apps from the Internet
                  OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms
[Architecture diagram. Labels: Apps (six App boxes, "external hosting, app engine etc"); Proxy; Web servers, App server, Memcached, RDBMS, CMS]
[Architecture diagram. Labels: In (six App boxes, "external hosting, app engine etc", Proxy); Core (Web servers, App server, Memcached, rdbms, CMS); Out (six Solr boxes, "Cloud, EC2")]
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Kürzlich hochgeladen (20)

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Keynote: from publisher to platform, How The Guardian Embraced the Internet using Content, Search, and Open Source - By Stephen Dunn

  • 1. From publisher to platform: How the Guardian embraced the internet using content, search, and Open Source Stephen Dunn, Guardian News and Media stephen.dunn@guardian.co.uk, 25th May, 2011 Twitter: @cuica, @openplatform Thursday, 26 May 2011
  • 2. From publisher to platform: How the Guardian embraced the Internet using content, search, and Open Source. Stephen Dunn, Guardian News and Media
  • 3. The publishing era
  • 4. We started a long time ago:
  • 5. Keyword page, Live blogs, Apps, Mobile site, Twitter updates, Swine flu, Comment, Content partnerships, Newspapers, Audio, Video, Open platform API
  • 6. To secure the financial and editorial independence of the Guardian in perpetuity. To promote freedom in the press and liberal journalism globally. To become the world's leading liberal voice.
  • 7. Open Web Principles
  • 8. 2009
  • 9. 1. Permanent: “A cool URI is one that does not change” (Tim Berners-Lee, 1998). 1.5 million resources redirected to the new scheme. (Image: http://www.flickr.com/photos/fstorr/)
  • 10. 2. Addressable: Resources are “about” something, ready for the social web. We live in “the age of point-at-things” (Coates 2005).
  • 11. 3. Discoverable: Multiple routes to content. Tagging drives discovery.
  • 12. 4. Open
  • 13. Example: The Hackable Guardian. http://www.guardian.co.uk/.... followed by /technology/internet/rss, /technology/all/rss, or /environment/climatechange+business/globaleconomy/rss
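The URL patterns on slide 13 can be sketched in a few lines. This is only an illustration of the scheme as the slide shows it (a tag path, an optional `+` to combine tags, and a trailing `/rss`); the `feed_url` helper is a hypothetical name, and the 2011-era URLs it produces are not guaranteed to resolve today.

```python
BASE = "http://www.guardian.co.uk"  # host as shown on the slide


def feed_url(*tag_paths, base=BASE):
    """Build an RSS URL from one or more tag paths (hypothetical helper).

    Multiple tag paths are joined with '+', mirroring the combined
    example on the slide.
    """
    if not tag_paths:
        raise ValueError("at least one tag path is required")
    # Strip stray slashes so callers can pass '/technology/internet' too.
    cleaned = [path.strip("/") for path in tag_paths]
    return f"{base}/{'+'.join(cleaned)}/rss"


print(feed_url("technology/internet"))
print(feed_url("environment/climatechange", "business/globaleconomy"))
```

The second call reproduces the combined climate change and global economy feed path from the slide.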
  • 14. Results...
  • 15. Site traffic growth [chart: unique users, Sep 2005 to Dec 2008, axis up to 30,000,000; annotations: Pre-project, First release, Final release, 40M]
  • 16. However...
  • 17. 1 Billion+ Internet Users!
  • 21. ...“How I stopped worrying about my website and learned to love the whole internet.” Matt McAlister
  • 22. The Open Strategy. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  • 24. “Our most interesting experiments lie in combining what we know with the experience, opinions and expertise of the people who want to participate rather than passively receive.”
  • 34. Jack Shenker: “The Guardian alongside Al Jazeera was the one news source that everybody on the streets in Tahrir - not just in Cairo but in surrounding cities and major centers of revolutionary activity - that people were talking about.”
  • 35. The Open Strategy. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  • 36. The Open Platform
  • 37. The suite of services enabling partners to build applications with the Guardian
  • 38. OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  • 39. CONTENT API: A service for selecting and collecting content from the Guardian for re-use. DATA STORE: A directory of useful data curated by Guardian editors. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day.
  • 40. Mutualised news!
  • 41. Mutualised news!
  • 42. Mutualised news!
  • 47. DATA STORE: A directory of useful data curated by Guardian editors.
  • 48. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day.
  • 49. POLITICS API: Open database of candidates, voting records, constituencies, election results, live data on election day.
  • 50. <OBLIGATORY DOGFOOD SLIDE>
  • 56. Open for Business
  • 57. 3 Tiers of access, 3 Revenue models. Keyless: Take our headlines. You keep associated revenues. Approved: Take our full article content, but with an advert. Guardian keeps ad revenue, you keep rest-of-page revenue. Bespoke: Take, reformat, augment our content. Revenue model to be negotiated. Combination of Media, Fees, Downloads.
  • 59. What this means. Open Out: Developers can now access full content APIs on demand, with keys post-approved. Platform is positioned as a place to do business. So rapid scalability, reliability and performance are now core requirements.
  • 60. OPEN IN: Bring in data and apps from the internet. OPEN OUT: Allow partners to build applications using Guardian content and services for other platforms.
  • 61. MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk. Simple REST/HTTP framework allows lightweight development. Applications proxied for performance. Apps generally hosted in the cloud, allows hot deployment into production.
  • 62. MICROAPPS: A framework for integrating 3rd party applications into guardian.co.uk.
  • 63. What could I cook?
  • 64. Bringing it together
  • 66. App showcase
  • 67. From publisher to platform. Seeking massive growth, but no longer only broadcasting content on the website. User/partner engagement & contribution on: journalism, data, software applications, revenue and ads. Support developers and partners with data and APIs; need scalability, reliability, speed.
  • 68. Evolving the architecture
  • 69. [Architecture diagram: three web servers, three app servers, Memcached (added later), Oracle, CMS]
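The diagram on slide 69 places Memcached between the app servers and Oracle, which suggests a cache-aside read path. The sketch below illustrates that general pattern only; it is not the Guardian's code. A plain dict stands in for Memcached, and `CacheAside`, `load_from_db` and `db_reads` are hypothetical names introduced for the example.

```python
class CacheAside:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._cache = {}            # assumption: a dict stands in for Memcached
        self._load_from_db = load_from_db
        self.db_reads = 0           # counts how often the database is hit

    def get(self, key):
        if key in self._cache:      # cache hit: no database work
            return self._cache[key]
        value = self._load_from_db(key)   # cache miss: read from the database
        self.db_reads += 1
        self._cache[key] = value    # populate the cache for later readers
        return value

    def invalidate(self, key):
        """Drop a cached entry, e.g. after the CMS updates an article."""
        self._cache.pop(key, None)


# Usage: the second read of the same key is served from the cache.
articles = CacheAside(lambda key: f"article body for {key}")
articles.get("technology/internet")
articles.get("technology/internet")
print(articles.db_reads)
```

A real deployment would also need expiry times and a client for the cache server; both are left out to keep the read path visible.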
  • 70. Why RDBMS? 5 years ago, fewer alternatives. Understand operations procedures. Can easily recruit DBAs / devs. Developer/ops tools. Business critical system: a safe choice. [Architecture diagram: three web servers, three app servers, Memcached, Oracle, CMS]
  • 71. Scaling traffic [chart: unique users, Sep 2005 to Sep 2008, axis up to 30,000,000]
  • 78. We chose Solr/Lucene. Can perform complex queries, including full-text search. We can change the schema with no downtime. Most queries are of similar cost. Scales very well horizontally. “Just worked” in the cloud. No strange control processes/engines. Developers just loved working with it!
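To make "complex queries, including full-text search" concrete, here is a minimal sketch of building a request URL for Solr's standard select handler. The parameters `q`, `fq`, `rows` and `wt` are standard Solr query parameters; the base URL, the `tag` field name and the `solr_select_url` helper are assumptions made for illustration and say nothing about the Guardian's actual schema.

```python
from urllib.parse import urlencode


def solr_select_url(base, text, tag=None, rows=10):
    """Build a Solr /select URL (hypothetical helper).

    base: URL of a Solr core, e.g. a local test instance (assumption).
    text: full-text query string, sent as the q parameter.
    tag:  optional value for a filter query on a field named 'tag'
          (the field name is an assumption, not a known schema).
    """
    params = [("q", text), ("rows", rows), ("wt", "json")]
    if tag is not None:
        # fq narrows the result set without affecting relevance scoring.
        params.append(("fq", f'tag:"{tag}"'))
    return f"{base}/select?{urlencode(params)}"


print(solr_select_url("http://localhost:8983/solr/content", "swine flu",
                      tag="society/health"))
```

Keeping the tag restriction in `fq` rather than in `q` lets Solr cache the filter separately, which fits the slide's point that most queries are of similar cost.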
  • 80. API [architecture diagram: web servers, app server, Memcached, RDBMS and CMS, with six Solr instances in the cloud (EC2)]
  • 81. What about Open In? OPEN IN: Bring in data and apps from the Internet. OPEN OUT: Enable partners to build applications using Guardian content and services for other platforms.
  • 82. Apps [architecture diagram: web servers, proxy, app server, Memcached, RDBMS and CMS, with six apps on external hosting (App Engine etc.)]
  • 83. Core, Out, In [architecture diagram. Core: web servers, app server, Memcached, CMS, RDBMS. Out: Solr instances in the cloud (EC2). In: proxy and apps on external hosting (App Engine etc.)]