DATA FOR SCIENCE
HOW ELSEVIER IS USING DATA SCIENCE TO EMPOWER RESEARCHERS
Paul Groth | @pgroth | pgroth.com
Disruptive Technology Director
Elsevier Labs | @elsevierlabs
European Data Forum 2016
12 million people per month
40 million reactions
75 million compounds
500 million facts
3 EXAMPLES
• Personalized: what should I read?
• Actionable: who should I collaborate with?
• Consumable: how do I make my data available?
RECOMMENDATIONS AT MENDELEY
• Maya Hristakeva
• Data Scientist at Mendeley
• @mayahhf
• Spark Summit 2015
• http://www.slideshare.net/SparkSummit/sparking-science-up-with-research-recommendations-by-maya-hristakeva
MENDELEY BUILDS TOOLS TO HELP RESEARCHERS …
• Read & Organize
• Search & Discover
• Collaborate & Network
• Experiment & Synthesize
BEING THE BEST RESEARCHER YOU CAN BE!
• Good researchers are on top of their game
• Large amount of research produced
• Takes time to get what you need
• Help researchers by recommending relevant research
PERSONALIZED ARTICLE RECOMMENDATION
Input: User libraries
Output: Suggested articles to read
Algorithms:
• Collaborative Filtering
– Item-based
– User-Based
– Matrix Factorization
• Content-based
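To make the item-based variant of collaborative filtering concrete, here is a minimal sketch in plain Python. It is not Mendeley's implementation (the slides name Mahout and Spark for that); it only illustrates the idea of scoring unread articles by their similarity to the ones already in a user's library. The user names, article IDs, the helper names `item_similarities` and `recommend`, and the choice of cosine similarity over binary library membership are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(libraries):
    """Cosine similarity between articles, based on co-occurrence in user libraries."""
    # Invert the input: article -> set of users who have it in their library.
    item_users = defaultdict(set)
    for user, articles in libraries.items():
        for article in articles:
            item_users[article].add(user)

    sims = defaultdict(dict)
    items = sorted(item_users)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            overlap = len(item_users[a] & item_users[b])
            if overlap:
                # Cosine similarity for binary vectors: |A ∩ B| / sqrt(|A| * |B|).
                score = overlap / sqrt(len(item_users[a]) * len(item_users[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims


def recommend(user, libraries, sims, top_n=3):
    """Rank articles the user does not have by summed similarity to ones they do."""
    owned = libraries[user]
    scores = defaultdict(float)
    for article in owned:
        for other, score in sims.get(article, {}).items():
            if other not in owned:
                scores[other] += score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_n]


# Hypothetical toy data: user -> set of article IDs in their library.
libraries = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p1", "p2", "p4"},
    "carol": {"p2", "p3", "p4"},
    "dave": {"p1", "p5"},
}
sims = item_similarities(libraries)
print(recommend("dave", libraries, sims))
```

The pairwise loop is quadratic in the number of articles, which is fine for a toy example but is exactly the cost that motivates distributed tooling at the scale of real user libraries.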
[Chart: recommender cost vs. quality. Quadrants: Costly & Good, Costly & Bad, Cheap & Good, Cheap & Bad. Plotted: Tuned IB Mahout; Tuned UB Mahout; Tuned UB Spark; Tuned IB Spark; UB DimSum (Spark MLlib); ALS Matrix Fact. (Spark MLlib). Annotations: Performance +100%, +150%; ~$50.]
CALCULATING 75 TRILLION METRICS
• Benchmark 4,600 institutions & 220 countries, updated weekly
• 40 terabytes of data
• HPCC massively parallel compute system – 40 node system
ALL DATA ISN’T CURATED
60% OF TIME IS SPENT ON DATA PREPARATION
10 ASPECTS OF HIGHLY EFFECTIVE RESEARCH DATA
https://www.elsevier.com/connect/10-aspects-of-highly-effective-research-data
http://data.mendeley.com/
Each dataset receives a versioned DOI, so it can be cited
The citation for the associated article is displayed
ACADEMIC COLLABORATIONS
CONCLUSION
• Researchers are faced with an ever growing amount of data and content
• Data Science is key to making systems that help them
• I’ve shown three Elsevier examples. Many more!
• Antonio Gulli’s codingplayground.blogspot.nl
• labs.elsevier.com
• Of course, we’re hiring
Contact: Paul Groth @pgroth

 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

Data for Science: How Elsevier is using data science to empower researchers

Editor's Notes

  1. 1.8 million unique authors worldwide submitted 1.3 million manuscripts to Elsevier journals
  2. 40 million reactions, 75 million compounds, 500 million experimental facts
  3. 40 million reactions, 75 million compounds, 500 million experimental facts
  4. At Mendeley we build tools to help researchers organise and read research articles, collaborate and connect with other researchers, search and discover new research articles, etc. 
  5. 815 million articles
  6. “Mendeley Suggest” is our personalised article recommender. It is based on what users have in their libraries, and recommends other related articles. 
  7. Calculated for over 4 million users. We are building a personalised article recommender based on what users read. The input is the users’ libraries and the output is a list of articles they may want to add to their library and read. There are a number of different algorithms we can use to generate the recommendations (content-based, collaborative filtering); in this talk we’ll focus on three types of collaborative filtering algorithms (user-based, item-based, and matrix factorisation).
  8. To sum up, we now have a Spark implementation of our production user-based collaborative filtering (UB CF) algorithm, which performs well and is a lot simpler to maintain and extend. There are still a few areas where we can tune and optimise further, which should make it faster and increase the gains from using Spark. Depending on your data, different algorithms might work better, so do experiment.
  9. 40 million reactions, 75 million compounds, 500 million experimental facts
  10. http://www.tamr.com/piketty-revisited-improving-economics-data-science/
  11. NASA, A.40 Computational Modeling Algorithms and Cyberinfrastructure, tech. report, NASA, 19 Dec. 2011
  12. Data engineering pipelines
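
Note 7 describes the recommender's input (user libraries) and output (suggested articles) and names user-based collaborative filtering as one of the approaches. The following is a minimal single-machine sketch of that general idea, not Mendeley's production code: the function name `recommend`, the parameters `k` and `n`, the cosine similarity over binary libraries, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def recommend(libraries, user, k=2, n=3):
    """Suggest up to n articles for `user`, taken from the libraries
    of the k users whose libraries are most similar (hypothetical helper)."""
    own = libraries[user]
    sims = []
    for other, items in libraries.items():
        if other == user or not items or not own:
            continue
        overlap = len(own & items)
        if overlap:
            # Cosine similarity between binary library vectors.
            sims.append((overlap / sqrt(len(own) * len(items)), other))

    # Keep the k nearest neighbours by similarity.
    neighbours = sorted(sims, reverse=True)[:k]

    # Score each unseen article by the summed similarity of the
    # neighbours who have it in their library.
    scores = defaultdict(float)
    for sim, other in neighbours:
        for article in libraries[other] - own:
            scores[article] += sim

    # Highest score first; ties broken by article id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [article for article, _ in ranked[:n]]


# Toy libraries (invented data): user -> set of article ids.
libraries = {
    "alice": {"a1", "a2", "a3"},
    "bob": {"a1", "a2", "a4"},
    "carol": {"a2", "a3", "a4", "a5"},
    "dave": {"a9"},
}

print(recommend(libraries, "alice"))
```

For "alice" the sketch ranks "a4" ahead of "a5", because both of her neighbours hold "a4" while only one holds "a5". A user with no overlapping neighbours, such as "dave", gets an empty list, which is where a content-based fallback would typically come in.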