Each month, 12 million people use Elsevier’s ScienceDirect platform. The Mendeley social network has 4.6 million registered users. 3,500 institutions use ClinicalKey to bring the latest medical research to doctors and nurses. How can we help these users be more effective? In this talk, I give an overview of how Elsevier is employing data science to improve its services, from recommendation systems to natural language processing and analytics. While data science is changing how Elsevier serves researchers, it’s also changing research practice itself. In that context, I discuss the impact that large amounts of open research data are having and the challenges researchers face in making use of them, in particular in terms of data integration and reuse. We are just beginning to see how technology and data are changing science; correspondingly, this affects how best to empower those who practice it.
Data for Science: How Elsevier is using data science to empower researchers
1. DATA FOR SCIENCE
HOW ELSEVIER IS USING DATA SCIENCE TO EMPOWER RESEARCHERS
Paul Groth | @pgroth | pgroth.com
Disruptive Technology Director
Elsevier Labs | @elsevierlabs
European Data Forum 2016
9. BEING THE BEST RESEARCHER YOU CAN BE!
• Good researchers are on top of their game
• Large amount of research produced
• Takes time to get what you need
• Help researchers by recommending relevant research
22. CONCLUSION
• Researchers are faced with an ever growing amount of data and content
• Data Science is key to making systems that help them
• I’ve shown three Elsevier examples. Many more!
• Antonio Gulli’s codingplayground.blogspot.nl
• labs.elsevier.com
• Of course, we’re hiring
Contact: Paul Groth @pgroth
Editor's notes
1.8 million unique authors worldwide submitted 1.3 million manuscripts to Elsevier journals
40 million reactions
75 million compounds
500 million experimental facts
At Mendeley we build tools to help researchers organise and read research articles, collaborate and connect with other researchers, search and discover new research articles, etc.
815 million articles
“Mendeley Suggest” is our personalised article recommender. It is based on what users have in their libraries, and recommends other related articles.
Recommendations are calculated for over 4 million users
We are building a personalised article recommender based on what users read. The input is the users’ libraries and the output is a list of articles they may want to add to their library and read. There are a number of different algorithms we can use to generate the recommendations (content-based, collaborative filtering), and in this talk we’ll focus on three types of collaborative filtering algorithms (user-based and item-based, as well as matrix factorisation).
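To make the user-based variant concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not Mendeley's production algorithm: the Jaccard similarity measure, the neighbourhood size `k`, the function names, and the toy libraries are all hypothetical choices made for this example.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two sets of article ids."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(libraries, user, k=2, n=3):
    """Recommend up to n articles for `user` from the k most similar users.

    `libraries` maps a user id to the set of article ids in that user's
    library. The similarity measure and parameters are assumptions.
    """
    own = libraries[user]
    # Score every other user by how much their library overlaps with ours.
    neighbours = sorted(
        (
            (jaccard(own, lib), other)
            for other, lib in libraries.items()
            if other != user
        ),
        reverse=True,
    )[:k]
    # Each neighbour votes for articles the target user does not have yet,
    # weighted by the neighbour's similarity to the target user.
    scores = defaultdict(float)
    for sim, other in neighbours:
        if sim <= 0:
            continue
        for article in libraries[other] - own:
            scores[article] += sim
    # Highest score first; ties are broken by article id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [article for article, _ in ranked[:n]]


# Toy data, invented for this example.
libraries = {
    "alice": {"a1", "a2", "a3"},
    "bob": {"a1", "a2", "a4"},
    "carol": {"a2", "a3", "a5"},
    "dave": {"a9"},
}
suggestions = recommend(libraries, "alice")
```

With this toy data, `alice` shares two articles each with `bob` and `carol`, so the articles she is missing from their libraries (`a4` and `a5`) are suggested. A user with no overlapping neighbours, such as `dave`, gets an empty list.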
To sum up, we now have a Spark implementation of our production user-based collaborative filtering (UB CF) algorithm that performs well and is a lot simpler to maintain and extend. There are still a few areas where we can tune and optimise further, which should make it faster and yield bigger gains from using Spark. Which algorithm works best depends on your data, so do experiment.