Developing data services: a tale from two Oregon universities
NN/LM, Pacific Northwest Region
PNR Rendezvous | 18 June 2014
Melissa Haendel
OHSU Library
Amanda Whitmire
OSU Libraries
B.S. in Aquatic Biology, 2000
Worked in a bioluminescence laboratory
Ph.D. in Oceanography, emphasis in biological
oceanography, 2008
Dissertation study area: bio-optics; using optical tools
to study ocean ecology (N. California Current)
Post-doc in Oceanography, emphasis in biological
oceanography, 2008-2012
Study area: bio-optics; using optical tools to study
ocean ecology in low oxygen zones (N. Chile)
Assistant Professor, Data Management
Specialist, Sept. 2012 - present
About Amanda…
Not a
librarian.
B.A. in Chemistry, 1990
Modeled drug-receptor ligand binding
Ph.D. in Neuroscience, 1999,
Dissertation study area: Identification of novel genes
involved in neural development in the mouse
Post-doc, 2002-2004
Study area: Toxic effects of biocides in zebrafish and
salmon
Assistant Professor, Library, 2010 – present
Lead semantic research team
About Melissa…
Not a
librarian.
Post-doc, 2000-2002,
Study area: Role of thyroid hormone during neural
cell death in zebrafish
Post-doc, 2002-2004
Study area: Ontologies, data models, gene
nomenclature, biocuration
?
Do you have any data-related tasks or
responsibilities in your job description
or duties?
[Yes/No]
What role do you believe metadata
plays in the modern research cycle?
[big, small, none, other]
Questions
Why data management?
The researcher perspective
Why libraries?
Why bring in non-librarians?
Amanda & Melissa share their experiences
Wrap-up
image credit: http://www.flickr.com/photos/54803625@N08/8296296949/
“…the recorded factual material
commonly accepted in the
scientific community as
necessary to validate research
findings.”
Research data is:
U.S. Office of Management and Budget, Circular A-110
“Unlike other types of information, research
data are collected, observed, or created, for
the purposes of analysis to produce and
validate original research results.”
What is research data?
University of Edinburgh
MANTRA Research Data Management Training,
‘Research Data Explained’
Actions that contribute to effective
storage, use, preservation, and reuse
of data and documentation throughout
the research lifecycle.
Data management:
Why data management?
Images collected by DataONE.org
Photo courtesy of www.carboafrica.net
Data is collected from sensors, sensor
networks, remote sensing, observations,
and more - this calls for increased attention
to data management and stewardship
Data deluge
Photo courtesy of http://modis.gsfc.nasa.gov/
Photo courtesy of http://www.futurlec.com
CC image by tajai on Flickr
CC image by CIMMYT on Flickr
Image collected by Viv Hutchinson
Slide credit: http://www.dataone.org/education-modules
Federal movement toward open data
1985: National Research Council
1999: OMB Circular A-110 revisions
2003: NIH Data Sharing Policy
2008: NIH Public Access Policy
2011: NSF DMP requirement
2012: NEH, Office of Digital Humanities DMP requirement
2013: NSF biosketch change
2013: OSTP memo on public access to results of federally funded data
More funder mandates are coming
22 Feb. 2013
The memorandum states that, “digitally formatted scientific data resulting from
unclassified research supported wholly or in part by Federal funding should be stored
and publicly accessible to search, retrieve, and analyze.” To this end, federal agencies
must create a public access plan that includes the following mandates:
• Maximize public access to data while protecting personal privacy and
confidentiality, intellectual property, and balancing costs with long-term benefits;
• Ensure that investigators create data management plans that describe strategies for
long-term preservation of and access to data;
• Allow the costs of data management to be included in proposal budgets;
• Ensure that the merits of data management plans are properly evaluated;
• Implement mechanisms to ensure that investigators comply with their data
management plans and policies;
• Promote deposition of data into publicly accessible repositories;
• Encourage private and public cooperation to improve data access and
interoperability;
• Develop and standardize approaches to data citation/attribution;
• Support training in data management best practices;
• Assess needs and strategies for the long-term preservation of data.
Journal data policies
Information propagation tales:
The researcher’s perspective
Data isn’t always what it seems
Assertion:
“β amyloid, known for its role in
injuring brain in Alzheimer’s
disease, is also produced by and
injures skeletal muscle fibres in the
muscle disease sporadic inclusion
body myositis.”
Greenberg 2009
BMJ 2009;339:b2680 doi:10.1136/bmj.b2680
All 242 papers point to 4 from same lab, and
very few to the ones with negative results
Greenberg, 2009
How do we believe what we think we
know?
• Is it true or do we just believe it because
everyone else does?
• How do we transcend “follow the leader”? What
tools can we build to help us?
How reproducible is science?
Let’s start simple.
Do we know what the ingredients were?
Journal guidelines for methods are often poor and
space is limited
“All companies from which materials were obtained should
be listed.” - A well-known journal
Reproducibility depends, at a minimum, on
using the same resources. But…
How identifiable are resources in the
published literature?
An experiment in reproducibility
Gather journal articles
5 domains:
Immunology
Cell biology
Neuroscience
Developmental biology
General biology
3 impact factors:
High
Medium
Low
84 Journals
248 papers
707 antibodies
104 cell lines
258 constructs
210 knockdown reagents
437 model organisms
Only ~50% of resources were identifiable
Vasilevsky et al, 2013, PeerJ
There is no correlation between impact factor and
resource identification
[Figure: fraction of resources identified (0.0 to 1.0) plotted against journal impact factor (0 to 40), with separate series for antibodies, cell lines, constructs, knockdown reagents, and organisms]
Maybe labs are just disorganized?
Meet the Urban Lab
A+ organization!
The Urban lab antibodies
Of 9 antibodies published in 5 articles, only
44% were identifiable
[Figure: percent identifiable (0% to 100%) for four criteria: commercial Ab identifiable, catalog number reported, source organism reported, target uniquely identifiable]
Resource information is not adequately
getting into the literature, EVEN
THOUGH IT IS READILY AVAILABLE
The problem is a lack of standards,
review, and tools
LIBRARIES CAN HELP!!!!!!
http://www.force11.org/Resource_Identification_Initiative
Numerous endorsers https://www.force11.org/RII/SignUp
Implementation of the new standard http://biosharing.org/bsg-000532
Sample citation:
Polyclonal rabbit anti-MAPK3 antibody, Abgent, Cat# AP7251E, RRID:AB_2140114
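As a rough illustration of how a citation like the sample could be assembled and checked, here is a minimal Python sketch. The function names and the regular expression are assumptions made for this example, not part of the Resource Identification Initiative's tooling; the pattern simply assumes an RRID looks like "RRID:" followed by a registry prefix, an underscore, and an identifier, as in the sample.

```python
import re

# Assumed shape of an RRID: "RRID:" + registry prefix + "_" + identifier.
# This pattern is a guess based on the sample citation, not an official spec.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9_-]+)")


def format_antibody_citation(description, vendor, catalog_number, rrid):
    """Build a citation string in the same shape as the sample citation."""
    return f"{description}, {vendor}, Cat# {catalog_number}, RRID:{rrid}"


def find_rrids(text):
    """Return every RRID-style identifier found in a block of text."""
    return RRID_PATTERN.findall(text)


citation = format_antibody_citation(
    "Polyclonal rabbit anti-MAPK3 antibody", "Abgent", "AP7251E", "AB_2140114"
)
print(citation)
print(find_rrids(citation))
```

A check like `find_rrids` could be run over a Methods section to flag reagents that were reported without an identifier.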
Publishing Workflow
1. Researcher submits a manuscript for publication
2. Editor or Publisher OR LIBRARIAN! asks for inclusion of RRID
3. Author goes to Resource Identification Portal to locate RRID
4. RRID is included in Methods section and as Keyword
http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble
$1.3 million grant from the Laura and John
Arnold Foundation to validate 50 landmark
cancer biology studies
Partnership between Science Exchange, PLOS, figshare,
Mendeley, and some of us scientists
Librarians can help researchers
understand:
• How to be critical of data and where it came from
• Data provenance and meeting data standards
• That there is a need to reinterpret data when new
information comes to light
• That reproducibility depends on many things, including
very basic things
• Why both retrospective and prospective efforts are
needed to ensure data quality, consistency, and utility
Amanda’s dissertation
The spectral backscattering properties of marine particles
Observations
ship-based sampling &
moored instruments
Simulation
results
scattering &
absorption of light
Experimental
optical properties of
phytoplankton cultures
Derived
variables
endless things
Compiled
observations
global oceanic bio-
optical observations
[self + from peers]
Reference
global oceanic bio-
optical observations
[NASA]
Why libraries?
OSU Libraries Digital Collections | http://oregondigital.org/u?/archives,31
image: http://www.beautiful-libraries.com/7200-1.html
Agricultural
Sciences
Engineering
Education
Business
Liberal Arts
Public Health &
Human Sciences
Veterinary
Medicine
Science
Pharmacy
Forestry
Earth, Ocean &
Atmospheric Sci.
Libraries
http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf
“Only a small minority of academic
libraries in the United States and Canada
currently offer research data services
(RDS), but a quarter to a third of all
academic libraries are planning to offer
some services within the next two years.”
“Few academic libraries are responsible for
developing research data policies. Being
able to serve as a clearinghouse of ideas
and to provide expertise to build these
policies is an opportunity for libraries to be
members of the knowledge creation
process.”
“Reassigning existing library staff is the
most common tactic for offering RDS.”
Our experiences
http://clubads.com/photos/custom/fish-OutOfWAter.jpg
Timeline of data services at OSU
late 2011: UL & library admin. recognize need for role of RDS on campus that requires a dedicated FTE
Sept. 2012: Data Management Specialist starts
Jan. 2013: Strategic Agenda in place*
Oct. 2013: Data survey launches
Jan. 2014: GRAD 521 launches
*Sutton, Shan; Barber, David; Whitmire, Amanda L. (2013): Oregon
State University Libraries and Press Strategic Agenda for Research
Data Services. Oregon State University Libraries.
http://hdl.handle.net/1957/38794.
ESI
OSU Data stewardship survey
Interview by Sarah Abraham from The Noun Project
Responses to the question, “Please indicate whether or not you generate each of
the following data format(s) as a part of your research process. Select Yes or No for
each.” Color scale indicates what percentage of respondents in each college or unit
selected ‘Yes’ for each data type. The number in each tile shows the number of
faculty responses for that data type and college/unit.
Scope of Data Services at OSU
Research
Analysis of data management plans as a means to inform and empower
academic librarians in providing research data support. National Leadership
Grant LG-07-13-0328, Oct 2014 – Sept 2015
The DART Project: Data management plans As a Research Tool
Consultations
Teaching: GRAD 521
Logistical Details
• http://bit.ly/GRAD521
• All course materials on figshare
• 2 credits
• Discipline-agnostic
• Offered annually, winter quarter
Topics covered
• Overview of RDM
• Types, formats & stages of data
• RDM planning
• Storage, backup & security
• Documentation & metadata
• Legal & ethical considerations
• Sharing & reuse
• Archive and preservation
Timeline of data activities at OHSU
late 2009: OHSU library awarded eagle-i
Sept. 2012: Monarch Initiative awarded
April 2013: Beyond the PDF 1K challenge award
Oct. 2013: Data survey launches
Now: OHSU hiring CRIO position
ESI
NIH BD2K program
OHSU Data stewardship survey
Interview by Sarah Abraham from The Noun Project
How do you reference your data when you publish,
either in the context of a journal publication, or by
direct publication of data sets?
[Figure: bar chart of the percentage of respondents (0% to 60%) selecting each option: Specific Uniform Resource Identifier (URI) or other URL where data is held; Contact information of the data steward; Reference to a public repository where the data is held; Provide supplementary data to the journal; SPARQL endpoint and/or Linked Open Data; Digital Object Identifier (DOI); I don't know; Other (please specify)]
Are there any professional community standards in your
research area regarding data management, sharing, storage,
archiving, and/or producing metadata or other descriptive
information that would apply to your research data?
Respondent rank                                                              Yes   No   I don't know
Instructor                                                                     1    1    1
Assistant Professor, Research Assistant Professor, or Assistant Scientist      9    8   19
Associate Professor or Associate Scientist                                     5    9   13
Professor or Senior Scientist                                                 16   15   14
Director, Division Head, Department Head                                       6    1    4
PostDoc/ResAssoc/PhD                                                          13   10   19
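To make the pattern in these counts easier to see, the following Python sketch computes the share of respondents at each rank who answered "I don't know". The counts are transcribed from the survey table; the variable and function names are illustrative only and are not part of the original survey analysis.

```python
# Counts transcribed from the survey table: (Yes, No, I don't know).
responses = {
    "Instructor": (1, 1, 1),
    "Assistant Professor / Research Assistant Professor / Assistant Scientist": (9, 8, 19),
    "Associate Professor / Associate Scientist": (5, 9, 13),
    "Professor / Senior Scientist": (16, 15, 14),
    "Director / Division Head / Department Head": (6, 1, 4),
    "PostDoc / ResAssoc / PhD": (13, 10, 19),
}


def share_dont_know(counts):
    """Fraction of responses in one group that were "I don't know"."""
    yes, no, dont_know = counts
    return dont_know / (yes + no + dont_know)


# Per-rank share of "I don't know" answers.
for rank, counts in responses.items():
    print(f"{rank}: {share_dont_know(counts):.0%}")

# Column totals across all ranks: (Yes, No, I don't know).
overall = tuple(sum(column) for column in zip(*responses.values()))
print(f"All respondents: {share_dont_know(overall):.0%}")
```

Summed across ranks, the columns come to 50 Yes, 44 No, and 70 I don't know, so "I don't know" is the most common answer overall.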
Scope of Data Services at OHSU
Open houses,
Lib Guides, NIH proposals to
improve data education,
hosting fellows
New IR,
research
profiling tools
Participation in
national efforts:
BD2K, Force11, Galaxy,
Biocuration Society
Data consults,
collaborations
Consultations
NIH Big Data to Knowledge Initiative
http://bd2k.nih.gov/
1 | Can facilitate the creation of a smarter body
of literature for future research
2 | Train researchers to utilize metadata
standards to enable data reuse
3 | Facilitate researchers' understanding of
available resources
Libraries, in summary…
Members from:
Oregon Health & Science University
Oregon State University
University of Oregon
University of Idaho
University of Washington
Portland State University
Reed College
Join us @ bit.ly/pnwdatalibs
Also we need a logo:
Free data science training for good suggestions!
PNW Research Data Geeks
Group
http://commons.wikimedia.org/wiki/File:DARPA_Big_Data.jpg
How do you think libraries
can best facilitate best
practices in data
management?

Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 

Developing data services: a tale from two Oregon universities

  • 1. Developing data services A tale from two Oregon universities NN/LM, Pacific Northwest Region PNR Rendezvous | 18 June 2014 Melissa Haendel OHSU Library Amanda Whitmire OSU Libraries
  • 2. B.S. in Aquatic Biology, 2000 Worked in a bioluminescence laboratory Ph.D. in Oceanography, emphasis in biological oceanography, 2008 Dissertation study area: bio-optics; using optical tools to study ocean ecology (N. California Current) Post-doc in Oceanography, emphasis in biological oceanography, 2008-2012 Study area: bio-optics; using optical tools to study ocean ecology in low oxygen zones (N. Chile) Assistant Professor, Data Management Specialist, Sept. 2012 - present About Amanda… Not a librarian.
  • 3. B.A. in Chemistry, 1990 Modeled drug-receptor ligand binding Ph.D. in Neuroscience, 1999, Dissertation study area: Identification of novel genes involved in neural development in the mouse Post-doc, 2002-2004 Study area: Toxic effects of biocides in zebrafish and salmon Assistant Professor, Library, 2010 – present Lead semantic research team About Melissa… Not a librarian. Post-doc, 2000-2002, Study area: Role of thyroid hormone during neural cell death in zebrafish Post-doc, 2002-2004 Study area: Ontologies, data models, gene nomenclature, biocuration ?
  • 4. Do you have any data-related tasks or responsibilities in your job description or duties? [Yes/No] What role do you believe metadata plays in the modern research cycle? [big, small, none, other] Questions
  • 5. Why data management? The researcher perspective Why libraries? Why bring in non-librarians? Amanda & Melissa share their experiences Wrap-up image credit: http://www.flickr.com/photos/54803625@N08/8296296949/
  • 6. “…the recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” Research data is: U.S. Office of Management and Budget, Circular A-110 6
  • 7. “Unlike other types of information, research data are collected, observed, or created, for the purposes of analysis to produce and validate original research results.” What is research data? University of Edinburgh MANTRA Research Data Management Training, ‘Research Data Explained’ 7
  • 8. Actions that contribute to effective storage, use, preservation, and reuse of data and documentation throughout the research lifecycle. Data management:
  • 10. Images collected by DataONE.org
  • 11. Data deluge: Data is collected from sensors, sensor networks, remote sensing, observations, and more - this calls for increased attention to data management and stewardship. Photos courtesy of www.carboafrica.net, http://modis.gsfc.nasa.gov/, and http://www.futurlec.com; CC images by tajai and CIMMYT on Flickr; image collected by Viv Hutchinson. Slide credit: http://www.dataone.org/education-modules
  • 12. Federal movement toward open data: 1985: National Research Council; 1999: OMB Circular A-110 revisions; 2003: NIH Data Sharing Policy; 2008: NIH Public Access Policy; 2011: NSF DMP requirement; 2012: NEH, Office of Digital Humanities DMP requirement; 2013: NSF biosketch change; 2013: OSTP memo on public access to results of federally funded data
  • 13. More funder mandates are coming 22 Feb. 2013
  • 14. The memorandum states that, “digitally formatted scientific data resulting from unclassified research supported wholly or in part by Federal funding should be stored and publicly accessible to search, retrieve, and analyze.” To this end, federal agencies must create a public access plan that includes the following mandates: • Maximize public access to data while protecting personal privacy and confidentiality, intellectual property, and balancing costs with long-term benefits; • Ensure that investigators create data management plans that describe strategies for long-term preservation of and access to data; • Costs of data management are included in proposal budgets; • Ensure that the merits of data management plans are properly evaluated; • Implement mechanisms to ensure that investigators comply with their data management plans and policies; • Promote deposition of data into publicly accessible repositories; • Encourage private and public cooperation to improve data access and interoperability; • Develop and standardize approaches to data citation/attribution; • Support training in data management best practices; • Assess needs and strategies for the long-term preservation of data.
  • 16. Information propagation tales: The researcher’s perspective
  • 17. Data isn’t always what it seems
  • 18. Assertion: “β amyloid, known for its role in injuring brain in Alzheimer’s disease, is also produced by and injures skeletal muscle fibres in the muscle disease sporadic inclusion body myositis.” Greenberg 2009
  • 19. BMJ 2009;339:b2680 doi:10.1136/bmj.b2680 All 242 papers point to 4 from same lab, and very few to the ones with negative results Greenberg, 2009
  • 20. How do we believe what we think we know?  Is it true or do we just believe it because everyone else does?  How do we transcend “follow the leader”? What tools can we build to help us?
  • 21. How reproducible is science? Let’s start simple. Do we know what the ingredients were?
  • 22. Journal guidelines for methods are often poor and space is limited “All companies from which materials were obtained should be listed.” - A well-known journal Reproducibility is dependent at a minimum, on using the same resources. But…
  • 23. How identifiable are resources in the published literature? An experiment in reproducibility Gather journal articles 5 domains: Immunology Cell biology Neuroscience Developmental biology General biology 3 impact factors: High Medium Low 84 Journals 248 papers 707 antibodies 104 cell lines 258 constructs 210 knockdown reagents 437 model organisms
  • 24. Only ~50% of resources were identifiable Vasilevsky et al, 2013, PeerJ
  • 25. There is no correlation between impact factor and resource identification [Figure: fraction of resources identified (0.0 to 1.0) plotted against journal impact factor (0 to 40) for antibodies, cell lines, constructs, knockdown reagents, and organisms]
  • 26. Maybe labs are just disorganized?
  • 27. Meet the Urban Lab
  • 28. A+ organization! The Urban lab antibodies
  • 29. Of 9 antibodies published in 5 articles, only 44% were identifiable [Bar chart: percent identifiable (0% to 100%) for four criteria: commercial Ab identifiable; catalog number reported; source organism reported; target uniquely identifiable]
  • 30. Resource information is not adequately getting into the literature, EVEN THOUGH IT IS READILY AVAILABLE The problem is a lack of standards, review, and tools LIBRARIES CAN HELP!!!!!!
  • 32. Publishing Workflow: 1. Researcher submits a manuscript for publication. 2. Editor or Publisher OR LIBRARIAN! asks for inclusion of RRID. 3. Author goes to Research Identification Portal to locate RRID. 4. RRID is included in Methods section and as Keyword. Sample citation: Polyclonal rabbit anti-MAPK3 antibody, Abgent, Cat# AP7251E, RRID:AB_2140114
  • 34. $1.3 million grant from the Laura and John Arnold Foundation to validate 50 landmark cancer biology studies Partnership between Science Exchange, PLoS, FigShare, Mendeley, and some of us scientists
  • 35. Librarians can help researchers understand:  How to be critical of data and where it came from  Data provenance and meeting data standards  That there is a need to reinterpret data when new information comes to light  That reproducibility depends on many things, including very basic things  Why both retrospective and prospective efforts are needed to ensure data quality, consistency, and utility
  • 36. Amanda’s dissertation: The spectral backscattering properties of marine particles. Observations: ship-based sampling & moored instruments. Simulation results: scattering & absorption of light. Experimental: optical properties of phytoplankton cultures. Derived variables: endless things. Compiled observations: global oceanic bio-optical observations [self + from peers]. Reference: global oceanic bio-optical observations [NASA]
  • 37. Why libraries? OSU Libraries Digital Collections | http://oregondigital.org/u?/archives,31
  • 39. Agricultural Sciences Engineering Education Business Liberal Arts Public Health & Human Sciences Veterinary Medicine Science Pharmacy Forestry Earth, Ocean & Atmospheric Sci. Libraries
  • 41. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf “Only a small minority of academic libraries in the United States and Canada currently offer research data services (RDS), but a quarter to a third of all academic libraries are planning to offer some services within the next two years.” “Few academic libraries are responsible for developing research data policies. Being able to serve as a clearinghouse of ideas and to provide expertise to build these policies is an opportunity for libraries to be members of the knowledge creation process.” “Reassigning existing library staff is the most common tactic for offering RDS.”
  • 43. Timeline of data services at OSU: late 2011: UL & library admin. recognize need for role of RDS on campus that requires a dedicated FTE; Sept. 2012: Data Management Specialist starts; ESI; Jan. 2013: Strategic Agenda in place*; Oct. 2013: Data survey launches; Jan. 2014: GRAD 521 launches. *Sutton, Shan; Barber, David; Whitmire, Amanda L. (2013): Oregon State University Libraries and Press Strategic Agenda for Research Data Services. Oregon State University Libraries. http://hdl.handle.net/1957/38794.
  • 44. OSU Data stewardship survey Interview by Sarah Abraham from The Noun Project
  • 45. Responses to the question, “Please indicate whether or not you generate each of the following data format(s) as a part of your research process. Select Yes or No for each.” Color scale indicates what percentage of respondents in each college or unit selected ‘Yes’ for each data type. The number in each tile shows the number of faculty responses for that data type and college/unit.
  • 46. Scope of Data Services at OSU
  • 47. Research Analysis of data management plans as a means to inform and empower academic librarians in providing research data support. National Leadership Grant LG-07-13-0328, Oct 2014 – Sept 2015 Data management plans As a Research Tool The DART Project
  • 49. Teaching: GRAD 521 Logistical Details • http://bit.ly/GRAD521 • All course materials on figshare • 2 credits • Discipline-agnostic • Offered annually, winter quarter Topics covered • Overview of RDM • Types, formats & stages of data • RDM planning • Storage, backup & security • Documentation & metadata • Legal & ethical considerations • Sharing & reuse • Archive and preservation
  • 50. Timeline of data activities at OHSU OHSU library awarded eagle-i late 2009 Sept. 2012 Monarch Initiative awarded Oct. 2013 Data survey launches Beyond the PDF 1K challenge award April 2013 OHSU hiring CRIO position Now ESI NIH BD2K program
  • 51. OHSU Data stewardship survey Interview by Sarah Abraham from The Noun Project
  • 52. How do you reference your data when you publish, either in the context of a journal publication, or by direct publication of data sets? [Bar chart, 0% to 60% of respondents. Response options: Specific Uniform Resource Identifier (URI) or other URL where data is held; Contact information of the data steward; Reference to a public repository where the data is held; Provide supplementary data to the journal; SPARQL endpoint and/or Linked Open Data; Digital Object Identifier (DOI); I don't know; Other (please specify)]
  • 53. Are there any professional community standards in your research area regarding data management, sharing, storage, archiving, and/or producing metadata or other descriptive information that would apply to your research data?
       Answer | Instructor | Assistant Professor, Research Assistant Professor, or Assistant Scientist | Associate Professor or Associate Scientist | Professor or Senior Scientist | Director, Division Head, Department Head | PostDoc/ResAssoc/PhD
       Yes | 1 | 9 | 5 | 16 | 6 | 13
       No | 1 | 8 | 9 | 15 | 1 | 10
       I don't know | 1 | 19 | 13 | 14 | 4 | 19
  • 54. Scope of Data Services at OHSU Open houses, Lib Guides, NIH proposals to improve data education, hosting fellows New IR, research profiling tools Participation in national efforts: BD2K, Force11, Galaxy, Biocuration Society Data consults, collaborations
  • 56. NIH Big Data to Knowledge Initiative http://bd2k.nih.gov/
  • 57. Libraries, in summary… 1 | Can facilitate the creation of a smarter body of literature for future research 2 | Train researchers to utilize metadata standards to enable data reuse 3 | Facilitate researchers’ understanding of available resources
  • 58. Members from: Oregon Health & Science University Oregon State University University of Oregon University of Idaho University of Washington Portland State University Reed College Join us @ bit.ly/pnwdatalibs Also we need a logo: Free data science training for good suggestions! PNW Research Data Geeks Group http://commons.wikimedia.org/wiki/File:DARPA_Big_Data.jpg
  • 59. How do you think libraries can best facilitate best practices in data management?
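
The sample citation on slide 32 ("Polyclonal rabbit anti-MAPK3 antibody, Abgent, Cat# AP7251E, RRID:AB_2140114") suggests one small check an editor or librarian could automate: does a methods sentence carry an RRID at all? The following is a minimal sketch, not a tool from the talk. The regular expression and the function names `find_rrids` and `has_rrid` are illustrative assumptions, and the pattern is a guess based on the single example on the slide rather than an official RRID syntax.

```python
import re

# Hypothetical pattern inferred from the one example on slide 32:
# "RRID:" followed by a letter prefix, an underscore, and an alphanumeric ID.
# This is an assumption for illustration, not the official RRID grammar.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")


def find_rrids(text):
    """Return every RRID-like identifier found in text, in order of appearance."""
    return RRID_PATTERN.findall(text)


def has_rrid(text):
    """Return True if the text contains at least one RRID-like identifier."""
    return bool(find_rrids(text))


# The sample citation from slide 32.
sample = ("Polyclonal rabbit anti-MAPK3 antibody, Abgent, "
          "Cat# AP7251E, RRID:AB_2140114")

print(find_rrids(sample))
print(has_rrid("Rabbit anti-MAPK3 antibody, Abgent"))
```

A check like this only flags whether an identifier is present; confirming that the identifier resolves to the right resource would still require looking it up in the portal named in the workflow.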

Editor's notes

  1. National Network of Libraries of Medicine, Pacific Northwest Region PNR Rendezvous Here is the link to the recording of the presentation: https://webmeeting.nih.gov/p8swadmbzpo/ and to our PNR Rendezvous webpage where the recording is posted: http://nnlm.gov/pnr/training/RMLrendezvous.html Talk abstract: “While the generation or collection of large, complex research datasets is becoming easier and less expensive all the time, researchers often lack the knowledge and skills that are necessary to properly manage them. Having these skills is paramount in ensuring data quality, integrity, discoverability, integration, reproducibility, and reuse over time. Librarians have been preserving, managing and disseminating information for thousands of years. As scholarly research is increasingly carried out digitally, and products of research have expanded from primarily text-based manuscripts to include datasets, metadata, maps, software code etc., it is a natural expansion of scope for libraries to be involved in the stewardship of these materials as well. This kind of evolution requires that libraries bring in faculty with new skills and collaborate more intimately with researchers during the research data lifecycle, and this is exactly what is happening in academic libraries across the country. In this webinar, two researchers-turned-data-specialists, both based in academic libraries, will share their experiences and perspectives on the development of research data services at their respective institutions. Each will share their perspective on the important role that libraries can play in helping researchers manage, preserve, and share their data.”
  2. Adobe Connect instant polling to poll attendees (N=37). Responses: 45% - yes, have data-related tasks or duties; 90 % - metadata plays a big role in the modern research cycle
  3. Does not include, “any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues. This "recorded" material excludes physical objects (e.g., laboratory samples).” This narrow definition mostly takes a retrospective view of your dataset, in that it does not account for raw and intermediate data that may be critical to the research process but that don’t become part of the ‘final’ dataset. Data could be: Observational, Experimental, Simulated, Derived
  4. Does not include, “any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues. This "recorded" material excludes physical objects (e.g., laboratory samples).” This narrow definition mostly takes a retrospective view of your dataset, in that it does not account for raw and intermediate data that may be critical to the research process but that don’t become part of the ‘final’ dataset. Data could be: Observational, Experimental, Simulated, Derived
  5. Data management is a verb – it involves intentional effort and activity. The main goals of DM are preservation and reuse, for you and for others. Covers all aspects of the data lifecycle from planning digital data capture methods, whittling down, ingestion to databases, providing for access and reuse, to transformation.
  6. image: Microsoft clipart
  7. Let’s look at one important area of scientific inquiry: climate change. What scale of data integration is necessary to study global trends over geologic timescales? Slide credit: DataONE Education Module 1. http://www.dataone.org/education-modules
  8. Data are being generated in massive quantities daily. Improvements in technology enable higher precision and coverage in data acquisition, and higher-capacity systems store and migrate more data, increasing the importance of managing, integrating, and re-using data. In order to integrate these diverse datasets to answer questions of global significance, the data have to be well organized, well documented and described, preserved and accessible. It all depends on effective management of the data. Slide credit: DataONE Education Module 1. http://www.dataone.org/education-modules
  9. Slide from: Heather Coates, Data Management Lab: Session 1 Slides http://www.slideshare.net/goldenphizzwizards/data-mgmtlab-spr14mod1slides201403245
  10. 22 February 2013: The Office of Science and Technology Policy in the White House released a memorandum about expanding public access to the results of federally funded research. In addition to scholarly publications, federal agencies are making serious efforts to increase the sharing of research data. All federal agencies with more than $100M in R&D expenditures are subject to this memo. http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
  11. This is going to place huge additional demands on faculty who submit and review proposals – they overwhelmingly have NO IDEA what constitutes a good DMP.
  12. “PLOS is now releasing a revised Data Policy that will come into effect on March 1, 2014, in which authors will be required to include a data availability statement in all research articles published by PLOS journals … {policy language: PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. When submitting a manuscript online, authors must provide a Data Availability Statement describing compliance with PLOS’s policy. The data availability statement will be published with the article if accepted.}” Since the policy was updated in March 2014: “…more than 16,000 sets of authors have included information about data availability with their submission. We have had fewer than 10 enquiries per week to data@plos.org from authors who need advice about ‘edge cases’ of data handling and availability – fewer than 1% of authors.” http://blogs.plos.org/biologue/2014/05/30/plos-data-policy-update/
  13. Citations on statement that accumulation of β amyloid “precedes” other abnormalities in inclusion body myositis muscle. Statement as fact is supported through citation to papers that only state it as hypothesis
  14. Four most authoritative papers were from same lab, two had potentially the same data, and all lacked quantitative data as to how many affected muscle fibres were seen and a specificity of reagents for distinguishing β amyloid protein from β amyloid precursor protein.
  15. MH - notes
  16. We are working on determining how to deal with this in the longer term: is this a new data citation that goes alongside the paper? It needs to be in the keywords so it is mineable.
  17. “When an official at America’s National Institutes of Health (NIH) reckons, despairingly, that researchers would find it hard to reproduce at least three-quarters of all published biomedical findings, the public part of the process seems to have failed.”
  18. Give background about the reproducibility initiative. Talk about example of replication: scientific reproducibility experiment with Leishmania and it being a different strain, different amidation, etc.
  19. Most research projects use and create multiple data types & formats, and produce many, many files. My own dissertation work included the generation or use of all of the data types shown here (which might help to explain why it took me 7 years to earn a Ph.D.). http://hdl.handle.net/1957/9088 This data was collected over the course of 5 years, at locations all over the Pacific and Atlantic Oceans. I never received ANY formal training in how to organize and manage all of this data. Where is all of this data now? On an external hard drive sitting in my desk. Image credit: Document by Piotrek Chuchla from The Noun Project
  20. Librarians have been preserving, managing and disseminating information for thousands of years, going all the way back to Alexandria. As scholarly research is increasingly carried out digitally, and products of research have expanded from primarily text-based manuscripts to include datasets, metadata, maps, software code etc., it is a natural expansion of scope for libraries to be involved in the stewardship of these materials, too. This kind of evolution requires that libraries bring in faculty with new skills, and that’s exactly what’s happening in academic libraries across the country.
  21. Data management is something that faculty all over campus have become aware of. As a neutral entity, the library is well positioned to address campus-wide needs, like data management. It makes sense, under the economy of scale, for a centralized unit to address a campus-wide need. We recognize that individual colleges and departments have computer support personnel and resources, and we aim to complement those resources (not duplicate them). (Switzerland metaphor swiped from the incomparable Jackie Wirz at OHSU)
  22. We aren’t here to replace the external resources that already exist to support you – we are here to act as a conduit to these resources. Our goal is to help you effectively discover, navigate and utilize these resources where appropriate, in the same way that the library has been providing this kind of support for decades.
  23. SO, what’s going on with academic libraries and data services? This ACRL white paper (2012) provides some context.
  24. I spent my first year here getting my feet under me:
      • Participating in the DuraSpace/ARL/DLF E-Science Institute, which involved doing an environmental scan and engaging faculty and administrators in interviews. Strengthened an existing relationship with campus Information Services (IS). This experience resulted in the creation of our Strategic Agenda for Research Data Services, which really laid out my priority tasks and areas of emphasis. http://hdl.handle.net/1957/38794
      • Submitting an IMLS National Leadership Grant with 4 co-PIs
      • Developing a collaboration with the Graduate School to create a credit-bearing course for graduate students in research data management (http://bit.ly/GRAD521)
      • Trying (with limited success) to advertise the existence of library-based data services for faculty & grad students
      • Creating a data services web site (via LibGuides, http://bit.ly/OSUData)
      • Curating the limited number of datasets in our IR; updating metadata practices
      The first ¼ of 2014: All GRAD 521, all the time. And, some grant stuff.
  25. Response rate was 23%, 451 completed surveys across all colleges and ranks surveyed. The goal was to get a feel for how much and what types of data are being produced on campus, what faculty are doing with it, and figure out where they need more support.
  26. Example question and responses to the OSU faculty data stewardship survey (figure created in R). What do faculty find more difficult: metadata creation, version control, finding and accessing data created by others, long-term storage, and sharing their own data. What am I going to do with the survey results? I’m working on a report, which I will share with faculty and OSU administration. Am hoping that it leads to a campus-wide conversation about data stewardship.
  27. The OSUL&P Research Data Services model.
      • Data planning & consultation: DMPs/planning; storage & backup; file organization & naming; documentation & metadata; legal/ethical considerations; sharing & reuse; archiving & preservation
      • Data access & preservation infrastructure: data curation in our IR. We offer DOIs for datasets via membership in EZID (CDL). Recommend using ORCID iDs but haven’t had much traction on this yet. NIH mandate will change this.
      • Data management training: 90-minute workshops, mostly grad students, some faculty; 2-credit course launched in January 2014, GRAD 521 (http://bit.ly/GRAD521); presentations at faculty/staff mtgs; invited lectures in classes
      • Open data consortia & collaborations: CUAHSI – implemented, in partnership with OSU faculty in CEOAS and Institute for Natural Resources; DataONE & DataFOUR are under consideration or development
  28. Periodic surveys can be used to identify service needs on campus, but depend on useful response rates. We suggest that regular reviews of DMPs can also be a legitimate source of information regarding what researchers are up to, and where they may need support. This project aims to provide a tool for librarians to facilitate consistent, quality reviews of DMPs. Project in a nutshell: Develop a rubric for consistent evaluation of NSF DMPs Multi-university study of DMPs Identify common gaps in knowledge, skills and practice Target data support services to ameliorate gaps Website with more info. is under development. Contact Amanda or DMPResearch@oregonstate.edu with questions.
  29. Graduate students (like to meet in person) or faculty (most prefer email) Generally project or task-specific Examples: coming up with a file-naming convention and data organization strategy for a project reviewing a data management plan for a grant proposal how to share data in support of a submitted manuscript
  30. Midterm assignment: a scaled-back Data Curation Profile Final assignment: a data management plan First cohort: 11 students, including 3 faculty members; degree ranges from non-thesis MS to PhD; many disciplines Whitmire, Amanda (2014): GRAD 521 Research Data Management Syllabus and Lesson Plans. figshare. http://dx.doi.org/10.6084/m9.figshare.1003834 Whitmire, Amanda (2014): GRAD 521 Research Data Management Course Assignments. figshare. http://dx.doi.org/10.6084/m9.figshare.1003852 Whitmire, Amanda (2014): GRAD 521 Research Data Management Lectures. figshare. http://dx.doi.org/10.6084/m9.figshare.1003835