Dr. Paul A. Thiessen, NCBI




                             2013/03/21 draft
What is a “Knowledge Space”?
 May be a database
 But may be a concept not encapsulated in a database
   [Diagram: knowledge spaces shown are Literature (PubMed), Patents, Chemicals (PubChem), Drugs, Genes, Diseases, Assays (PubChem), and Targets (sequences)]
Connecting the Spaces
   Database cross-links
   [Diagram: cross-links among Literature (PubMed), Chemicals (PubChem), Assays (PubChem), and Targets (sequences); link labels: MeSH, Depositor, Active, Inactive]
Moving Within a Space
   Neighbors… some examples

   [Diagram: Chemicals (PubChem) neighbors: similar by 2D or 3D, same parent, same connectivity. Assays (PubChem) neighbors: similar sets of screened chemicals, similar target (BLAST)]
Drug Repurposing as a Spatial Transformation
    One possible route…


   [Diagram: a route through Search, Drugs, Diseases (known), Targets, Similarity, and Diseases (hypothesized)]
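
One reading of the route above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the order of steps (drug, then its targets, then similar targets, then diseases tied to those targets) is an interpretation of the diagram, and every drug, target, and disease name below is made up.

```python
# All tables below are hypothetical toy data, not PubChem content.
drug_known_diseases = {"drug_A": {"disease_X"}}
drug_targets = {"drug_A": {"target_1"}}
similar_targets = {"target_1": {"target_2"}}   # e.g. neighbors by sequence similarity
target_diseases = {
    "target_1": {"disease_X"},
    "target_2": {"disease_X", "disease_Y"},
}

def hypothesize_diseases(drug):
    """Walk drug -> targets -> similar targets -> diseases, minus known diseases."""
    candidates = set()
    for target in drug_targets.get(drug, set()):
        for neighbor in similar_targets.get(target, set()):
            candidates |= target_diseases.get(neighbor, set())
    # Keep only diseases not already associated with the drug.
    return candidates - drug_known_diseases.get(drug, set())

hypothesized = hypothesize_diseases("drug_A")
print(sorted(hypothesized))
```

Each dictionary lookup stands in for one move between knowledge spaces; the result is a list of hypotheses to examine, not a prediction.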
What is in PubChem
   117M Substances (SIDs)
     Information from depositors, including links
     to PubMed, sequences, structures, patents,
     etc.
   47M Compounds (CIDs)
     Derived from Substances (including links)
     Computed properties
   650k Assays (AIDs)
     ~200M test results on SIDs
     Links to target sequences
Some PubChem Statistics
   All CIDs                                      46,814,409
   Unique parents by connectivity                36,806,372
   Rule of 5                                     34,343,056
   Rule of 5 but MW 250-800                      31,483,865
   Active in any BioAssay                        824,028
   Tested in any BioAssay                        1,872,313
   Experimental 3D (mainly PDB)                  41,406
   Computed 3D (multiple confs + neighbors)      42,252,570
   Pharmacological Actions                       11,531
   Biosystems                                    9,703
   Chemical vendors                              28,852,943
   NIH Molecular Libraries                       402,076
   Patent sources                                14,512,499
   Patent links                                  5,978,538

                                           … as of 2013/03/20
What is in NCBI Entrez
   Many other databases…
       PubMed
       Protein/Nucleotide sequences
       Genes
       Biosystems (metabolic pathways)
       PDB structures (with VAST neighbors)
 Text and numeric search fields
 Cross-links
     Between databases
     Within databases (neighbors)
How Entrez Works
 Search results = list of identifiers
 Boolean operations on lists (query refinement)
 Links from one database to another


   [Diagram: two PubChem searches each return a CID list; a link to PubMed yields a PMID list]
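
The model on this slide (identifier lists, Boolean operations, links) can be mimicked offline with Python sets. This is a minimal sketch rather than how Entrez is implemented; the CIDs, PMIDs, and the link table are hypothetical toy values.

```python
# Hypothetical results of two PubChem searches (lists of CIDs).
search_a = {2244, 3672, 5090}
search_b = {3672, 5090, 60823}

# Hypothetical CID -> PMID cross-links.
cid_to_pmid = {
    2244: {100, 101},
    3672: {101, 102},
    5090: {103},
    60823: set(),
}

def follow_links(ids, link_table):
    """Map identifiers in one database to the linked identifiers in another."""
    linked = set()
    for identifier in ids:
        linked |= link_table.get(identifier, set())
    return linked

refined = search_a & search_b   # Boolean AND: query refinement
either = search_a | search_b    # Boolean OR
only_a = search_a - search_b    # Boolean NOT
pmids = follow_links(refined, cid_to_pmid)
print(sorted(refined), sorted(pmids))
```

Because every search result is just a list of identifiers, refinement and linking compose freely: any list can be combined with another or pushed across a link.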
Limitations of Entrez
   Only text or numeric search
     Search fields hard to discover
     Search fields and defaults vary by database
     Chemical structure search, and other specialized algorithms, must be done outside Entrez
   The kicker: links are incomplete
     Capped at only 500-10,000 IDs!
     Limit also varies by database
Working Around the Limitations
   Scripting
     E-Utils, PUG SOAP/REST, etc.
     Break queries into smaller chunks
   Specialized services
     PubChem’s ID Exchange
     Classification trees (with associated IDs)
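
As a rough illustration of "break queries into smaller chunks", the sketch below splits a long CID list into batches and builds one E-utilities ELink request URL per batch. It only builds the URLs and sends nothing. The chunk size of 200 is an arbitrary assumption, not a documented limit, and the parameter values (`dbfrom=pccompound`, `db=pubmed`) should be checked against the current E-utilities documentation before use.

```python
from urllib.parse import urlencode

# ELink endpoint of the NCBI E-utilities.
EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

def chunked(ids, size):
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def elink_urls(cids, chunk_size=200):
    """Build one CID -> PubMed link request per chunk (chunk_size is an assumption)."""
    urls = []
    for batch in chunked(list(cids), chunk_size):
        params = {
            "dbfrom": "pccompound",
            "db": "pubmed",
            "id": ",".join(str(cid) for cid in batch),
        }
        urls.append(f"{EUTILS_ELINK}?{urlencode(params)}")
    return urls

urls = elink_urls(range(1, 451))
print(len(urls))
```

Results from the individual requests would then be merged on the client side, which sidesteps the per-request link cap at the cost of more round trips.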
What is not in Entrez
   … as a database per se, but which may be imported and linked to PubChem

   Drugs
     (sort of but not really)
   Targets
     (again sort of)
 Diseases
 Patents
Some Public Sources of Information Relevant to Drugs and Repurposing
   United States (FDA, NLM, NCBI, …)
       ClinicalTrials.gov
       NDF(-RT)
       RxNorm
       HSDB
       MeSH
       DailyMed
       PubMed, PubMed Health
       USPTO
   Europe
     ChEBI / ChEMBL
     EPO / WIPO
   Canada
     DrugBank
   Japan
     KEGG
                          … not an exhaustive list
                          … some are linked to PubChem
                          … some are works in progress
MeSH and ChEBI
   Chemical structure classification
   Biological role
   Pharmacological action
KEGG and DrugBank
   Drug classification
   Targets
Patents
   PubChem depositors
     Per SID:
      ○ Patent IDs
      ○ PubMed IDs


   Classifications
     ECLA
     IPC
     USPC
     CPC
Aside: Patent Summaries
NDF-RT
 Molecular interactions
 Drug ingredients
 Diseases (with drugs)
 Physiological effects


   Has links to MeSH
     … which leads to CIDs
NDF-RT linked to SID, CID
Classifications as Navigation Tools
   Where are the CIDs in the tree?


 • Example: chemicals affecting serotonin transporters according to KEGG
Classifications for Query Refinement
   Where are MY CIDs in the tree?


 • Example: what diseases are linked by NDF to KEGG’s serotonin transport drugs?
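
The question "where are MY CIDs in the tree?" amounts to intersecting a user's CID list with the CIDs attached at or below each node. The sketch below shows that idea on a tiny made-up tree; the node names, CIDs, and functions are hypothetical and say nothing about how PubChem's classification service is built.

```python
# Hypothetical classification tree: node -> child nodes.
tree = {"root": ["A", "B"], "A": ["A1", "A2"], "B": [], "A1": [], "A2": []}

# Hypothetical CIDs attached directly to each node.
node_cids = {"A1": {1, 2}, "A2": {2, 3}, "B": {4}}

def cids_under(node):
    """Collect the CIDs attached to a node and to all of its descendants."""
    found = set(node_cids.get(node, set()))
    for child in tree.get(node, []):
        found |= cids_under(child)
    return found

def locate(my_cids):
    """Count how many of the caller's CIDs fall at or below each node."""
    counts = {}
    for node in tree:
        hits = cids_under(node) & my_cids
        if hits:
            counts[node] = len(hits)
    return counts

counts = locate({2, 4, 9})
print(counts)
```

Nodes with no hits are dropped, so the output is the pruned tree a user would browse; CID 9 appears nowhere because it is not classified in this toy tree.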
Big Classifications… Some Engineering Required
WIPO IPC

• 72,000 tree nodes
• 6,000,000 CIDs
• 124,000,000 node-CID links

Filtering on the fly:
• 22,000 CIDs from PDB
     … interactive!
More Space to Explore
   [Diagram: Genes, Literature (PubMed), Assays (PubChem), Chemicals (PubChem), Targets (sequences), Patents, Drugs, Diseases, … and beyond]
Conclusions
   PubChem is…
     A very generalized system
     Based on open data
     Part of the larger Entrez collection
   We strive to…
     Make analysis across multiple knowledge spaces accessible and powerful
     Enable hypothesis generation for drug repurposing (as one scenario among many)

   Feedback is always welcome!
     info@ncbi.nlm.nih.gov
Acknowledgements

 Evan Bolton
 Steve Bryant
 Asta Gindulyte (classification front end)


   Chris Southan



                               … Thank You!

Generative AI in Health Care a scoping review and a persoanl experience.Generative AI in Health Care a scoping review and a persoanl experience.
Generative AI in Health Care a scoping review and a persoanl experience.
 
concept of total quality management (TQM).
concept of total quality management (TQM).concept of total quality management (TQM).
concept of total quality management (TQM).
 
Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...
Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...
Moving Forward After Uterine Cancer Treatment: Surveillance Strategies, Testi...
 
Basic structure of hair and hair growth cycle.pptx
Basic structure of hair and hair growth cycle.pptxBasic structure of hair and hair growth cycle.pptx
Basic structure of hair and hair growth cycle.pptx
 
Female Reproductive Physiology Before Pregnancy
Female Reproductive Physiology Before PregnancyFemale Reproductive Physiology Before Pregnancy
Female Reproductive Physiology Before Pregnancy
 
General_Studies_Presentation_Health_and_Wellbeing
General_Studies_Presentation_Health_and_WellbeingGeneral_Studies_Presentation_Health_and_Wellbeing
General_Studies_Presentation_Health_and_Wellbeing
 
Clinical Research Informatics Year-in-Review 2024
Clinical Research Informatics Year-in-Review 2024Clinical Research Informatics Year-in-Review 2024
Clinical Research Informatics Year-in-Review 2024
 
SGK LEUKEMIA KINH DÒNG BẠCH CÂU HẠT HAY.pdf
SGK LEUKEMIA KINH DÒNG BẠCH CÂU HẠT HAY.pdfSGK LEUKEMIA KINH DÒNG BẠCH CÂU HẠT HAY.pdf
SGK LEUKEMIA KINH DÒNG BẠCH CÂU HẠT HAY.pdf
 
QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...
QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...
QUESTIONS & ANSWERS FOR QUALITY ASSURANCE, RADIATIONBIOLOGY& RADIATION HAZARD...
 
Using Data Visualization in Public Health Communications
Using Data Visualization in Public Health CommunicationsUsing Data Visualization in Public Health Communications
Using Data Visualization in Public Health Communications
 
Breast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptx
Breast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptxBreast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptx
Breast cancer -ONCO IN MEDICAL AND SURGICAL NURSING.pptx
 
World-TB-Day-2023_Presentation_English.pptx
World-TB-Day-2023_Presentation_English.pptxWorld-TB-Day-2023_Presentation_English.pptx
World-TB-Day-2023_Presentation_English.pptx
 
Cone beam CT: concepts and applications.pptx
Cone beam CT: concepts and applications.pptxCone beam CT: concepts and applications.pptx
Cone beam CT: concepts and applications.pptx
 
Bulimia nervosa ( Eating Disorders) Mental Health Nursing.
Bulimia nervosa ( Eating Disorders) Mental Health Nursing.Bulimia nervosa ( Eating Disorders) Mental Health Nursing.
Bulimia nervosa ( Eating Disorders) Mental Health Nursing.
 
blood bank management system project report
blood bank management system project reportblood bank management system project report
blood bank management system project report
 

Exploring Chemical and Biological Knowledge Spaces with PubChem

  • 1. Dr. Paul A. Thiessen, NCBI 2013/03/21 draft
  • 2. What is a “Knowledge Space”?
      - May be a database
      - But may be a concept not encapsulated in a database
      [Diagram: Genes, Diseases, Literature (PubMed), Chemicals (PubChem), Assays (PubChem), Targets (sequences), Patents, Drugs]
  • 3. Connecting the Spaces
      - Database cross-links
      [Diagram: Literature (PubMed), Chemicals (PubChem), Assays (PubChem), Targets (sequences); link labels: MeSH, Depositor, Active, Inactive]
  • 4. Moving Within a Space
      - Neighbors… some examples
      [Diagram: Chemicals (PubChem) with neighbor labels “Similar by 2D or 3D”, “Same parent”, “Same connectivity”; Assays (PubChem) with neighbor labels “Similar sets of screened chemicals”, “Similar target (BLAST)”]
  • 5. Drug Repurposing as a Spatial Transformation
      - One possible route…
      [Diagram: Search, Drugs, Diseases (known), Targets, Similarity, Diseases (hypothesized)]
  • 6. What is in PubChem
      - 117M Substances (SIDs)
          ○ Information from depositors, including links to PubMed, sequences, structures, patents, etc.
      - 47M Compounds (CIDs)
          ○ Derived from Substances (including links)
          ○ Computed properties
      - 650k Assays (AIDs)
          ○ ~200M test results on SIDs
          ○ Links to target sequences
  • 7. Some PubChem Statistics (as of 2013/03/20)
      All CIDs                                    46,814,409
      Unique parents by connectivity              36,806,372
      Rule of 5                                   34,343,056
      Rule of 5 but MW 250-800                    31,483,865
      Active in any BioAssay                         824,028
      Tested in any BioAssay                       1,872,313
      Experimental 3D (mainly PDB)                    41,406
      Computed 3D (multiple confs + neighbors)    42,252,570
      Pharmacological Actions                         11,531
      Biosystems                                       9,703
      Chemical vendors                            28,852,943
      NIH Molecular Libraries                        402,076
      Patent sources                              14,512,499
      Patent links                                 5,978,538
  • 8. What is in NCBI Entrez
      - Many other databases…
          ○ PubMed
          ○ Protein/Nucleotide sequences
          ○ Genes
          ○ Biosystems (metabolic pathways)
          ○ PDB structures (with VAST neighbors)
      - Text and numeric search fields
      - Cross-links
          ○ Between databases
          ○ Within databases (neighbors)
  • 9. How Entrez Works
      - Search results = list of identifiers
      - Boolean operations on lists (query refinement)
      - Links from one database to another
      [Diagram labels: Search, PubChem CID List (shown twice); Link to PubMed; PMID List]
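The model on slide 9 (searches return identifier lists, lists combine with Boolean operations, and links carry a list into another database) can be illustrated with a small in-memory sketch. This is not the Entrez implementation and makes no network calls; the function names, the CIDs, the PMIDs, and the link table are all made-up placeholders chosen only to show the idea.

```python
# Toy model of the Entrez workflow: a search result is a list of IDs,
# lists are refined with Boolean operations, and a link table maps IDs
# from one database to IDs in another.
# All identifiers below are hypothetical placeholders, not real CIDs or PMIDs.

def boolean_and(a, b):
    """IDs present in both result lists (query refinement with AND)."""
    return sorted(set(a) & set(b))

def boolean_or(a, b):
    """IDs present in either result list."""
    return sorted(set(a) | set(b))

def boolean_not(a, b):
    """IDs in the first result list that are absent from the second."""
    return sorted(set(a) - set(b))

def follow_links(ids, link_table):
    """Map source IDs to the union of their linked target IDs."""
    linked = set()
    for source_id in ids:
        linked.update(link_table.get(source_id, ()))
    return sorted(linked)

# Two hypothetical CID search results.
search_a = [101, 102, 103, 104]
search_b = [103, 104, 105]

# Hypothetical CID -> PMID cross-links.
cid_to_pmid = {103: [9001, 9002], 104: [9002], 105: [9003]}

refined = boolean_and(search_a, search_b)    # CIDs found by both searches
pmids = follow_links(refined, cid_to_pmid)   # literature linked to those CIDs
print(refined, pmids)
```

Treating every result as a plain set of identifiers is what makes refinement and cross-database linking composable: each step takes a list and returns a list.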
  • 10. Limitations of Entrez
      - Only text or numeric search
      - Search fields hard to discover
      - Search fields and defaults vary by database
      - Chemical structure search, and other specialized algorithms, must be done outside Entrez
      - The kicker: links are incomplete
          ○ Only 500-10,000 ids!
          ○ Limit also varies by database
  • 11. Working Around the Limitations
      - Scripting
          ○ E-Utils, PUG SOAP/REST, etc.
          ○ Break queries into smaller chunks
      - Specialized services
          ○ PubChem’s ID Exchange
          ○ Classification trees (with associated IDs)
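The "break queries into smaller chunks" workaround might look like the following sketch, which splits a long ID list into batches and builds one E-utilities ELink request URL per batch. It only constructs URLs and sends nothing. The batch size of 200 and the helper names `chunked` and `elink_urls` are assumptions for illustration, not values or names taken from the slides; `dbfrom`, `db`, and `id` are standard ELink parameters.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def chunked(ids, size):
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def elink_urls(ids, dbfrom, db, chunk_size=200):
    """Build one ELink URL per batch of IDs (chunk_size is an assumed value)."""
    urls = []
    for batch in chunked(ids, chunk_size):
        params = {
            "dbfrom": dbfrom,
            "db": db,
            "id": ",".join(str(i) for i in batch),
        }
        urls.append(f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}")
    return urls

# Hypothetical list of 500 CIDs, linked from PubChem Compound to PubMed.
cids = list(range(1, 501))
urls = elink_urls(cids, dbfrom="pccompound", db="pubmed")
print(len(urls), "requests")
```

The per-batch results would then be merged on the client side, for example by taking the union of the returned ID lists.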
  • 12. What is not in Entrez
      … as a database per se, but which may be imported and linked to PubChem
      - Drugs (sort of but not really)
      - Targets (again sort of)
      - Diseases
      - Patents
  • 13. Some Public Sources of Information Relevant to Drugs and Repurposing
      - United States (FDA, NLM, NCBI, …)
          ○ ClinicalTrials.gov
          ○ NDF(-RT)
          ○ RxNorm
          ○ HSDB
          ○ MeSH
          ○ DailyMed
          ○ PubMed, PubMed Health
          ○ USPTO
      - Europe
          ○ ChEBI / ChEMBL
          ○ EPO / WIPO
      - Canada
          ○ DrugBank
      - Japan
          ○ KEGG
      … not an exhaustive list … some are linked to PubChem … some are works in progress
  • 14. MeSH and ChEBI
      - Chemical structure classification
      - Biological role
      - Pharmacological action
  • 15. KEGG and DrugBank
      - Drug classification
      - Targets
  • 16. Patents
      - PubChem depositors
      - Per SID:
          ○ Patent IDs
          ○ PubMed IDs
      - Classifications
          ○ ECLA
          ○ IPC
          ○ USPC
          ○ CPC
  • 18. NDF-RT
      - Molecular interactions
      - Drug ingredients
      - Diseases (with drugs)
      - Physiological effects
      - Has links to MeSH
          ○ … which leads to CIDs
  • 19. NDF-RT linked to SID, CID
  • 20. Classifications as Navigation Tools
      - Where are the CIDs in the tree?
          ○ Example: chemicals affecting serotonin transporters according to KEGG
  • 21. Classifications for Query Refinement
      - Where are MY CIDs in the tree?
          ○ Example: what diseases are linked by NDF to KEGG’s serotonin transport drugs?
  • 22. Big Classifications… Some Engineering Required
      - WIPO IPC
          ○ 72,000 tree nodes
          ○ 6,000,000 CIDs
          ○ 124,000,000 node-CID links
      - Filtering on the fly:
          ○ 22,000 CIDs from PDB
      … interactive!
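The question "where are MY CIDs in the tree?" amounts to intersecting a user's ID list with the IDs attached to each classification node and rolling the hits up to parent nodes. The sketch below shows that idea on a tiny made-up tree. The node names, CIDs, and data layout are hypothetical, and nothing here reflects how PubChem stores or filters its classification data at the scale quoted on slide 22.

```python
# Hypothetical classification tree: node -> child nodes.
children = {"root": ["A", "B"], "A": ["A1", "A2"], "B": [], "A1": [], "A2": []}

# Hypothetical node -> directly attached CIDs (node-CID links).
node_cids = {"A1": {1, 2, 3}, "A2": {3, 4}, "B": {5, 6}}

def filtered_counts(root, children, node_cids, my_cids):
    """Count, for every node, how many of my_cids fall in its subtree.

    Nodes whose subtree contains none of my_cids are omitted, which is
    what lets a large tree collapse to the branches relevant to a query.
    """
    counts = {}

    def visit(node):
        # Hits attached directly to this node ...
        hits = set(node_cids.get(node, ())) & my_cids
        # ... plus hits rolled up from all descendants (a set avoids double counting).
        for child in children.get(node, ()):
            hits |= visit(child)
        if hits:
            counts[node] = len(hits)
        return hits

    visit(root)
    return counts

# A user's CID list; 99 is not attached to any node in the tree.
my_cids = {2, 3, 6, 99}
counts = filtered_counts("root", children, node_cids, my_cids)
print(counts)
```

Rolling up sets rather than summing child counts matters because a single CID can be attached to several nodes, as CID 3 is here.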
  • 23. More Space to Explore
      [Diagram: Genes, Literature (PubMed), Assays (PubChem), Chemicals (PubChem), Targets (sequences), Patents, Drugs, Diseases]
      … and beyond
  • 24. Conclusions
      - PubChem is…
          ○ A very generalized system
          ○ Based on open data
          ○ Part of the larger Entrez collection
      - We strive to…
          ○ Make analysis across multiple knowledge spaces accessible and powerful
          ○ Enable hypothesis generation for drug repurposing (as one scenario among many)
      - Feedback is always welcome!
          ○ info@ncbi.nlm.nih.gov
  • 25. Acknowledgements
      - Evan Bolton
      - Steve Bryant
      - Asta Gindulyte (classification front end)
      - Chris Southan
      … Thank You!