Accessing small molecule data using ChEBI. Janna Hastings, Duncan Hull and Nico Adams. Programmatic Access to Biological Databases (Perl), 22-26 February 2010 @ EBI
Overview (ChEBI – Chemical Entities of Biological Interest, 25.02.10)
Introduction to ChEBI Block 1
Small Molecules within Bioinformatics: Literature, Nucleotide sequences, Genomes, Expressions, Protein sequences, Protein domains and families, 3D structures, Enzymes, Small molecules, Pathways, Systems
Small molecules participate in all the processes of life
Signaling: γ-aminobutyric acid
Metabolism: Adenosine 5'-triphosphate
Enzymes: clavulanic acid (ChEBI:48947) acts as a suicide inhibitor of bacterial β-lactamase enzymes
Pathways http://www.genome.jp/kegg-bin/highlight_pathway?scale=1.0&map=map00231&keyword=tryptophan
Systems biology: D-enantiomer: sweet; L-enantiomer: bitter
Drug design
Drug types 2003 - 2009 'Small molecules' in various shades of blue (http://chembl.blogspot.com/)
Getting the chemistry right: http://www.drugbank.ca/drugs/DB01041
Small molecule data sources: PubChem (http://pubchem.ncbi.nlm.nih.gov/): deposition-driven, publicly available compound repository containing more than 25 million unique structures; ChemSpider (http://www.chemspider.com/): automatic aggregation of publicly available chemistry data with crowdsourced annotation; ChEBI (http://www.ebi.ac.uk/chebi/): manually annotated database and ontology
Small molecule annotations
Chemicals - ChEBI (example entry: caffeine)
Visualisation: structure image
Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine
Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19
Ontology: metabolite; CNS stimulant; trimethylxanthines
Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528
Chemical Informatics: InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3; SMILES: CN1C(=O)N(C)c2ncn(C)c2C1=O
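The mass quoted for caffeine can be checked against its formula. The following is a minimal Python sketch, not part of the original slides: the atomic weights are rounded standard values supplied here as an assumption, and `parse_formula` and `average_mass` are hypothetical helper names.

```python
import re

# Rounded standard atomic weights (assumed values, not taken from the slides).
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C8H10N4O2'.

    Only element symbols with optional counts are handled
    (no brackets, charges or isotopes).
    """
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def average_mass(formula):
    """Sum the atomic weights of every atom in the formula."""
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in parse_formula(formula).items())


caffeine_mass = average_mass("C8H10N4O2")
print(f"{caffeine_mass:.2f}")  # 194.19
```

With these weights the sum is 96.088 + 10.080 + 56.028 + 31.998 = 194.194, which agrees with the 194.19 shown in the entry.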
What is ChEBI?
ChEBI home page
How is ChEBI maintained?
ChEBI entries contain
ChEBI entry view
Automatic Cross-references
Chemical Structures
Molfile format
Time for Exercises
Searching and browsing ChEBI Block 2
Simple text search: enter any text; wildcard: *
Advanced text search: narrow to category; AND, OR and BUT NOT
Structure search: structure drawing tools; search options
Search Results: click to go to entry page; hover over for search menu
Fingerprints
Fingerprints [2]: C8H9NO2 cannot be a substructure of an entity which does not have at least 8 carbon atoms, 9 hydrogen atoms…
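The element-count rule on this slide can be sketched as a cheap pre-screen that runs before any real substructure matching. This is an illustrative Python sketch only; `element_counts` and `may_contain` are hypothetical names, and the parser handles only plain formulae without brackets or charges.

```python
import re


def element_counts(formula):
    """Return {element: count} for a simple formula such as 'C8H9NO2'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def may_contain(candidate_formula, query_formula):
    """True if the candidate has at least as many atoms of each element as the query.

    A False result rules the candidate out; a True result only means
    the candidate still needs a full substructure comparison.
    """
    candidate = element_counts(candidate_formula)
    query = element_counts(query_formula)
    return all(candidate.get(el, 0) >= n for el, n in query.items())


print(may_contain("C8H10N4O2", "C8H9NO2"))  # True: cannot be excluded by counts alone
print(may_contain("H2O", "C8H9NO2"))        # False: too few atoms of every kind
```

The point of the pre-screen is speed: most database entries can be discarded by counting atoms, so the expensive graph comparison runs on far fewer candidates.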
Fingerprints [3]: water (HOH)
0-bond paths: H, O, H
1-bond paths: HO, OH
2-bond paths: HOH
Pattern  Hashed bitmap
H        0000010000
O        0010000000
HO       1010000000
OH       0000100010
HOH      0000000101
Result:  1010110111
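The Result on this slide is consistent with combining the per-pattern bitmaps by bitwise OR. A minimal Python sketch under that assumption follows; the bitmaps are copied from the slide, while the hash function that produced them is not given, so it is not reproduced here. `combine` is a hypothetical helper name.

```python
# Pattern -> hashed bitmap, copied from the water (HOH) example on the slide.
PATTERN_BITS = {
    "H": "0000010000",
    "O": "0010000000",
    "HO": "1010000000",
    "OH": "0000100010",
    "HOH": "0000000101",
}


def combine(bitmaps):
    """Bitwise-OR equal-length bit strings into a single fingerprint string."""
    width = len(bitmaps[0])
    value = 0
    for bits in bitmaps:
        value |= int(bits, 2)
    return format(value, f"0{width}b")


fingerprint = combine(list(PATTERN_BITS.values()))
print(fingerprint)  # 1010110111
```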
Types of structure search: example fingerprints 1010110111 (water, InChI=1/H2O/h1H2) and 0010110010. Tanimoto(a,b) = c / (a + b - c) = 4 / (4 + 7 - 4) = 0.57, where a and b are the numbers of bits set in the two fingerprints (here 4 and 7) and c is the number of bits set in both.
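The Tanimoto calculation on this slide can be reproduced directly from the two bit strings. This is a small Python sketch, with `tanimoto` as a hypothetical function name; it assumes both fingerprints have the same length and that at least one bit is set.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) for two equal-length bit strings."""
    a = fp_a.count("1")  # bits set in the first fingerprint
    b = fp_b.count("1")  # bits set in the second fingerprint
    c = sum(1 for x, y in zip(fp_a, fp_b) if x == "1" and y == "1")  # bits set in both
    return c / (a + b - c)


score = tanimoto("0010110010", "1010110111")
print(round(score, 2))  # 0.57
```

Identical fingerprints give 1.0 and fingerprints with no bits in common give 0.0, so a similarity search can rank hits by this score.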
Browse via Periodic Table: Molecular entities / Elements
Navigate via links in ontology: click to follow links
Time for Exercises
Understanding the ChEBI ontology Block 3
Annotation of bioinformatics data
The ChEBI ontology: (R)-adrenaline
Molecular structure ontology
Role ontology
ChEBI ontology relationships
Viewing ChEBI ontology
Viewing ChEBI ontology [2]: tree view
Browsing ChEBI ontology (OLS): browse the ontology with the Ontology Lookup Service (OLS): http://www.ebi.ac.uk/ontology-lookup/
Ontology Lookup Service
OBO Foundry: “The OBO Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain.”
Time for Exercises
Download and programmatic access Block 4
ChEBI domain model: self-referencing (merging)
Compound IDs and Merging: only the main accession of a merged group is displayed. Navigated accession: CHEBI:5585. Main accession: CHEBI:15377
Compound IDs and Merging [2]
Callouts on the slide: Additional acc; Parent ID; This compound ID = additional acc
ID     STATUS  CHEBI_ACCN   SOURCE  PARENT_ID  NAME   DEFINITION
15377  C       CHEBI:15377  ChEBI   null       water  null
5585   C       CHEBI:5585   KEGG    15377      null   null
ID     COMPOUND  ACCN_NUMBER  TYPE          STATUS  SOURCE  URL_ABBR
16213  5585      C00001       KEGG accn     C       KEGG    KEGG
17314  5585      7732-18-5    CAS Registry  C       KEGG    null
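Resolving a merged entry to its main accession amounts to following PARENT_ID until a row without a parent is reached. The Python sketch below uses only the two compound rows shown on the slide; the in-memory dictionary and the name `main_accession` are assumptions made for illustration, not the actual ChEBI schema access code.

```python
# Rows from the slide's first table: ID -> (CHEBI_ACCN, PARENT_ID).
COMPOUNDS = {
    15377: ("CHEBI:15377", None),
    5585: ("CHEBI:5585", 15377),
}


def main_accession(compound_id, compounds=COMPOUNDS):
    """Follow PARENT_ID links until a row with no parent is reached."""
    seen = set()
    while compounds[compound_id][1] is not None:
        if compound_id in seen:
            raise ValueError("cycle in PARENT_ID links")
        seen.add(compound_id)
        compound_id = compounds[compound_id][1]
    return compounds[compound_id][0]


print(main_accession(5585))   # CHEBI:15377
print(main_accession(15377))  # CHEBI:15377
```

This mirrors the behaviour described on the previous slide: navigating to CHEBI:5585 displays the main accession CHEBI:15377.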
Downloading ChEBI flavours
Downloading ChEBI
OBO File Format: general header information; synonym types used in terms; root terms; relationships to other terms
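The parts of an OBO file listed on this slide can be pulled apart with a few lines of code. The following Python sketch reads the header and the [Term] stanzas from an OBO-style text; the sample content is invented for illustration (the `EX:` identifiers and names are not real ChEBI terms), and `parse_obo` is a hypothetical helper, not a full OBO parser.

```python
# Invented sample in OBO 1.2 style; identifiers and names are placeholders.
SAMPLE_OBO = """format-version: 1.2
synonymtypedef: EXAMPLE_TYPE "Example synonym type"

[Term]
id: EX:0001
name: example parent

[Term]
id: EX:0002
name: example child
synonym: "sample synonym" RELATED EXAMPLE_TYPE []
is_a: EX:0001
relationship: has_role EX:0003

[Term]
id: EX:0003
name: example role
"""


def parse_obo(text):
    """Return (header, terms); each maps a tag to the list of its values."""
    header = {}
    terms = []
    target = header  # tag-value pairs before the first stanza form the header
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            if line == "[Term]":
                target = {}
                terms.append(target)
            else:
                target = None  # ignore stanzas other than [Term]
            continue
        if target is None:
            continue
        tag, _, value = line.partition(": ")
        target.setdefault(tag, []).append(value)
    return header, terms


header, terms = parse_obo(SAMPLE_OBO)
# Root terms are those without an is_a parent.
roots = [term["id"][0] for term in terms if "is_a" not in term]
print(header["format-version"])  # ['1.2']
print(roots)                     # ['EX:0001', 'EX:0003']
```

For real work an established OBO library is a better choice, since the format has escaping rules, trailing modifiers and other stanza types that this sketch ignores.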
SDF File Lite format: entries separated by $$$$
SDF File complete format: entries separated by $$$$
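Because entries are separated by `$$$$`, an SDF download can be split into records line by line. The Python sketch below shows one way to do it. The sample text is a placeholder (empty molfile blocks, and the second record is invented), and the data tag names `ChEBI ID` and `ChEBI Name` are assumptions about the file contents rather than something stated on the slides; `read_sdf` and `parse_record` are hypothetical helpers.

```python
# Placeholder SDF text: molfile blocks contain no atoms, second entry is invented.
SAMPLE_SDF = """\

  placeholder

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <ChEBI ID>
CHEBI:15377

> <ChEBI Name>
water

$$$$

  placeholder

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <ChEBI ID>
CHEBI:EXAMPLE

> <ChEBI Name>
hypothetical entry

$$$$
"""


def parse_record(lines):
    """Split one SDF entry into its molfile block and its data items."""
    end = lines.index("M  END")  # last line of the molfile block
    record = {"molfile": "\n".join(lines[: end + 1]), "data": {}}
    tag = None
    for line in lines[end + 1:]:
        if line.startswith("> <"):
            # Data header such as '> <ChEBI ID>': keep the text between < and >.
            tag = line[line.index("<") + 1: line.rindex(">")]
            record["data"][tag] = ""
        elif line.strip() and tag is not None:
            previous = record["data"][tag]
            record["data"][tag] = previous + "\n" + line if previous else line
        else:
            tag = None  # a blank line ends the current data item
    return record


def read_sdf(text):
    """Return one parsed record per $$$$-terminated entry."""
    records, lines = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(parse_record(lines))
            lines = []
        else:
            lines.append(line)
    return records


records = read_sdf(SAMPLE_SDF)
print(len(records))                    # 2
print(records[0]["data"]["ChEBI ID"])  # CHEBI:15377
```

The same loop works for both the lite and the complete file, since they differ in how many data items each entry carries rather than in how entries are delimited.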
Flat-file tab and comma delimited
Table dumps
Web services: user application
The ChEBI web service
Web service client object model: getLiteEntity; getCompleteEntity; getOntology (Parents and Children)
Methods and parameters (1)
Methods and parameters (2)
Methods and parameters (3)
Time for Exercises
For more information
Acknowledgements
Thank you

Accessing small molecule data using ChEBI

  • 1. Accessing small molecule data using ChEBI Janna Hastings, Duncan Hull and Nico Adams Programmatic Access to Biological Databases (Perl) 22-26 February 2010 @ EBI
  • 4. Small Molecules within Bioinformatics: Literature; Nucleotide sequences; Genomes; Expressions; Protein sequences; Protein domains, families; 3D structures; Enzymes; Small molecules; Pathways; Systems
  • 5. Small Molecules within Bioinformatics: the same resource categories as slide 4, with "Small molecules" repeated across them
  • 6. Small molecules participate in all the processes of life
  • 13. Drug types 2003 - 2009 'Small molecules' in various shades of blue (http://chembl.blogspot.com/)
  • 15. Small molecule data sources:
      - PubChem (http://pubchem.ncbi.nlm.nih.gov/): Deposition-driven publicly available compound repository, containing more than 25 million unique structures.
      - ChemSpider (http://www.chemspider.com/): Automatic aggregation of publicly available chemistry data with crowdsourced annotation.
      - ChEBI (http://www.ebi.ac.uk/chebi/): Manually annotated database and ontology.
  • 17. Chemicals - ChEBI (example entry: caffeine):
      - Visualisation
      - Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine
      - Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19
      - Ontology: metabolite; CNS stimulant; trimethylxanthines
      - Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528
      - Chemical Informatics: InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3; SMILES CN1C(=O)N(C)c2ncn(C)c2C1=O
  • 19. ChEBI home page ChEBI – Chemical Entities of Biological Interest 25.02.10
  • 22. ChEBI entry view
  • 23. Automatic Cross-references
  • 25. Molfile format
  • 27. Searching and browsing ChEBI Block 2
  • 29. Advanced text search: Narrow to category; AND, OR and BUT NOT
  • 30. Structure search: Search options; Structure drawing tools
  • 31. Search Results: Click to go to entry page; Hover-over for search menu
  • 36. Browse via Periodic Table: Molecular entities / Elements
  • 37. Navigate via links in ontology: Click to follow links
  • 39. Understanding the ChEBI ontology Block 3
  • 42. Molecular structure ontology
  • 43. Role ontology
  • 45. Viewing ChEBI ontology
  • 46. Viewing ChEBI ontology [2]: Tree view
  • 47. Browsing ChEBI ontology (OLS): Browse the ontology. Ontology Lookup Service (OLS): http://www.ebi.ac.uk/ontology-lookup/
  • 49. OBO Foundry: “The OBO Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain.”
  • 51. Download and programmatic access Block 4
  • 52. ChEBI domain model: Self-referencing - merging
  • 54. Compound IDs and Merging [2]. Slide labels: Additional acc; Parent ID; This compound ID = additional acc.

      Compounds table:
      ID     STATUS  CHEBI_ACCN   SOURCE  PARENT_ID  NAME   DEFINITION
      15377  C       CHEBI:15377  ChEBI   null       water  null
      5585   C       CHEBI:5585   KEGG    15377      null   null

      Accessions referencing compound 5585:
      ID     COMPOUND  ACCN_NUMBER  TYPE          STATUS  SOURCE  URL_ABBR
      16213  5585      C00001       KEGG accn     C       KEGG    KEGG
      17314  5585      7732-18-5    CAS Registry  C       KEGG    null
  • 59. SDF File complete format: Entries separated by $$$$
  • 64. Web service client object model: getLiteEntity; getCompleteEntity; getOntology (Parents and Children)
  • 65. Methods and parameters (1)
  • 66. Methods and parameters (2)
  • 67. Methods and parameters (3)
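
The merging scheme on slide 54 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory structures `COMPOUNDS` and `ACCESSIONS` and the helper names `resolve_parent` and `primary_accession` are hypothetical, and the rows are copied from the slide rather than read from a real ChEBI database dump.

```python
from typing import Optional

# Rows copied from slide 54. The dict/list layout is an assumption made
# for this sketch, not the actual ChEBI schema or client API.
COMPOUNDS = {
    15377: {"chebi_accn": "CHEBI:15377", "source": "ChEBI", "parent_id": None, "name": "water"},
    5585: {"chebi_accn": "CHEBI:5585", "source": "KEGG", "parent_id": 15377, "name": None},
}
ACCESSIONS = [
    {"id": 16213, "compound": 5585, "accn_number": "C00001", "type": "KEGG accn"},
    {"id": 17314, "compound": 5585, "accn_number": "7732-18-5", "type": "CAS Registry"},
]


def resolve_parent(compound_id: int) -> int:
    """Follow parent_id links until reaching the entry with no parent."""
    seen = set()
    while COMPOUNDS[compound_id]["parent_id"] is not None:
        if compound_id in seen:  # guard against malformed, cyclic data
            raise ValueError("cycle in parent_id links")
        seen.add(compound_id)
        compound_id = COMPOUNDS[compound_id]["parent_id"]
    return compound_id


def primary_accession(accn_number: str) -> Optional[str]:
    """Map an external accession to the primary ChEBI accession, if known."""
    for row in ACCESSIONS:
        if row["accn_number"] == accn_number:
            return COMPOUNDS[resolve_parent(row["compound"])]["chebi_accn"]
    return None


print(primary_accession("C00001"))  # CHEBI:15377
```

The KEGG accession C00001 is attached to compound 5585, whose parent ID points to 15377, so the lookup ends at the primary identifier CHEBI:15377 (water).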

Editor's notes

  1. Databases - ChEBI
  2. In the worst case, the time taken to perform a full substructure search increases exponentially with the number of atoms, so running the full search against every entry in the database is impractical.
  3. Molecular formula provides a crude heuristic for narrowing the number of search candidates in a substructure search. Fingerprints are a much more powerful device.
  4. An algorithm generates patterns for each atom, each bonded group of two atoms, then three, and so on, up to 8 bonds long. Each pattern is then hashed into a bit string, and the hashed results are all combined using logical OR to create the final fingerprint.
  5. Identity search is subject to the limitations of InChI uniqueness, but in general it will find exactly the structure you have entered, if it exists in the database. For substructure searching, the fingerprint is used to narrow the range of search candidates, based on the property that all bits set in the substructure fingerprint are also set in the fingerprint of any structure that contains it. For similarity searching, the Tanimoto coefficient is calculated from the fingerprints as T = c/(a + b - c), where a and b are the numbers of bits set in the two fingerprints and c is the number of bits set in both.
  6. When trying to retrieve the compound accession from a data item such as a database accession or compound name, the relevant entry in the Compounds table must also be retrieved and its parent_id field examined. If the parent_id is not empty, it links to the compound containing the primary identifier for this merged group of entities. A given compound can therefore have more than one ID.
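
Notes 4 and 5 can be illustrated with a small, toolkit-free Python sketch. It is not ChEBI's actual implementation: the molecule representation (a list of element symbols plus a list of bonded index pairs), the fingerprint width `FP_BITS`, the use of CRC32 as the hash, and all function names are assumptions chosen for illustration, and bond orders, rings and aromaticity are ignored.

```python
import zlib

FP_BITS = 1024  # assumed fingerprint width
MAX_BONDS = 8   # longest path, in bonds, as described in note 4


def path_patterns(atoms, bonds, max_bonds=MAX_BONDS):
    """Enumerate linear atom paths (as label strings) up to max_bonds long."""
    adjacent = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        adjacent[i].append(j)
        adjacent[j].append(i)
    found = set()

    def walk(path):
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        found.add(min(forward, backward))  # one canonical form per path
        if len(path) - 1 == max_bonds:
            return
        for nxt in adjacent[path[-1]]:
            if nxt not in path:  # simple paths only
                walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return found


def fingerprint(atoms, bonds):
    """Hash every pattern to a bit position and OR the bits together."""
    fp = 0
    for pattern in path_patterns(atoms, bonds):
        fp |= 1 << (zlib.crc32(pattern.encode()) % FP_BITS)
    return fp


def may_contain(structure_fp, query_fp):
    """Screen: every bit set in the query must be set in the structure."""
    return structure_fp & query_fp == query_fp


def tanimoto(fp1, fp2):
    """T = c / (a + b - c), with a, b the bit counts and c the shared bits."""
    a, b = bin(fp1).count("1"), bin(fp2).count("1")
    c = bin(fp1 & fp2).count("1")
    return c / (a + b - c) if (a + b - c) else 0.0


# Heavy-atom skeletons only: ethanol (C-C-O) and a C-O query fragment.
ethanol = fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
query = fingerprint(["C", "O"], [(0, 1)])
print(may_contain(ethanol, query))  # True
print(tanimoto(ethanol, ethanol))   # 1.0
```

A structure that passes the screen is only a candidate: hash collisions can set the same bits for different patterns, so a full substructure match is still needed to confirm it.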