Jonathan Eisen talk for #SCS2012 at #ISMB "Networks in genomics and bioinformatics: from phylogeny to Twitter"
1. Networks in genomics and bioinformatics: from
phylogeny to Twitter
ISCB2012
July 12, 2012
Jonathan A. Eisen
University of California, Davis
@phylogenomics
Friday, July 13, 12
3. A meandering path and lessons “learned”
11. Phylogenomics of Novelty
Origin of New
Functions and
Processes
•New genes
•Changes in old genes
•Changes in pathways
13. Phylogenomics of Novelty
Origin of New Functions and Processes
•New genes
•Changes in old genes
•Changes in pathways
Genome Dynamics
14. Phylogenomics of Novelty
Origin of New Functions and Processes
•New genes
•Changes in old genes
•Changes in pathways
Genome Dynamics
•Evolvability
•Repair and recombination processes
•Intragenomic variation
16. Phylogenomics of Novelty
Origin of New Functions and Processes
•New genes
•Changes in old genes
•Changes in pathways
Genome Dynamics
•Evolvability
•Repair and recombination processes
•Intragenomic variation
Species Evolution
17. Phylogenomics of Novelty
Origin of New Functions and Processes
•New genes
•Changes in old genes
•Changes in pathways
Genome Dynamics
•Evolvability
•Repair and recombination processes
•Intragenomic variation
Species Evolution
•Phylogenetic history
•Vertical vs. horizontal descent
•Needed to track gain/loss of processes, infer convergence
18. Undergrad Lesson 1:
Be prepared for random events
• Gould’s class b/c planned on not majoring
in Biology
• RMBL via backpacking trip
• Geology library job w/ Nabokov collection
b/c went to wrong building
• Discovering Colleen Cavanaugh’s lab via
street encounter
21. Grad school lesson I:
find right people to work with
• Went to work on butterfly population biology
and phylogeny
• Advisor and I did not see eye to eye
• Despite great subject for me (combined
phylogeny, molecular evolution, RMBL, etc),
chose not to join lab
• Did many rotations …
• Picked final lab in part b/c advisor was right
match
22. Grad school lesson II:
never too late to change
• Wanted to combine DNA repair studies and
molecular evolution
• I: Thymineless death
• II: Adaptive mutation
• III: Repair in archaea
24. Grad school lesson II:
never too late to change
• Wanted to combine DNA repair studies and
molecular evolution
• I: Thymineless death
• II: Adaptive mutation
• III: Repair in archaea
• IV: Bioinformatics and genome analysis …
25. Grad school lesson III:
Get others to do your work
• Interested in RecA structure function
relationships
• Using phylogeny to look for correlated
substitutions in RecA structure, like
done with rRNA
• But not enough sequences …
27. Shotgun Sequencing Allows Use of Alternative
Anchors (e.g., RecA)
Venter et al., 2004
28. Grad school lesson IV:
Stealing is good
• Phylogenetic perspective in
bioinformatics missing
29. “Nothing in biology makes sense
except in the light of evolution.”
Theodosius Dobzhansky (1973)
30. Evolutionary Perspective and
Comparative Biology
• Comparative biology is the analysis of
differences and similarities between
species.
• An evolutionary perspective is useful in
such studies because this allows one to
focus not just on the levels and degrees of
similarity or difference but on how and why
similarities and differences came to be.
31. Phylogenomics
• Lots of sequences being produced with no
functions associated with them
• Much debate in community about how to
predict functions
32. Predicting Function
• Identification of motifs
• Homology/similarity based methods
• Highest hit
• Top hits
• Clusters of orthologous groups
• HMM models
• Structural threading and modeling
• Evolutionary reconstructions
37. Phylogenetic Prediction of
Function
• Many powerful and automated similarity based
methods for assigning genes to protein families
• COGs
• PFAM HMM searches
• Some limitations of similarity based methods can be
overcome by phylogenetic approaches
• Automated methods now available
• Sean Eddy
• Steven Brenner
• Kimmen Sjölander
• But …
40. Career Lesson I:
Build on what you know
• Phylogenetic approaches to genomics
• Genomics of endosymbionts
• Genomic studies of communities
• Analysis of DNA repair genes in genome
sequences
• Phylogenomics of halophilic archaea
• GEBA
• Phylogenetic metagenomics
• ...
44. DNA Repair Genes in D. radiodurans
Process                      Genes in D. radiodurans
Nucleotide Excision Repair   UvrABCD, UvrA2
Base Excision Repair         AlkA, Ung, Ung2, GT, MutM, MutY-Nths, MPG
AP Endonuclease              Xth
Mismatch Excision Repair     MutS, MutL
Recombination
  Initiation                 RecFJNRQ, SbcCD, RecD
  Recombinase                RecA
  Migration and resolution   RuvABC, RecG
Replication                  PolA, PolC, PolX, phage Pol
Ligation                     DnlJ
dNTP pools, cleanup          MutTs, RRase
Other                        LexA, RadA, HepA, UVDE, MutS2
45. Problem ...
• List of DNA repair gene homologs in
D. radiodurans genome is not
significantly different from other
bacterial genomes of similar size
46. Repair Studies in Different Species
(via Medline searches as of 1998)
Humans 7028
E. coli 3926
S. cerevisiae 988
Drosophila 387
B. subtilis 284
S. pombe 116
Xenopus 56
C. elegans 25
A. thaliana 20
Methanogens 16
Haloferax 5
Giardia 0
47. ~40 Phyla of Bacteria
[Figure: phylogenetic tree of bacterial phyla; scale bar 0.1. Tree based on Hugenholtz (2002) with some modifications.]
Phyla shown: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11
48. Most DNA metabolism studies in two Phyla
[Figure: phylogenetic tree of bacterial phyla; scale bar 0.1. Tree based on Hugenholtz (2002) with some modifications.]
49. Deinococcus is very distant from well studied groups
[Figure: phylogenetic tree of bacterial phyla; scale bar 0.1. Tree based on Hugenholtz (2002) with some modifications.]
53. As of 2002
• At least 40 phyla of bacteria
[Figure: phylogenetic tree of bacterial phyla. Based on Hugenholtz, 2002]
54. As of 2002
• At least 40 phyla of bacteria
• Most genomes from three phyla
[Figure: phylogenetic tree of bacterial phyla. Based on Hugenholtz, 2002]
55. As of 2002
• At least 40 phyla of bacteria
• Most genomes from three phyla
• Some studies in other phyla
[Figure: phylogenetic tree of bacterial phyla. Based on Hugenholtz, 2002]
56. As of 2002
• At least 40 phyla of bacteria
• Most genomes from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Eukaryotes
[Figure: phylogenetic tree of bacterial phyla. Based on Hugenholtz, 2002]
57. As of 2002
• At least 40 phyla of bacteria
• Most genomes from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Viruses
[Figure: phylogenetic tree of bacterial phyla. Based on Hugenholtz, 2002]
59. GEBA
http://www.jgi.doe.gov/programs/GEBA/pilot.html
60. rRNA Tree of Life
Bacteria
Archaea
Eukaryotes
Figure from Barton, Eisen et al. “Evolution”,
CSHL Press. 2007.
Based on tree from Pace 1997 Science
276:734-740
62. PD: Genomes + GEBA
From Wu et al. 2009 Nature 462, 1056-1060
63. PD: Isolates
From Wu et al. 2009 Nature 462, 1056-1060
64. rRNA Tree of Life
Bacteria
Archaea
??????
Eukaryotes
Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.
Based on tree from Pace 1997 Science 276:734-740
Wu et al. (2011) PLoS ONE 6(3): e18011. doi:10.1371/journal.pone.0018011
65. ????
[Figure; recoverable labels: Phage, Phage, ????, Thaumarchaeot]
66. GEBA uncultured
Number of SAGs from Candidate Phyla
Site                                   OD1   OP11   OP3   SAR406
Site A: Hydrothermal vent              4     1      -     -
Site B: Gold Mine                      6     13     2     -
Site C: Tropical gyres (Mesopelagic)   -     -      -     2
Site D: Tropical gyres (Photic zone)   1     -      -     -
Sample collections at 4 additional sites are underway.
Phil Hugenholtz
68. Non homology functional
• Many genes have homologs in other
species but no homologs have ever been
studied experimentally
• Non-homology methods can make
functional predictions for these
69. Phylogenetic profiling basis
• Microbial genes are lost rapidly when not
maintained by selection
• Genes can be acquired by lateral transfer
• Frequently gain and loss occurs for entire
pathways/processes
• Thus might be able to use correlated
presence/absence information to identify
genes with similar functions
70. Non-Homology Predictions:
Phylogenetic Profiling
• Step 1: Search all genes in
organisms of interest against
all other genomes
• Ask: Yes or No, is each gene
found in each other species
• Cluster genes by distribution
patterns (profiles)
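The profiling steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the talk: the gene and genome names are hypothetical, the search step is assumed to have already produced a yes/no hit set for each gene, and "clustering" is reduced to grouping genes with identical profiles (a real analysis would more likely cluster on a distance between profiles).

```python
from collections import defaultdict

def build_profiles(hits, genomes):
    """Turn each gene's hit set into a tuple of 0/1 flags, one per genome."""
    return {
        gene: tuple(1 if genome in found else 0 for genome in genomes)
        for gene, found in hits.items()
    }

def cluster_by_profile(profiles):
    """Group genes whose presence/absence profiles are identical."""
    clusters = defaultdict(list)
    for gene, profile in profiles.items():
        clusters[profile].append(gene)
    return {profile: sorted(genes) for profile, genes in clusters.items()}

# Hypothetical toy input: gene -> genomes in which a homolog was found.
genomes = ["genomeA", "genomeB", "genomeC", "genomeD"]
hits = {
    "gene1": {"genomeA", "genomeC"},
    "gene2": {"genomeA", "genomeC"},
    "gene3": {"genomeB"},
    "gene4": {"genomeA", "genomeB", "genomeC", "genomeD"},
}

profiles = build_profiles(hits, genomes)
clusters = cluster_by_profile(profiles)
for profile, genes in sorted(clusters.items()):
    print(profile, genes)
```

In this toy input, gene1 and gene2 share a profile, so they would be flagged as candidates for a shared pathway or process.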
71. Carboxydothermus
hydrogenoformans
• Isolated from a Russian hotspring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows very efficiently on CO (Carbon
Monoxide)
• Produces hydrogen gas
• Low GC Gram positive (Firmicute)
• Genome determined (Wu et al. 2005
PLoS Genetics 1: e65)
78. Protein Family Rarefaction Curves
• Take data set of multiple complete
genomes
• Identify all protein families using MCL
• Plot # of genomes vs. # of protein families
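A rough sketch of how such a curve could be computed, assuming the protein family assignments (for example from MCL) are already available as a set of family IDs per genome. The genome and family names are hypothetical, and averaging over random genome orderings is an assumption made here for smoothing, not something stated on the slide. The function returns the points to plot rather than drawing them.

```python
import random

def rarefaction_curve(genome_families, n_orderings=100, seed=0):
    """Mean number of distinct families after adding 1..N genomes,
    averaged over random orderings of the genomes."""
    rng = random.Random(seed)
    names = list(genome_families)
    totals = [0] * len(names)
    for _ in range(n_orderings):
        rng.shuffle(names)
        seen = set()
        for i, name in enumerate(names):
            seen |= genome_families[name]  # add this genome's families
            totals[i] += len(seen)
    return [total / n_orderings for total in totals]

# Hypothetical toy input: genome -> set of protein family IDs.
genome_families = {
    "g1": {"fam1", "fam2", "fam3"},
    "g2": {"fam2", "fam3", "fam4"},
    "g3": {"fam1", "fam5"},
}

curve = rarefaction_curve(genome_families)
for n_genomes, mean_families in enumerate(curve, start=1):
    print(n_genomes, mean_families)
```

A curve that keeps rising as genomes are added suggests that new protein families are still being discovered; a plateau suggests the sampled group is close to saturated.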
79. Wu et al. 2009 Nature 462, 1056-1060