• Phylogenetics is the study or estimation of the evolutionary history that underlies biological diversity.
• The results of a phylogenetic analysis are usually presented as a collection of nodes and branches, that is, a tree.
• In such a tree, taxa that are closely related in an evolutionary sense appear close to each other, and taxa that are distantly related sit on different (distant) branches of the tree.
• Phylogenetic trees are also important for multiple sequence alignment (MSA).
• Trees may be rooted or unrooted.
• A rooted tree identifies the most basal ancestor of the tree in question.
• An unrooted tree does not imply a known ancestral root.
• There are competing techniques for rooting a tree; one of the most common is the use of an "outgroup".
• An outgroup is a species that unambiguously separated early from the other species being considered.
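As a minimal sketch of outgroup rooting, the snippet below stores an unrooted tree as an adjacency map and places the root on the branch leading to the outgroup. The taxon names (`A`, `B`, `C`, `Out`), the internal node labels (`x`, `y`), and the function name are hypothetical and chosen only for illustration.

```python
def root_with_outgroup(adjacency, outgroup):
    """Root an unrooted tree on the branch that leads to `outgroup`.

    `adjacency` maps every node to the list of its neighbours.
    The result is a nested tuple: (outgroup, rest_of_tree).
    """
    (neighbour,) = adjacency[outgroup]  # a leaf has exactly one neighbour

    def subtree(node, parent):
        children = [n for n in adjacency[node] if n != parent]
        if not children:          # leaf: return its label
            return node
        return tuple(subtree(child, node) for child in children)

    return (outgroup, subtree(neighbour, outgroup))


# Hypothetical unrooted tree: leaves A, B, C, Out; internal nodes x, y.
unrooted = {
    "A": ["x"], "B": ["x"], "C": ["y"], "Out": ["y"],
    "x": ["A", "B", "y"], "y": ["x", "C", "Out"],
}
rooted = root_with_outgroup(unrooted, "Out")
print(rooted)
```

The same unrooted topology gives a different rooted tree for every choice of outgroup, which is why the outgroup must be a taxon known to have diverged first.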
• Multiple sequence alignment (MSA) can be viewed as an extension of pairwise sequence alignment, but the cost of computing an exact alignment grows exponentially with the number of sequences.
• MSA applies to both nucleotide and amino acid sequences.
• It is one of the most essential tools in molecular biology and has been in use since 1987.
• MSA can help reveal biological facts about proteins, for example through analysis of secondary/tertiary structure.
• MSA supports phylogenetic analysis of the sequences, which is used to construct evolutionary trees.
• Exhaustive search: extension of dynamic programming (DP) to multiple dimensions.
• Progressive alignment: compute a tree of the sequences based on hierarchical clustering, then merge the closest first, greedily. E.g. ClustalW
• Block-based global alignment: find highly conserved regions and then grow the alignment around these regions (compare the seed-and-extend strategy of BLAST, which is itself a pairwise search tool).
• Iterative search: based on genetic-algorithm search.
• Local alignments:
   • Profile analysis
   • Block analysis
   • Pattern searching and/or statistical methods
VTISCTGSSSNIGAG-NHVKWYQQLPG 
VTISCTGTSSNIGS--ITVNWYQQLPG 
LRLSCSSSGFIFSS--YAMYWVRQAPG 
LSLTCTVSGTSFDD--YYSTWVRQPPG 
PEVTCVVVDVSHEDPQVKFNWYVDG-- 
ATLVCLISDFYPGA--VTVAWKADS-- 
AALGCLVKDYFPEP--VTVSWNSG--- 
VSLTCLVKGFYPSD--IAVEWWSNG--
• An alignment of 2 sequences is represented as a 2-row matrix.
• In the same way, an alignment of 3 sequences is represented as a 3-row matrix:
A T _ G C G _
A _ C G T _ A
A T C A C _ A
• Score: the more conserved columns, the better the alignment.
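A rough illustration of the "conserved columns" idea: the sketch below counts the columns of the 3-row matrix in which every row carries the same residue and no gap. Counting only fully conserved columns is an assumption made here for simplicity; the slide does not fix a particular scoring formula, and the function name is hypothetical.

```python
def conserved_columns(rows, gap="_"):
    """Count columns where all rows agree and none of them is a gap."""
    count = 0
    for column in zip(*rows):  # walk the matrix column by column
        if gap not in column and len(set(column)) == 1:
            count += 1
    return count


# The 3-row matrix from the slide, with the spaces removed.
alignment = ["AT_GCG_", "A_CGT_A", "ATCAC_A"]
print(conserved_columns(alignment))
```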
• Align 3 sequences: ATGC, AATC, ATGC
x coordinate:  0  1  1  2  3  4
                 A  -- T  G  C
y coordinate:  0  1  2  3  3  4
                 A  A  T  -- C
z coordinate:  0  0  1  2  3  4
                 -- A  T  G  C
• Resulting path in (x,y,z) space:
(0,0,0), (1,1,0), (1,2,1), (2,3,2), (3,3,3), (4,4,4)
In 2-D, 3 edges enter each unit square: C(i,j) is reached from
C(i-1,j-1), C(i-1,j), C(i,j-1)
In 3-D, 7 edges enter each unit cube: C(i,j,k) is reached from
C(i-1,j-1,k-1), C(i-1,j-1,k), C(i-1,j,k-1), C(i,j-1,k-1),
C(i-1,j,k), C(i,j-1,k), C(i,j,k-1)
Enumerate all possibilities and choose the best one
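A compact sketch of the 3-D recurrence is given below. The scoring is an assumption not stated on this slide: each column is scored by the sum over its three pairs, with match +1, mismatch -1, residue against gap -1, and gap against gap 0. All names are hypothetical.

```python
from itertools import product


def pair_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score one pair of symbols from a column (assumed scoring scheme)."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def column_score(column):
    a, b, c = column
    return pair_score(a, b) + pair_score(a, c) + pair_score(b, c)


# The 7 ways to step into a cell: every 0/1 offset triple except (0, 0, 0).
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]


def align3_score(s, t, u):
    """Best sum-of-pairs score of a global alignment of three sequences."""
    n, m, p = len(s), len(t), len(u)
    neg_inf = float("-inf")
    C = [[[neg_inf] * (p + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    C[0][0][0] = 0
    # Lexicographic order guarantees every predecessor is filled first.
    for i, j, k in product(range(n + 1), range(m + 1), range(p + 1)):
        if i == j == k == 0:
            continue
        best = neg_inf
        for di, dj, dk in MOVES:
            pi, pj, pk = i - di, j - dj, k - dk
            if pi < 0 or pj < 0 or pk < 0:
                continue
            column = (
                s[pi] if di else "-",
                t[pj] if dj else "-",
                u[pk] if dk else "-",
            )
            best = max(best, C[pi][pj][pk] + column_score(column))
        C[i][j][k] = best
    return C[n][m][p]


print(align3_score("ATGC", "AATC", "ATGC"))
```

The triple loop visits every cell once and tries 7 predecessors per cell, which is where the factor of 7 in the cost comes from.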
• For three sequences of length n, the run time is proportional to the number of edges in the 3-D grid, i.e. $7n^3$.
• For a k-way alignment, build a k-dimensional Manhattan graph with
   • $n^k$ nodes
   • Most nodes have $2^k - 1$ incoming edges
   • Runtime: $O(2^k n^k)$
• Consider 2 protein sequences of 100 amino acids in length.
• If it takes $100^2 = 10^4$ seconds to exhaustively align these sequences, then by the $n^k$ scaling it will take $10^6$ seconds to align 3 sequences, $10^8$ to align 4 sequences, etc.
• It will take ~$10^{40}$ seconds to align 20 sequences. One year is ~$3 \times 10^7$ seconds. The age of the visible universe is ~$0.4 \times 10^{18}$ seconds.
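The growth can be made concrete with a few lines of arithmetic. The sketch below assumes one time unit per grid node and treats "every node has $2^k - 1$ incoming edges" as an upper bound; the function name and the constant are hypothetical, with the year length taken from the slide's approximation.

```python
def exhaustive_cost(n, k):
    """Return (nodes, incoming edges per interior node, edge upper bound)
    for a k-way alignment of sequences of length n."""
    nodes = n ** k
    incoming = 2 ** k - 1
    return nodes, incoming, nodes * incoming


SECONDS_PER_YEAR = 3e7  # approximation used on the slide

for k in (2, 3, 4, 20):
    nodes, incoming, edges = exhaustive_cost(100, k)
    years = nodes / SECONDS_PER_YEAR  # if one node took one second
    print(f"k={k:2d}  nodes={nodes:.1e}  edges<={edges:.1e}  ~{years:.1e} years")
```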
• The greedy method follows the problem-solving heuristic of making the locally optimal choice at each stage, in the hope of reaching a global optimum: at each stage an alignment of k sequences is reduced to an alignment of k-1 sequences/profiles.
k sequences:
u1 = ACGTACGTACGT…
u2 = TTAATTAATTAA…
u3 = ACTACTACTACT…
…
uk = CCGGCCGGCCGG
k-1 sequences/profiles (u1 is now a profile that has absorbed u3):
u1 = ACg/tTACg/tTACg/cT…
u2 = TTAATTAATTAA…
…
uk = CCGGCCGGCCGG…
• Consider these 4 sequences 
s1 GATTCA 
s2 GTCTGA 
s3 GATATT 
s4 GTCAGC
• There are $\binom{4}{2} = 6$ possible pairwise alignments (match = +1, mismatch/gap = -1):
s2 GTCTGA
s4 GTCAGC (score = 2)

s1 GAT-TCA
s2 G-TCTGA (score = 1)

s1 GAT-TCA
s3 GATAT-T (score = 1)

s1 GATTCA--
s4 G-T-CAGC (score = 0)

s2 G-TCTGA
s3 GATAT-T (score = -1)

s3 GAT-ATT
s4 G-TCAGC (score = -1)
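The six pairwise scores can be reproduced with a standard global-alignment (Needleman-Wunsch) score computation using match = +1 and mismatch/gap = -1. The slide does not name the pairwise algorithm, so using Needleman-Wunsch here is an assumption, and the function and variable names are hypothetical.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of a and b (linear gap penalty)."""
    prev = [j * gap for j in range(len(b) + 1)]   # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


seqs = {"s1": "GATTCA", "s2": "GTCTGA", "s3": "GATATT", "s4": "GTCAGC"}
scores = {(x, y): nw_score(seqs[x], seqs[y]) for x, y in combinations(seqs, 2)}
closest = max(scores, key=scores.get)   # the pair the greedy step merges first
print(scores)
print(closest)
```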
s2 and s4 are closest; combine them into a profile:
s2   GTCTGA
s4   GTCAGC
s2,4 GTCt/aGa/c (profile)
New set of 3 sequences:
s1   GATTCA
s3   GATATT
s2,4 GTCt/aGa/c
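The profile notation used here (upper case where the two rows agree, lower case `x/y` where they differ) can be generated mechanically from two aligned rows. This is only a sketch of the notation; real tools store per-column residue frequencies instead, and the function name is hypothetical.

```python
def merge_profile(row_a, row_b):
    """Merge two aligned rows of equal length into the slide's profile notation."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must be aligned to the same length")
    columns = []
    for x, y in zip(row_a, row_b):
        # Agreeing columns keep the residue; differing ones list both options.
        columns.append(x if x == y else f"{x.lower()}/{y.lower()}")
    return "".join(columns)


print(merge_profile("GTCTGA", "GTCAGC"))
```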
s1   GATTCA
s3   GATATT
s2,4 GTCt/aGa/c

s1   GATTC--A
s2,4 G-T-CTGA (score = 0)

s3   GATATT-
s2,4 G-TCTGA (score = -1)

s1 and s2,4 are closest; combine:
s1     GATTC--A
s2,4   G-T-CTGA
s1,2,4 Ga/-Tt/-Ct/-g/-A

s3     GATATT
s1,2,4 Ga/-Tt/-Ct/-g/-A

s3     GATAT-T--
s1,2,4 GAT-TCTGA (score = 1)

Final alignment:
s1,2,3,4 GATa/-Tc/-Tg/-a/-
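The scores quoted in this walkthrough can be checked by scoring each pair of aligned rows column by column with match = +1 and mismatch/gap = -1. In the sketch below each profile is represented by the single sequence shown for it on the slide, with the stray spaces removed; the function name is hypothetical.

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, gap=-1):
    """Score two already-aligned rows of equal length, column by column."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must be aligned to the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


print(score_alignment("GATTC--A", "G-T-CTGA"))    # s1 vs s2,4
print(score_alignment("GATATT-", "G-TCTGA"))      # s3 vs s2,4
print(score_alignment("GATAT-T--", "GAT-TCTGA"))  # s3 vs s1,2,4
```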
• Computationally complex
   • If an MSA scores matches, mismatches, and gaps and also accounts for the degree of variation, it can be applied to only a few sequences.
• Difficult to score
   • A multiple comparison is necessary in each column of the MSA to obtain a cumulative score.
   • Placement of gaps and scoring of substitutions are more difficult.
• Difficulty increases with diversity
   • Relatively easy for a set of closely related sequences.
   • Identifying the correct ancestry relationships for a set of distantly related sequences is more challenging.
   • Even more difficult if some members are more alike than others.
• EMBL-EBI
   http://www.ebi.ac.uk/clustalw/
• BCM Search Launcher: Multiple Alignment
   http://dot.imgen.bcm.tmc.edu:9331/multi-align/multi-align.html
• Multiple Sequence Alignment for Proteins (Wash. U. St. Louis)
   http://www.ibc.wustl.edu/service/msa/
web.warwick.ac.uk/telri/Bioinfo/ 
http://science.marshall.edu/murraye/ 
http://www.cs.iastate.edu/~cs544/Lectures/
Msa & rooted/unrooted tree
