Dynamic Programming:
basics and case studies
Houston Machine Learning Meetup
11/16/2019
Dynamic Programming: name and story
• Richard Bellman coined the term “Dynamic Programming”
From Bellman's autobiography:
“The face of Wilson (the Secretary of Defense) would turn red, and he would get
violent if people used the term RESEARCH in his presence. You can imagine how he
felt, then, about the term MATHEMATICAL …. I had to do something to shield Wilson
and the Air Force from the fact that I was really doing MATHEMATICS inside the
RAND Corporation…. I decided therefore to use the word ‘PROGRAMMING’. I
wanted to get across the idea that this was DYNAMIC, this was multistage, this was
time-varying…. I thought dynamic programming was a good name. It was something
not even a Congressman could object to...”
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by recursion
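public int fib(int n) {
    // Base cases: F(0) = 0 and F(1) = 1
    if (n == 0 || n == 1) { return n; }
    // Each call spawns two further calls, so the work grows exponentially
    return fib(n - 1) + fib(n - 2);
}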
Time complexity: O(2^N)
Recursion tree of Fibonacci sequence
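To make the size of the recursion tree concrete, the following minimal sketch counts how many times the naive recursive function is invoked. The class name `FibCallCounter` and the helper `countCalls` are hypothetical names introduced only for this illustration.

```java
// Illustrative sketch: counts the nodes of the recursion tree for naive fib(n).
class FibCallCounter {
    static long calls = 0;

    static int fib(int n) {
        calls++;                              // one node of the recursion tree
        if (n == 0 || n == 1) { return n; }
        return fib(n - 1) + fib(n - 2);
    }

    // Resets the counter, runs fib(n), and reports the number of calls made.
    static long countCalls(int n) {
        calls = 0;
        fib(n);
        return calls;
    }
}
```

Calling `FibCallCounter.countCalls(10)` reports 177 calls and `countCalls(20)` reports 21891, which shows how quickly the tree grows even for small inputs.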
Fibonacci sequence
• Solved by DP: fill the table from index 0 upward; each entry is the sum of the two entries before it
Time complexity: O(N)
Index  0  1  2  3  4  5  …
F(N)   0  1  1  2  3  5  …
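The table-filling idea can be sketched in Java as follows. This is a minimal illustration rather than code from the slides; the class name `FibDp` is hypothetical, and `long` is used as an assumption so that larger indices do not overflow `int`.

```java
// Illustrative sketch: bottom-up DP for the Fibonacci sequence.
class FibDp {
    static long fib(int n) {
        if (n == 0 || n == 1) { return n; }
        long[] table = new long[n + 1];
        table[0] = 0;                         // initial condition F(0)
        table[1] = 1;                         // initial condition F(1)
        for (int i = 2; i <= n; i++) {
            // Both operands are already stored, so each entry costs O(1)
            table[i] = table[i - 1] + table[i - 2];
        }
        return table[n];
    }
}
```

`FibDp.fib(5)` returns 5, matching the last filled cell of the table above. The loop runs N - 1 times, which is where the O(N) bound comes from.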
Fibonacci sequence
• Recursion:
• F(n) = F(n – 1) + F(n – 2)
• Starts from n
• When computing F(n), F(n – 1) and F(n – 2) are not known yet
• DP:
• F(n) = F(n – 1) + F(n – 2)
• Starts from 0 and 1
• When computing F(n), F(n – 1) and F(n – 2) have already been stored in the array
• Dynamic programming: partial results are stored to save time
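Storing partial results also works when the computation still starts from n. The sketch below is a top-down variant (usually called memoization), which is a related technique and not the bottom-up table the slides describe. The class name `FibMemo` is hypothetical, and the value 0 is used as an assumed "not yet computed" marker, which is safe here because F(n) > 0 for n ≥ 1.

```java
// Illustrative sketch: top-down recursion that caches partial results.
class FibMemo {
    static long fib(int n) {
        return fib(n, new long[n + 1]);
    }

    private static long fib(int n, long[] memo) {
        if (n == 0 || n == 1) { return n; }
        if (memo[n] != 0) { return memo[n]; } // already computed: reuse it
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo);
        return memo[n];
    }
}
```

Each F(k) is computed once and then read from the array, so the running time is O(N) as in the bottom-up version.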
Longest common subsequence
• Goal: find the longest subsequence common to two or more sequences
• String1: “AGCAT”
• String2: “GAC”
• Common subsequences: “A”, “C”, “G”, “AC”, “GA”, “GC”
• LCS: “AC”, “GA”, or “GC”
• Use a table to find the LCS:
• First column: string1 (“AGCAT”)
• First row: string2 (“GAC”)
• Table[i, j]: LCS of string1.substring(0, i) and string2.substring(0, j)
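The table described above can be sketched in Java as follows. As an assumption made for brevity, the table stores the length of the LCS of the two prefixes rather than the subsequence itself, and one LCS is recovered afterwards by walking back through the table. The class name `Lcs` and its method names are hypothetical.

```java
// Illustrative sketch: LCS via a (len1 + 1) x (len2 + 1) table of lengths.
class Lcs {
    // t[i][j] = length of the LCS of a.substring(0, i) and b.substring(0, j)
    static int[][] table(String a, String b) {
        int[][] t = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    t[i][j] = t[i - 1][j - 1] + 1;       // extend the diagonal
                } else {
                    t[i][j] = Math.max(t[i - 1][j], t[i][j - 1]);
                }
            }
        }
        return t;
    }

    // Walks back from the bottom-right cell to recover one LCS.
    static String lcs(String a, String b) {
        int[][] t = table(a, b);
        StringBuilder sb = new StringBuilder();
        int i = a.length(), j = b.length();
        while (i > 0 && j > 0) {
            if (a.charAt(i - 1) == b.charAt(j - 1)) {
                sb.append(a.charAt(i - 1));
                i--;
                j--;
            } else if (t[i - 1][j] >= t[i][j - 1]) {
                i--;
            } else {
                j--;
            }
        }
        return sb.reverse().toString();
    }
}
```

For “AGCAT” and “GAC” the bottom-right cell holds 2. Which of the equally long answers `lcs` returns depends on how ties are broken during the walk back.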
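Wildcard matching
• Linux command line:
user@bash: ls b*
barry.txt blan.txt bob.txt
• A more complicated example:
string = "adcab"
pattern = "*a*b"
• DP solution:
• Definition: table[i][j] is true if the first i characters of the string match the first j characters of the pattern
• Base case:
table[0][0] = true
first row: table[0][j + 1] = table[0][j] if pattern[j] equals *, otherwise false
first column: table[i + 1][0] = false
• Induction rule:
(1) if string[i] equals pattern[j] or pattern[j] equals ?
table[i + 1][j + 1] = table[i][j]
(2) if pattern[j] equals *
table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1]
(3) otherwise table[i + 1][j + 1] = false
• Filled table (rows: string, columns: pattern; the answer is the bottom-right cell):
    -  *  a  *  b
-   T  T  F  F  F
a   F  T  T  T  F
d   F  T  F  T  F
c   F  T  F  T  F
a   F  T  T  T  F
b   F  T  F  T  T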
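The base case and induction rule for wildcard matching translate into Java roughly as shown below. This is a minimal sketch, assuming `?` matches exactly one character and `*` matches any run of characters including the empty one; the class name `WildcardMatch` is hypothetical.

```java
// Illustrative sketch: wildcard matching with a boolean DP table.
class WildcardMatch {
    static boolean isMatch(String s, String p) {
        // table[i][j]: do the first i chars of s match the first j chars of p?
        boolean[][] table = new boolean[s.length() + 1][p.length() + 1];
        table[0][0] = true;
        // First row: an empty string only matches a prefix made of '*' characters
        for (int j = 0; j < p.length(); j++) {
            if (p.charAt(j) == '*') {
                table[0][j + 1] = table[0][j];
            }
        }
        for (int i = 0; i < s.length(); i++) {
            for (int j = 0; j < p.length(); j++) {
                char pc = p.charAt(j);
                if (pc == '*') {
                    // '*' matches nothing (left cell) or one more character (upper cell)
                    table[i + 1][j + 1] = table[i + 1][j] || table[i][j + 1];
                } else if (pc == '?' || pc == s.charAt(i)) {
                    table[i + 1][j + 1] = table[i][j];
                }
                // otherwise the cell keeps its default value, false
            }
        }
        return table[s.length()][p.length()];
    }
}
```

`WildcardMatch.isMatch("adcab", "*a*b")` returns true, which corresponds to the bottom-right cell of the table for that example.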
Longest common subsequence and wildcard matching
• DP proceeds from the initial condition to the end of the string:
• From left to right within each row
• From top to bottom within each column
• State transition from table[i - 1][j - 1], table[i][j - 1], and table[i - 1][j] to table[i][j]
• Each time: move forward by one step
• The state at each step is the global optimum for that step
• A table (or diagram) is the best tool for simulating the process
Matrix chain multiplication
• Multiply two matrices: A (10 x 100) and B (100 x 5)
• OUT[p][r] += A[p][q] * B[q][r]
• Computation = 10 x 100 x 5 = 5000 scalar multiplications
• Multiply three matrices: A1 (10 x 100), A2 (100 x 5), and A3 (5 x 50)
• ((A1 A2) A3): 10 x 100 x 5 (A1 A2) + 10 x 5 x 50 = 7500
• (A1 (A2 A3)): 100 x 5 x 50 (A2 A3) + 10 x 100 x 50 = 75000
• ((A1 A2) A3) needs 10 times fewer scalar multiplications than (A1 (A2 A3))
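The two totals can be checked by running the triple loop and counting scalar multiplications. The sketch below is illustrative only: the class name `MatMulCount` and its methods are hypothetical, and the matrices are filled with zeros because only the number of operations matters here.

```java
// Illustrative sketch: count scalar multiplications for both parenthesizations.
class MatMulCount {
    static long multiplications = 0;

    static double[][] multiply(double[][] a, double[][] b) {
        int rows = a.length, inner = b.length, cols = b[0].length;
        double[][] out = new double[rows][cols];
        for (int p = 0; p < rows; p++) {
            for (int q = 0; q < inner; q++) {
                for (int r = 0; r < cols; r++) {
                    out[p][r] += a[p][q] * b[q][r];
                    multiplications++;       // one scalar multiplication
                }
            }
        }
        return out;
    }

    // ((A1 A2) A3) with A1: 10 x 100, A2: 100 x 5, A3: 5 x 50
    static long countLeftFirst() {
        multiplications = 0;
        multiply(multiply(new double[10][100], new double[100][5]), new double[5][50]);
        return multiplications;
    }

    // (A1 (A2 A3)) with the same three matrices
    static long countRightFirst() {
        multiplications = 0;
        multiply(new double[10][100], multiply(new double[100][5], new double[5][50]));
        return multiplications;
    }
}
```

`countLeftFirst()` yields 7500 and `countRightFirst()` yields 75000, matching the hand calculation.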
Matrix chain multiplication
• How to optimize the chain multiplication of matrices (A1, A2, A3, …, An)?
• DP induction rule:
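The induction rule itself is not reproduced in the text of this slide. A standard form of the recurrence is sketched here as an assumption, where matrix A_k has dimensions p_{k-1} x p_k (the symbols p and k are notation introduced for this sketch) and M[i, j] is the minimum number of scalar multiplications needed to multiply A_i through A_j:

```latex
M[i, j] =
\begin{cases}
0 & \text{if } i = j \\
\min_{i \le k < j} \left( M[i, k] + M[k+1, j] + p_{i-1} \, p_k \, p_j \right) & \text{if } i < j
\end{cases}
```

The term p_{i-1} p_k p_j is the cost of the final multiplication that joins the two sub-chains split at k.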
Matrix chain multiplication: DP solution
• Multiplying six matrices:
• State:
• M[i, j]: the minimum number of scalar multiplications needed to multiply matrices i through j
• S[i, j]: the outermost (last) split point that achieves M[i, j]
• Optimal order for the six matrices: ((A1 (A2 A3)) ((A4 A5) A6))
Matrix chain multiplication: DP solution
• The state is hard to define:
• M[i, j]
• S[i, j]
• The state transition is complicated:
• Filling the table row by row or column by column does not work
• Fill by chain length instead: each state is built from the states of shorter chains (induction rule)
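A minimal Java sketch of the matrix chain DP is given below, filling M and S in order of increasing chain length. It assumes the standard recurrence for this problem and a dimension array `dims` of length n + 1 (at least two entries), where matrix A_k is `dims[k - 1]` x `dims[k]`; the class name `MatrixChain` and the array layout are choices made for this illustration.

```java
// Illustrative sketch: matrix chain multiplication, filled by chain length.
class MatrixChain {
    final long[][] m;   // m[i][j]: min scalar multiplications for A_i..A_j (1-based)
    final int[][] s;    // s[i][j]: outermost split point k achieving m[i][j]

    MatrixChain(int[] dims) {
        int n = dims.length - 1;              // number of matrices
        m = new long[n + 1][n + 1];
        s = new int[n + 1][n + 1];
        for (int len = 2; len <= n; len++) {  // shorter chains first
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                m[i][j] = Long.MAX_VALUE;
                for (int k = i; k < j; k++) {
                    long cost = m[i][k] + m[k + 1][j]
                            + (long) dims[i - 1] * dims[k] * dims[j];
                    if (cost < m[i][j]) {
                        m[i][j] = cost;
                        s[i][j] = k;
                    }
                }
            }
        }
    }

    long minCost() {
        return m[1][m.length - 1];
    }

    // Rebuilds the parenthesization from the split points in s.
    String order() {
        return order(1, m.length - 1);
    }

    private String order(int i, int j) {
        if (i == j) { return "A" + i; }
        return "(" + order(i, s[i][j]) + " " + order(s[i][j] + 1, j) + ")";
    }
}
```

For the three-matrix example with dimensions 10, 100, 5, 50, `new MatrixChain(new int[]{10, 100, 5, 50})` gives `minCost()` of 7500 and `order()` of `((A1 A2) A3)`.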
Framework of dynamic programming
• Three key components of dynamic programming algorithm:
• Definition of state
• Initial condition (base)
• Induction rule (state transition)
• The induction rule is usually the hardest part to find
• A 1D or 2D table helps structure the thinking process
What is part of speech tagging?
• Identify parts of speech (syntactic categories):
This  is  a    simple  sentence
DET   VB  DET  ADJ     NOUN
• POS tagging is a first step towards syntactic analysis (and semantic analysis)
• Faster than full parsing
• Useful for text classification and word disambiguation
• How to decide the correct label:
• The word to be labeled: chair is probably a noun
• Labels of surrounding words: if the preceding word is a modal verb (e.g., will), then this word is more likely to be a verb
• Hidden Markov models can be used to work on this problem
Why is POS tagging hard?
• Ambiguity
glass of water/NOUN vs. water/VERB the plants
lie/VERB down vs. tell a lie/NOUN
wind/VERB down vs. a mighty wind/NOUN (homographs)
How about “time flies like an arrow”?
• Sparse data:
• Words we haven’t seen before
• Word-Tag pairs we haven’t seen before
Example transition probabilities
• Probabilities estimated from tagged WSJ corpus:
• Proper nouns (NNP) often begin sentences: P(NNP|<s>) = 0.28
• Modal verbs (MD) nearly always followed by bare verbs (VB).
• Adjectives (JJ) are often followed by nouns (NN).
Example output probabilities
• Probabilities estimated from tagged WSJ corpus:
• 0.0032% of proper nouns are Janet: P(Janet|NNP) = 0.000032
• About half of determiners (DT) are the.
• the can also be a proper noun.
Hidden Markov Chain
• A set of states (tags)
• An output alphabet (words)
• Initial state (beginning of sentence)
• State transition probabilities: P(t_i | t_{i-1})
• Symbol emission probabilities: P(w_i | t_i)
Hidden Markov Chain
• Model the tagging process:
• Sentence: W = (w_1, w_2, …, w_n)
• Tags: T = (t_1, t_2, …, t_n)
• Joint probability, with t_0 = <s>:
$P(W, T) = \left[ \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i) \right] P(\text{</s>} \mid t_n)$
• Example:
• This/DET is/VB a/DET simple/JJ sentence/NN
• Add begin (<s>) and end-of-sentence (</s>) markers:
P(W, T) = P(DET|<s>) P(VB|DET) P(DET|VB) P(JJ|DET) P(NN|JJ) P(</s>|NN)
  x P(This|DET) P(is|VB) P(a|DET) P(simple|JJ) P(sentence|NN)
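The product above can be computed with a short loop. The sketch below is illustrative: the class name `HmmJoint`, the integer tag indices, and the array layout (`trans[a][b]` holds P(b|a), `emit[i]` holds P(w_i|t_i) for position i) are all assumptions made for this example, and any probabilities passed in would be hypothetical values, not corpus estimates.

```java
// Illustrative sketch: joint probability P(W, T) of a tagged sentence under an HMM.
class HmmJoint {
    // Hypothetical tag indices; START and END stand for <s> and </s>.
    static final int START = 0, DET = 1, NN = 2, END = 3;

    // trans[a][b] = P(b | a); emit[i] = P(w_i | t_i); tags[i] = t_i
    static double joint(double[][] trans, double[] emit, int[] tags) {
        double p = 1.0;
        int prev = START;
        for (int i = 0; i < tags.length; i++) {
            p *= trans[prev][tags[i]] * emit[i];  // transition times emission
            prev = tags[i];
        }
        return p * trans[prev][END];              // end-of-sentence factor
    }
}
```

For long sentences the product becomes very small, so practical implementations often sum log probabilities instead; that refinement is left out here to keep the sketch close to the formula.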
Computation estimation of POS
• Suppose we have C possible tags for each of the n words in the
sentence
• There are C^n possible tag sequences: the number grows
exponentially in the length n
• Viterbi algorithm: use dynamic programming to solve it
Viterbi algorithm:
• Target: argmax_T P(T|W)
• Intuition: the best path of length i ending in state t must include the best path of length i – 1 to the previous state
• Use a table to store the partial results:
• T x N table (number of tags x number of words): v(t, i) is the probability of the best state sequence for w_1 … w_i ending in state t
• Fill in columns from left to right; the max is over each possible previous state t’
v(t, i) = max_{t’} { v(t’, i – 1) P(t|t’) P(w_i|t) }
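The recurrence for v(t, i) can be sketched in Java as follows. This is a minimal illustration under stated assumptions: tags and words are represented by integer indices, `start[t]` holds P(t|<s>), `trans[tp][t]` holds P(t|tp), `emit[t][w]` holds P(w|t), and the end-of-sentence factor is omitted, as it is in the recurrence above. The class name `Viterbi`, the method `decode`, and the backpointer table are choices made for this example.

```java
// Illustrative sketch: Viterbi decoding with a tags x words table.
class Viterbi {
    static int[] decode(double[] start, double[][] trans, double[][] emit, int[] words) {
        int numTags = start.length, n = words.length;
        if (n == 0) { return new int[0]; }
        double[][] v = new double[numTags][n];   // v[t][i]: best path prob ending in t
        int[][] back = new int[numTags][n];      // back[t][i]: best previous tag
        for (int t = 0; t < numTags; t++) {
            v[t][0] = start[t] * emit[t][words[0]];
        }
        for (int i = 1; i < n; i++) {            // columns, left to right
            for (int t = 0; t < numTags; t++) {
                for (int tp = 0; tp < numTags; tp++) {
                    double cand = v[tp][i - 1] * trans[tp][t] * emit[t][words[i]];
                    if (cand > v[t][i]) {
                        v[t][i] = cand;
                        back[t][i] = tp;
                    }
                }
            }
        }
        int best = 0;                            // best tag for the last word
        for (int t = 1; t < numTags; t++) {
            if (v[t][n - 1] > v[best][n - 1]) { best = t; }
        }
        int[] tags = new int[n];
        tags[n - 1] = best;
        for (int i = n - 1; i > 0; i--) {        // follow the backpointers
            tags[i - 1] = back[tags[i]][i];
        }
        return tags;
    }
}
```

The three nested loops visit every (word, tag, previous tag) combination once, so the cost is O(n C^2) for n words and C tags, compared with C^n tag sequences for exhaustive search.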
Viterbi algorithm: case study
• W = the doctor is in.
Viterbi algorithm: all tagged
Dynamic programming: take-home message
• Why it is fast: memory is used to store partial results
• DP algorithm components: state definition, initial condition, and induction rule
• Solve DP problems with a table
Top ten DP problems
• Longest common subsequence
• Shortest common supersequence
• Longest increasing subsequence
• Edit distance
• Matrix chain multiplication
• 0-1 knapsack problem
• Partition problem
• Rod cutting
• Coin change problem
• Word break problem
Reference
• http://people.cs.georgetown.edu/nschneid/cosc572/f16/12_viterbi_slides.pdf
• https://en.wikipedia.org/wiki/Dynamic_programming
• https://medium.com/@codingfreak/top-10-dynamic-programming-problems-5da486eeb360
• https://leetcode.com/problems/wildcard-matching/description/
• https://en.wikipedia.org/wiki/Longest_common_subsequence_problem

From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 

Basics of Dynamic programming

  • 1. Dynamic Programming: basics and case studies Houston Machine Learning Meetup 11/16/2019
  • 2. Dynamic Programming: name and story • Richard Bellman coined the term “Dynamic Programming” • From Bellman's autobiography: “The face of Wilson (the Secretary of Defense) would turn red, and he would get violent if people used the term RESEARCH in his presence. You can imagine how he felt, then, about the term MATHEMATICAL …. I had to do something to shield Wilson and the Air Force from the fact that I was really doing MATHEMATICS inside the RAND Corporation…. I decided therefore to use the word “PROGRAMMING”. I wanted to get across the idea that this was DYNAMIC, this was multistage, this was time-varying…. I thought dynamic programming was a good name. It was something not even a Congressman could object to...”
  • 3. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2)
  • 4. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by recursion: public int fib(int n) { if (n == 0 || n == 1) { return n; } return fib(n - 1) + fib(n - 2); } • Time complexity: O(2^N) • Figure: recursion tree of the Fibonacci sequence
  • 5. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Table so far: Index = 0 1 2 3 4 5 …; F(N) = 0 1
  • 6. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Table so far: Index = 0 1 2 3 4 5 …; F(N) = 0 1 1
  • 7. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Table so far: Index = 0 1 2 3 4 5 …; F(N) = 0 1 1 2
  • 8. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Table so far: Index = 0 1 2 3 4 5 …; F(N) = 0 1 1 2 3
  • 9. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Table so far: Index = 0 1 2 3 4 5 …; F(N) = 0 1 1 2 3 5
  • 10. Fibonacci sequence • Recursion: • F(n) = F(n – 1) + F(n – 2) • Starts from n • When computing F(n), F(n – 1) and F(n – 2) are not known yet • DP: • F(n) = F(n – 1) + F(n – 2) • Starts from 0 and 1 • When computing F(n), F(n – 1) and F(n – 2) have already been stored in an array • Dynamic programming: partial results are stored to save time
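The "Solved by DP" slides show only the table, not code. A minimal sketch of that bottom-up table fill follows; the class name `FibDp` and the use of `long` are assumptions for illustration, not taken from the slides.

```java
// Bottom-up Fibonacci: fill the table from the base cases upward,
// so F(i - 1) and F(i - 2) are already stored when F(i) is computed.
class FibDp {
    // Assumes n >= 0; a long overflows for n > 92.
    static long fib(int n) {
        if (n < 2) {
            return n; // base cases F(0) = 0 and F(1) = 1
        }
        long[] table = new long[n + 1];
        table[0] = 0;
        table[1] = 1;
        for (int i = 2; i <= n; i++) {
            table[i] = table[i - 1] + table[i - 2];
        }
        return table[n];
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 5; i++) {
            System.out.print(fib(i) + " "); // prints 0 1 1 2 3 5
        }
        System.out.println();
    }
}
```

Each table entry is computed once, which is where the O(N) running time comes from. Only the last two entries are ever read, so the array could be replaced by two variables if memory matters.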
  • 11. Longest common subsequence • Goal: find the longest subsequence common to two or more sequences • String1: “AGCAT” • String2: “GAC” • Common subsequences: “A”, “C”, “G”, “AC”, “GA”, “GC” • LCS (length 2): “AC”, “GA”, or “GC” • Use a table to find the LCS: • First column: string1 (“AGCAT”) • First row: string2 (“GAC”) • Table[i, j]: LCS of string1.substring(0, i) and string2.substring(0, j)
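The table described above can be sketched as follows. This version stores only the LCS length in each cell; the class and method names are hypothetical, and the recurrence is the standard one for LCS rather than something quoted from the slides.

```java
// LCS length via a 2D table: table[i][j] is the LCS length of
// s1.substring(0, i) and s2.substring(0, j).
class LcsTable {
    static int lcsLength(String s1, String s2) {
        // Row 0 and column 0 stay 0: the LCS with an empty prefix is empty.
        int[][] table = new int[s1.length() + 1][s2.length() + 1];
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    // Matching characters extend the LCS of both shorter prefixes.
                    table[i][j] = table[i - 1][j - 1] + 1;
                } else {
                    // Otherwise drop one character from either string.
                    table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
                }
            }
        }
        return table[s1.length()][s2.length()];
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("AGCAT", "GAC")); // prints 2
    }
}
```

To recover an actual subsequence rather than its length, one could walk back from the bottom-right cell, following the choice made at each step.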
  • 16. Wildcard matching • Linux command line: user@bash: ls b* lists barry.txt, blan.txt, bob.txt • Complicated example: string = "adcab", pattern = "*a*b" • DP solution: • Definition: table[i][j] is true if the first i characters of the string match the first j characters of the pattern • Base case: table[0][0] = true; first row: table[0][j + 1] = table[0][j] if pattern[j] equals * • Induction rule: (1) if string[i] equals pattern[j] or pattern[j] equals ?, then table[i + 1][j + 1] = table[i][j]; (2) if pattern[j] equals *, then table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1] • Table (columns: -, *, a, *, b; rows: -, a, d, c, a, b): row -: T T F F F; remaining rows not yet filled
  • 17. Wildcard matching • Same example, definition, base case, and induction rule as slide 16 • Table (columns: -, *, a, *, b): row -: T T F F F; row a: F T T T F; row d: F T F T F; row c: F T F T F; row a: F T T (in progress); row b: not yet filled
  • 18. Wildcard matching • Same example, definition, base case, and induction rule as slide 16 • Table (columns: -, *, a, *, b): row -: T T F F F; row a: F T T T F; row d: F T F T F; row c: F T F T F; row a: F T T T F; row b: F T F T (in progress)
  • 19. Wildcard matching • Same example, definition, base case, and induction rule as slide 16 • Table (columns: -, *, a, *, b): row -: T T F F F; row a: F T T T F; row d: F T F T F; row c: F T F T F; row a: F T T T F; row b: F T F T T • The bottom-right cell is T, so "adcab" matches "*a*b"
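The base case and the two induction rules from the wildcard slides translate almost line for line into code. The sketch below assumes `?` matches exactly one character and `*` matches any sequence, including the empty one; the class and method names are hypothetical.

```java
// Wildcard matching with a boolean table: table[i][j] is true when the first
// i characters of s match the first j characters of p.
class WildcardMatch {
    static boolean isMatch(String s, String p) {
        int m = s.length();
        int n = p.length();
        boolean[][] table = new boolean[m + 1][n + 1];
        table[0][0] = true; // empty pattern matches empty string
        // First row: a leading run of '*' can match the empty string.
        for (int j = 0; j < n; j++) {
            if (p.charAt(j) == '*') {
                table[0][j + 1] = table[0][j];
            }
        }
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                char pc = p.charAt(j);
                if (pc == '*') {
                    // Rule (2): '*' matches nothing, or absorbs s[i].
                    table[i + 1][j + 1] = table[i + 1][j] || table[i][j + 1];
                } else if (pc == '?' || pc == s.charAt(i)) {
                    // Rule (1): one character consumed on both sides.
                    table[i + 1][j + 1] = table[i][j];
                }
                // Otherwise the cell keeps its default value, false.
            }
        }
        return table[m][n];
    }

    public static void main(String[] args) {
        System.out.println(isMatch("adcab", "*a*b")); // prints true
    }
}
```

The answer is read from the bottom-right cell, the same cell that ends up as T in the final table of the slide sequence.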
  • 20. Longest common subsequence and wildcard matching • DP proceeds from the initial condition to the end of the string: • From left to right in each row • From top to bottom in each column • State transition from table[i - 1][j - 1], table[i][j - 1], and table[i - 1][j] to table[i][j] • Each time: move forward by one step • The state at each step is the global optimum for that step • A table (or diagram) is the best tool for simulating the process
  • 21. Matrix chain multiplication • Multiply two matrices: A (10 x 100) and B (100 x 5) • OUT[p][r] += A[p][q] * B[q][r] • Computation = 10 x 100 x 5 = 5000 scalar multiplications • Multiply three matrices: A1 (10 x 100), A2 (100 x 5), and A3 (5 x 50) • ((A1 A2) A3): 10 x 100 x 5 (for A1 A2) + 10 x 5 x 50 = 7500 • (A1 (A2 A3)): 100 x 5 x 50 (for A2 A3) + 10 x 100 x 50 = 75000 • ((A1 A2) A3) needs 10 times fewer scalar multiplications than (A1 (A2 A3))
  • 22. Matrix chain multiplication • How to optimize the chain multiplication of matrices ( A1, A2, A3, …. An) • DP induction rule:
  • 23. Matrix chain multiplication: DP solution • Six matrices multiplication: • State: • M[i, j]: the minimum number of scalar multiplications needed to multiply matrices i through j • S[i, j]: the outermost (last-applied) break point for M[i, j]
  • 24. Matrix chain multiplication: DP solution • Six matrices multiplication:
  • 25. Matrix chain multiplication: DP solution • Six matrices multiplication:
  • 26. Matrix chain multiplication: DP solution • Six matrices multiplication:
  • 27. Matrix chain multiplication: DP solution • Six matrices multiplication:
  • 28. Matrix chain multiplication: DP solution • Six matrices multiplication:
  • 29. Matrix chain multiplication: DP solution • Six matrices multiplication:
  • 30. Matrix chain multiplication: DP solution • Six matrices multiplication: (A1 (A2 A3)) ((A4 A5) A6)
  • 31. Matrix chain multiplication: DP solution • State is hard to define: • M[i, j] • S[i, j] • State transition is complicated: • Filling the table by row and column does not work • Move from previous states to the current state in order of increasing chain length (induction rule)
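The induction rule itself did not survive in this transcript. As a sketch, the code below uses the standard textbook recurrence, M[i, j] = min over i <= k < j of M[i, k] + M[k + 1, j] + p(i-1) * p(k) * p(j), which is assumed here rather than copied from the slides. The class name and the `dims` convention (matrix A_i has size dims[i - 1] x dims[i]) are also assumptions. It is run on the three-matrix example from slide 21.

```java
// Matrix chain multiplication, filled in order of increasing chain length.
// m[i][j]: minimum scalar multiplications for A_i..A_j (1-based indices).
// s[i][j]: split point k of the outermost product (A_i..A_k)(A_{k+1}..A_j).
final class MatrixChain {
    final long[][] m;
    final int[][] s;

    MatrixChain(int[] dims) {
        int n = dims.length - 1; // number of matrices
        m = new long[n + 1][n + 1];
        s = new int[n + 1][n + 1];
        // Chains of length 1 cost 0, which is the array default.
        for (int len = 2; len <= n; len++) {
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                m[i][j] = Long.MAX_VALUE;
                for (int k = i; k < j; k++) {
                    long cost = m[i][k] + m[k + 1][j]
                            + (long) dims[i - 1] * dims[k] * dims[j];
                    if (cost < m[i][j]) {
                        m[i][j] = cost;
                        s[i][j] = k;
                    }
                }
            }
        }
    }

    // Rebuilds the optimal parenthesization from the split table.
    String parens(int i, int j) {
        if (i == j) {
            return "A" + i;
        }
        return "(" + parens(i, s[i][j]) + " " + parens(s[i][j] + 1, j) + ")";
    }

    public static void main(String[] args) {
        MatrixChain mc = new MatrixChain(new int[] {10, 100, 5, 50});
        System.out.println(mc.m[1][3]);      // prints 7500
        System.out.println(mc.parens(1, 3)); // prints ((A1 A2) A3)
    }
}
```

The outer loop over `len` is the "by chain length" ordering: every shorter chain that M[i, j] depends on is already filled in when it is needed, which a plain row-by-row sweep would not guarantee.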
  • 32. Framework of dynamic programming • Three key components of dynamic programming algorithm: • Definition of state • Initial condition (base) • Induction rule (state transition) • Induction rule: difficult to find • 1D/2D table for the thinking process
  • 33. What is part-of-speech tagging? • Identify parts of speech (syntactic categories): This/DET is/VB a/DET simple/ADJ sentence/NOUN • POS tagging is a first step towards syntactic analysis (and, in turn, semantic analysis) • Faster than full parsing • Useful for text classification and word disambiguation • How to decide the correct label: • The word to be labeled: chair is probably a noun • Labels of surrounding words: if the preceding word is a modal verb (e.g., will), then this word is more likely to be a verb • Hidden Markov models can be used to work on this problem
  • 34. Why is POS tagging hard? • Ambiguity: • glass of water/NOUN vs. water/VERB the plants • lie/VERB down vs. tell a lie/NOUN • wind/VERB down vs. a mighty wind/NOUN (homographs) • How about “time flies like an arrow”? • Sparse data: • Words we haven’t seen before • Word-tag pairs we haven’t seen before
  • 35. Example transition probabilities • Probabilities estimated from the tagged WSJ corpus: • Proper nouns (NNP) often begin sentences: P(NNP|<s>) = 0.28 • Modal verbs (MD) are nearly always followed by bare verbs (VB) • Adjectives (JJ) are often followed by nouns (NN)
  • 36. Example output probabilities • Probabilities estimated from tagged WSJ corpus: • 0.0032% of proper nouns are Janet: P(Janet|NNP) = 0.000032 • About half of determiners (DT) are the. • the can also be a proper noun.
  • 37. Hidden Markov Chain • A set of states (tags) • An output alphabet (words) • Initial state (beginning of sentence) • State transition probabilities ( P(ti|ti-1) ) • Symbol emission probabilities ( P(wi|ti) )
  • 38. Hidden Markov Chain • Model the tagging process: • Sentence: W = (w1, w2, …, wn) • Tags: T = (t1, t2, …, tn) • Joint probability: $P(W, T) = \left[\prod_{i=1}^{n} P(t_i \mid t_{i-1})\, P(w_i \mid t_i)\right] P(\text{</s>} \mid t_n)$, with $t_0 = \text{<s>}$ • Example: • This/DET is/VB a/DET simple/JJ sentence/NN • Add begin-of-sentence (<s>) and end-of-sentence (</s>): P(W, T) = P(DET|<s>) P(VB|DET) P(DET|VB) P(JJ|DET) P(NN|JJ) P(</s>|NN) x P(This|DET) P(is|VB) P(a|DET) P(simple|JJ) P(sentence|NN)
  • 39. Computation estimation of POS • Suppose we have C possible tags for each of the n words in the sentence • There are C^n possible tag sequences: the number grows exponentially in the length n • Viterbi algorithm: use dynamic programming to solve it
  • 40. Viterbi algorithm • Target: argmax_T P(T|W) • Intuition: the best path of length i ending in state t must include the best path of length i – 1 to the previous state • Use a table to store the partial results: • T x N table (T tags, N words); v(t, i) is the probability of the best state sequence for w1 … wi ending in state t • Fill in columns from left to right; the max is over each possible previous state t’: v(t, i) = max_{t’} { v(t’, i – 1) P(t|t’) P(wi|t) }
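The Viterbi recurrence can be sketched as below. The three tags and all probabilities in `main` are invented toy values for illustration only, not the WSJ estimates or the numbers from the case-study slides. The end-of-sentence factor P(</s>|t_n) is left out to mirror the recurrence as stated. The names `ViterbiSketch`, `decode`, and `back` are hypothetical.

```java
import java.util.Arrays;

// Viterbi decoding: v[t][i] is the probability of the best tag sequence for
// words 0..i that ends in tag t; back[t][i] remembers the previous tag.
class ViterbiSketch {
    // words: word indices (assumes at least one word); start[t] = P(t | <s>);
    // trans[prev][t] = P(t | prev); emit[t][w] = P(w | t).
    static int[] decode(int[] words, double[] start, double[][] trans, double[][] emit) {
        int numTags = start.length;
        int n = words.length;
        double[][] v = new double[numTags][n];
        int[][] back = new int[numTags][n];
        for (int t = 0; t < numTags; t++) {
            v[t][0] = start[t] * emit[t][words[0]]; // initial column
        }
        for (int i = 1; i < n; i++) { // fill columns left to right
            for (int t = 0; t < numTags; t++) {
                for (int prev = 0; prev < numTags; prev++) {
                    double p = v[prev][i - 1] * trans[prev][t] * emit[t][words[i]];
                    if (p > v[t][i]) {
                        v[t][i] = p;
                        back[t][i] = prev;
                    }
                }
            }
        }
        int best = 0; // best tag for the last word
        for (int t = 1; t < numTags; t++) {
            if (v[t][n - 1] > v[best][n - 1]) {
                best = t;
            }
        }
        int[] tags = new int[n];
        tags[n - 1] = best;
        for (int i = n - 1; i > 0; i--) { // follow back-pointers
            tags[i - 1] = back[tags[i]][i];
        }
        return tags;
    }

    public static void main(String[] args) {
        // Toy model (invented numbers). Tags: 0 = DT, 1 = NN, 2 = VB.
        // Words: 0 = "the", 1 = "doctor", 2 = "is".
        double[] start = {0.6, 0.3, 0.1};
        double[][] trans = {{0.1, 0.8, 0.1}, {0.1, 0.3, 0.6}, {0.5, 0.3, 0.2}};
        double[][] emit = {{0.9, 0.05, 0.05}, {0.05, 0.8, 0.15}, {0.05, 0.25, 0.7}};
        int[] tags = decode(new int[] {0, 1, 2}, start, trans, emit);
        System.out.println(Arrays.toString(tags)); // prints [0, 1, 2], i.e. DT NN VB
    }
}
```

The three nested loops give O(N x T^2) time instead of enumerating all T^N tag sequences. For long sentences, products of small probabilities can underflow, so a practical version would typically sum log-probabilities instead.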
  • 42. Viterbi algorithm: case study • W = the doctor is in.
  • 43. Viterbi algorithm: case study • W = the doctor is in.
  • 44. Viterbi algorithm: case study • W = the doctor is in.
  • 45. Viterbi algorithm: case study • W = the doctor is in.
  • 46. Viterbi algorithm: case study • W = the doctor is in.
  • 47. Viterbi algorithm: case study • W = the doctor is in.
  • 48. Viterbi algorithm: case study • W = the doctor is in.
  • 50. Dynamic programming: take-home message • Why fast: use memory to store partial result • DP algorithm component: state definition, initial condition, and induction rule • Solve DP problem with a table
  • 51. Top ten DP problems • Longest common subsequence • Shortest common supersequence • Longest increasing subsequence • Edit distance • Matrix chain multiplication • 0-1 knapsack problem • Partition problem • Rod cutting • Coin change problem • Word break problem
  • 52. References • http://people.cs.georgetown.edu/nschneid/cosc572/f16/12_viterbi_slides.pdf • https://en.wikipedia.org/wiki/Dynamic_programming • https://medium.com/@codingfreak/top-10-dynamic-programming-problems-5da486eeb360 • https://leetcode.com/problems/wildcard-matching/description/ • https://en.wikipedia.org/wiki/Longest_common_subsequence_problem