TREE PRUNING
BY SHIVANGI GUPTA
OVERVIEW
• Decision Tree
• Why Tree Pruning?
• Types of Tree Pruning
• Reduced Error Pruning
• Comparison
• References
INTRODUCTION
• Decision trees are built to classify the items in a data set.
• Two problems can arise while building a classifier:
1. Underfitting
2. Overfitting
• Underfitting arises when both the training error and the test error are large. This happens when the model is too simple.
• Overfitting arises when the training error is small but the test error is large.
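The two definitions above can be sketched as a small check on a model's training and test error rates. This is only an illustration: the function names and the thresholds `high` and `gap` are assumptions chosen for the example, not standard values.

```python
def error_rate(predictions, labels):
    """Fraction of predictions that disagree with the true labels."""
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


def diagnose(train_error, test_error, high=0.25, gap=0.15):
    """Label a model from its two error rates.

    `high` and `gap` are illustrative thresholds (assumptions),
    not values taken from the slides.
    """
    # Both errors large: the model is too simple.
    if train_error >= high and test_error >= high:
        return "underfitting"
    # Small training error but much larger test error.
    if test_error - train_error >= gap:
        return "overfitting"
    return "reasonable fit"


print(diagnose(0.40, 0.42))
print(diagnose(0.02, 0.30))
```

In practice the thresholds depend on the problem; the point is only that the two error rates have to be compared with each other, not read in isolation.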
OVERFITTING
• Overfitting produces decision trees that are more complex than necessary.
• Training error then no longer provides a good estimate of how well the tree will perform on previously unseen records.
• Error must therefore be estimated in another way, for example on held-out data.
How can overfitting be addressed?
“Tree Pruning”
WHAT IS PRUNING?
• Pruning is the process of adjusting a decision tree, by limiting or removing subtrees, to minimize its misclassification error.
• Pruning can be done in two ways:
1. Prepruning
2. Postpruning
PREPRUNING
• Prepruning halts subtree construction at a node after checking some measure of split quality.
• The measure can be information gain, the Gini index, etc.
• If partitioning the tuples at a node would produce a split whose measure falls below a prespecified threshold, further partitioning is halted and the node becomes a leaf.
• Risk of early stopping: prepruning may halt the growth process prematurely.
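The threshold test described above can be sketched with information gain as the measure. The helper names and the default `min_gain=0.1` are assumptions made for this example; a real tree learner would pick its own measure and threshold.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(ch) / total * entropy(ch) for ch in children)
    return entropy(parent) - weighted


def should_split(parent, children, min_gain=0.1):
    """Prepruning test: split only if the gain reaches the threshold.

    `min_gain` is a hypothetical threshold chosen for illustration.
    """
    return information_gain(parent, children) >= min_gain


parent = ["yes"] * 4 + ["no"] * 4
good_split = [["yes"] * 4, ["no"] * 4]                              # pure children
weak_split = [["yes", "yes", "no", "no"], ["yes", "yes", "no", "no"]]  # no change

print(should_split(parent, good_split))
print(should_split(parent, weak_split))
```

With these inputs the first split separates the classes completely and passes the test, while the second leaves the class mix unchanged and is rejected, so the node would become a leaf.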
POSTPRUNING
• Grow the decision tree to its full size.
• Trim the nodes of the tree in a bottom-up fashion. A node is pruned by replacing the subtree rooted at it with a leaf.
• If the error improves after trimming, replace the subtree with a leaf node.
REDUCED ERROR PRUNING
• The idea is to hold out some of the available instances, the “pruning set”, when the tree is built.
• Prune the tree until the classification error on these independent instances starts to increase.
• Because the pruning set is not used for building the decision tree, it provides a less biased estimate of the tree's error rate on future instances than the training data does.
• Reduced error pruning is done in a bottom-up fashion.
• Criterion: compare the error a node would make as a leaf (parent error) with the error of the subtree below it (child error), both measured on the pruning set.
If Parent(error) < Child(error) then prune,
else do not prune.
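A minimal sketch of this bottom-up procedure follows. The `Node` class, the row format (a dict of attribute values paired with a label), and the function names are all assumptions made for the example. It applies the strict criterion stated above; a variant that also prunes on ties would use `<=` instead.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                # majority training class at this node
    attribute: str = None     # attribute tested here (None for a leaf)
    children: dict = field(default_factory=dict)  # attribute value -> Node

    def is_leaf(self):
        return not self.children


def predict(node, row):
    """Follow the tests down the tree; fall back to the node label if stuck."""
    while not node.is_leaf():
        child = node.children.get(row[node.attribute])
        if child is None:
            break
        node = child
    return node.label


def errors(node, rows):
    """Number of pruning-set rows that the (sub)tree misclassifies."""
    return sum(1 for row, y in rows if predict(node, row) != y)


def reduced_error_prune(node, rows):
    """Prune bottom-up using only the pruning-set rows that reach this node."""
    if node.is_leaf():
        return node
    # Process the successors first (bottom-up order).
    for value, child in list(node.children.items()):
        subset = [(r, y) for r, y in rows if r[node.attribute] == value]
        node.children[value] = reduced_error_prune(child, subset)
    child_error = errors(node, rows)                              # subtree error
    parent_error = sum(1 for _, y in rows if y != node.label)     # error as a leaf
    if parent_error < child_error:
        return Node(label=node.label)   # replace the subtree with a leaf
    return node


# Hypothetical tree: the root tests "a"; the a=0 branch tests "b".
tree = Node("yes", "a", {
    0: Node("yes", "b", {0: Node("yes"), 1: Node("no")}),
    1: Node("no"),
})
pruning_set = [
    ({"a": 0, "b": 0}, "yes"),
    ({"a": 0, "b": 1}, "yes"),
    ({"a": 1, "b": 0}, "no"),
]
pruned = reduced_error_prune(tree, pruning_set)
print(pruned.children[0].is_leaf(), pruned.is_leaf())
```

In this small example the inner node makes one error as a subtree and none as a leaf, so it is pruned, while the root makes no errors as a subtree and one as a leaf, so it stays in place.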
EXAMPLE
Pruning set
STEPS
• In each tree, the number of pruning-set instances misclassified by each individual node is given in parentheses.
• The tree is assumed to be traversed left to right.
• The pruning procedure first considers for removal the subtree attached to node 3.
• Because the subtree's error on the pruning data (1 error) exceeds the error of node 3 itself (0 errors), node 3 is converted to a leaf.
• Next, node 6 is replaced by a leaf for the same reason.
• Having processed both of its successors, the pruning procedure then considers node 2 for deletion. However, because the subtree attached to node 2 makes fewer mistakes (0 errors) than node 2 itself (1 error), the subtree remains in place.
• Next, the subtree extending from node 9 is considered for pruning and is replaced by a leaf.
• In the last step, node 1 is considered for pruning, which leaves the tree unchanged.
COMPARISON
• Prepruning is faster than postpruning because it does not need to wait for the complete decision tree to be constructed.
• Postpruning is nevertheless preferable to prepruning because of the “interaction effect”.
• Interaction effects are effects that arise only when several attributes are considered together.
• Prepruning suppresses growth by evaluating each attribute individually, so it might overlook effects that are due to the interaction of several attributes and stop too early. Postpruning, on the other hand, avoids this problem because interaction effects are visible in the fully grown tree.
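A small XOR-style data set illustrates the interaction effect. The data and helper names below are made up for the example: each attribute on its own has zero information gain, so a prepruning threshold would stop at the root, yet after splitting on `a` the attribute `b` separates the classes perfectly.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain(rows, attribute):
    """Information gain of splitting (features, label) rows on one attribute."""
    labels = [y for _, y in rows]
    groups = {}
    for x, y in rows:
        groups.setdefault(x[attribute], []).append(y)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Illustrative data: the label is 1 exactly when a and b differ (XOR).
xor_rows = [
    ({"a": 0, "b": 0}, 0),
    ({"a": 0, "b": 1}, 1),
    ({"a": 1, "b": 0}, 1),
    ({"a": 1, "b": 1}, 0),
]

gain_a = gain(xor_rows, "a")   # each attribute alone is uninformative
gain_b = gain(xor_rows, "b")

# Inside the a == 0 branch, b becomes fully informative.
a0_rows = [(x, y) for x, y in xor_rows if x["a"] == 0]
gain_b_given_a0 = gain(a0_rows, "b")

print(gain_a, gain_b, gain_b_given_a0)
```

A fully grown tree would make both splits and classify every row correctly, which is the situation postpruning gets to evaluate and prepruning never reaches.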
