2. OVERVIEW
Decision Tree
Why Tree Pruning?
Types of Tree pruning
Reduced Error pruning
Comparison
References
3. INTRODUCTION
Decision trees are built to classify an item
set.
While classifying we meet two problems:
1. Underfitting.
2. Overfitting.
4. The underfitting problem arises when both the
training errors and test errors are large.
This happens when the developed model is
made too simple.
The overfitting problem arises when the
training errors are small but the test errors are
large.
6. OVERFITTING
Overfitting results in decision trees that are more
complex than necessary.
Training error no longer provides a good estimate
of how well the tree will perform on previously
unseen records, so new ways of estimating errors
are needed.
9. WHAT IS PRUNING?
The process of adjusting a decision tree to minimize
the misclassification error is called pruning.
Pruning can be done in 2 ways:
1. Prepruning.
2. Postpruning.
10. PREPRUNING
Prepruning halts subtree construction at a node
after checking some measure.
These measures can be information gain, the Gini
index, etc.
If partitioning the tuples at a node would result in a
split that falls below a prespecified threshold, then the
split is halted and the node becomes a leaf.
Early stopping: prepruning may stop the growth
process prematurely.
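The prepruning check described above can be sketched as follows. This is an illustrative example, not code from the slides; the gain threshold of 0.1 and the candidate split are hypothetical values chosen to show the halting decision.

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(labels, split):
    """Gain from partitioning `labels` into the groups in `split`."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in split)
    return entropy(labels) - remainder

# Prepruning check: halt subtree construction if the best split's
# gain falls below a prespecified threshold (hypothetical value here).
GAIN_THRESHOLD = 0.1

labels = ["yes", "yes", "no", "no"]
split = [["yes", "no"], ["yes", "no"]]   # a split that separates nothing

if information_gain(labels, split) < GAIN_THRESHOLD:
    decision = "halt: make this node a leaf"   # prepruning stops growth here
else:
    decision = "continue splitting"
print(decision)
```

Because the candidate split leaves each partition as mixed as the parent, its gain is zero and the node is turned into a leaf.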
11. POSTPRUNING
Grow the decision tree to its entirety.
Trim the nodes of the decision tree in a
bottom-up fashion. Postpruning is done by
replacing a node with a leaf.
If the error improves after trimming, replace the
subtree by a leaf node.
12. REDUCED ERROR PRUNING
The idea is to hold out some of the available instances (the
"pruning set") when the tree is built.
Prune the tree until the classification error on these independent
instances starts to increase.
The pruning set is not used for building the decision tree, so it
provides a less biased estimate of the tree's error rate on future
instances than the training data does.
Reduced error pruning is done in a bottom-up fashion.
Criterion:
If the error of the parent is less than that of its children, prune
the subtree; otherwise do not.
i.e. if Parent(error) < Child(error) then "Prune",
else don't Prune.
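Under the stated criterion, a minimal sketch of reduced error pruning might look like this. The `Node` class and the error counts below are illustrative assumptions: each node stores how many pruning-set instances it would misclassify as a leaf, and an internal node's subtree error is the sum of its children's (post-pruning) errors.

```python
class Node:
    def __init__(self, leaf_error, children=None):
        self.leaf_error = leaf_error    # pruning-set errors if this node were a leaf
        self.children = children or []  # empty list means it is a leaf

def prune(node):
    """Bottom-up reduced error pruning; returns the subtree's pruning-set error."""
    if not node.children:
        return node.leaf_error
    subtree_error = sum(prune(child) for child in node.children)
    if node.leaf_error < subtree_error:   # Parent(error) < Child(error)
        node.children = []                # replace the subtree by a leaf
        return node.leaf_error
    return subtree_error

# The child is perfect as a leaf (0 errors) but its subtree makes 1 error,
# so the subtree is pruned; the root's own error (2) is worse than its
# pruned subtree's error (0), so the root keeps its children.
child = Node(0, [Node(1), Node(0)])
root = Node(2, [child, Node(0)])
total_error = prune(root)
```

After the call, `child` has become a leaf while `root` is unchanged, and `total_error` is the tree's error on the pruning set.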
15. STEPS
In each tree, the number of instances in the pruning data
that are misclassified by the individual nodes are given in
parentheses.
Assume that the tree is traversed left-to-right.
The pruning procedure first considers for removal the
subtree attached to node 3.
Because the subtree's error on the pruning data (1 error)
exceeds the error of node 3 itself (0 errors), node 3 is
converted to a leaf.
Next, node 6 is replaced by a leaf for the same reason.
16. Having processed both of its successors, the pruning
procedure then considers node 2 for deletion.
However, because the subtree attached to node 2
makes fewer mistakes (0 errors) than node 2 itself (1
error), the subtree remains in place.
Next, the subtree extending from node 9 is
considered for pruning and is replaced by a leaf.
In the last step, node 1 is considered for pruning,
leaving the tree unchanged.
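The walkthrough above can be replayed in code on a hypothetical tree: the node numbers match the slides, but the leaf-level error counts are invented here so that each node's narrated totals come out right (the actual tree from the slides' figures is not reproduced).

```python
class Node:
    def __init__(self, nid, leaf_error, children=None):
        self.nid = nid                  # node number from the slides (None = anonymous leaf)
        self.leaf_error = leaf_error    # pruning-set errors if made a leaf
        self.children = children or []

def prune(node, pruned_at):
    """Bottom-up, left-to-right pruning; records which nodes became leaves."""
    if not node.children:
        return node.leaf_error
    subtree_error = sum(prune(child, pruned_at) for child in node.children)
    if node.leaf_error < subtree_error:   # parent error < subtree error
        node.children = []
        pruned_at.append(node.nid)
        return node.leaf_error
    return subtree_error

# Anonymous leaves carry errors chosen to reproduce the narrated totals.
node3 = Node(3, 0, [Node(None, 1), Node(None, 0)])  # subtree: 1 error, node: 0
node6 = Node(6, 0, [Node(None, 1)])                 # subtree: 1 error, node: 0
node2 = Node(2, 1, [node3, node6])                  # node as leaf: 1 error
node9 = Node(9, 0, [Node(None, 1), Node(None, 0)])  # subtree: 1 error, node: 0
node1 = Node(1, 2, [node2, node9])                  # root

pruned_at = []
prune(node1, pruned_at)
print(pruned_at)
```

As in the slides, nodes 3, 6, and 9 are converted to leaves in that order, while nodes 2 and 1 keep their subtrees.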
21. COMPARISON
Prepruning is faster than postpruning since it doesn't need to
wait for the complete construction of the decision tree.
Still, postpruning is preferable to prepruning because of
"interaction effects".
These are effects which arise from the interaction of several
attributes.
Prepruning suppresses growth by evaluating each attribute
individually, and so might overlook effects that are due to the
interaction of several attributes and stop too early.
Postpruning, on the other hand, avoids this problem because
interaction effects are visible in the fully grown tree.