ITEM ANALYSIS AND
VALIDATION
Mark Leonard Tan
Verena Gonzales
Ann Creia Tupasi
Ramil Cabañesas
Introduction
The teacher normally prepares a draft of
the test. Such a draft is subjected to item
analysis and validation to ensure that the final
version of the test would be useful and
functional.
Phases of preparing a test
 Try-out phase
 Item analysis phase
 Item revision phase
Item Analysis
 There are two important characteristics of an
item that will be of interest to the teacher:
 Item Difficulty
 Discrimination Index
 Item difficulty, or the difficulty of an item, is
defined as the number of students who are able
to answer the item correctly divided by the total
number of students. Thus:
Item difficulty = number of students with the correct answer / total number of students
The item difficulty is usually expressed as a percentage.
Example:
What is the item difficulty index of an item if 25
students are unable to answer it correctly while 75
answered it correctly?
Here the total number of students is 100, hence,
the item difficulty index is 75/100 or 75%.
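The computation above can be sketched as a small Python function (the function name is illustrative):

```python
def difficulty_index(num_correct, num_students):
    """Proportion of students who answered the item correctly."""
    if num_students <= 0:
        raise ValueError("total number of students must be positive")
    return num_correct / num_students

# 75 of 100 students answered the item correctly
print(difficulty_index(75, 100))  # 0.75, i.e. 75%
```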
One problem with this type of difficulty
index is that it may not actually indicate
that the item is difficult or easy. A student
who does not know the subject matter will
naturally be unable to answer the item
correctly even if the question is easy. How
do we decide on the basis of this index
whether the item is too difficult or too
easy?
Range of difficulty index   Interpretation     Action
0 – 0.25                    Difficult          Revise or discard
0.26 – 0.75                 Right difficulty   Retain
0.76 and above              Easy               Revise or discard
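The rule of thumb above translates directly into a lookup function; a minimal sketch (names are illustrative):

```python
def interpret_difficulty(p):
    """Classify a difficulty index using the rule-of-thumb table above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("difficulty index must lie between 0 and 1")
    if p <= 0.25:
        return ("Difficult", "Revise or discard")
    if p <= 0.75:
        return ("Right difficulty", "Retain")
    return ("Easy", "Revise or discard")

print(interpret_difficulty(0.75))  # ('Right difficulty', 'Retain')
```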
 Difficult items tend to discriminate between
those who know and those who do not know
the answer.
 Easy items cannot discriminate between those
two groups of students.
 We are therefore interested in deriving a
measure that will tell us whether an item can
discriminate between these two groups of
students. Such a measure is called an index of
discrimination.
An easy way to derive such a measure is to
measure how difficult an item is with
respect to those in the upper 25% of the
class and how difficult it is with respect to
those in the lower 25% of the class. If the
upper 25% of the class found the item easy
yet the lower 25% found it difficult, then
the item can discriminate properly
between these two groups. Thus:
Index of discrimination = DU – DL
Example: Obtain the index of discrimination of an
item if the upper 25% of the class had a difficulty
index of 0.60 (i.e. 60% of the upper 25% got the
correct answer) while the lower 25% of the class
had a difficulty index of 0.20.
DU = 0.60 while DL = 0.20, thus index of
discrimination = .60 - .20 = .40.
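As a quick sketch in Python (names are illustrative), the index is just the difference of the two group difficulty indices; the result is rounded because subtracting decimal fractions in floating point can leave tiny rounding residue:

```python
def discrimination_index(du, dl):
    """Index of discrimination: difficulty index in the upper 25% (DU)
    minus difficulty index in the lower 25% (DL)."""
    return du - dl

print(round(discrimination_index(0.60, 0.20), 2))  # 0.4
```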
 Theoretically, the index of discrimination can
range from -1.0 (when DU =0 and DL = 1) to 1.0
(when DU = 1 and DL = 0)
 When the index of discrimination is equal to -1,
then this means that all of the lower 25% of the
students got the correct answer while all of the
upper 25% got the wrong answer. In a sense,
such an index discriminates correctly between
the two groups but the item itself is highly
questionable.
 On the other hand, if the index of
discrimination is 1.0, then this means that
all of the lower 25% failed to get the correct
answer while all of the upper 25% got the
correct answer. This is a perfectly
discriminating item and is the ideal item
that should be included in the test.
 As in the case of the difficulty index, we have
the following rule of thumb:

Index Range    Interpretation                                   Action
-1.0 to -.50   Can discriminate, but the item is questionable   Discard
-.55 to .45    Non-discriminating                               Revise
.46 to 1.0     Discriminating item                              Include
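This rule of thumb can likewise be sketched as a lookup function. Note that the table's band boundaries overlap slightly (-.50 vs. -.55, .45 vs. .46), so the sketch below assumes -0.50 and 0.46 as the cut points:

```python
def interpret_discrimination(d):
    """Classify an index of discrimination per the rule of thumb above.
    The table's band boundaries overlap slightly, so -0.50 and 0.46
    are assumed as the cut points here."""
    if not -1.0 <= d <= 1.0:
        raise ValueError("index of discrimination must lie between -1 and 1")
    if d <= -0.50:
        return "Discard (discriminates, but the item is questionable)"
    if d < 0.46:
        return "Revise (non-discriminating)"
    return "Include (discriminating item)"

print(interpret_discrimination(0.50))  # Include (discriminating item)
```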
Example: Consider a multiple-choice test item for
which the following data were obtained:

Item 1      A    B*   C    D
Total       0    40   20   20
Upper 25%   0    15   5    0
Lower 25%   0    5    10   5

The correct response is B. Let us compute the
difficulty index and the index of discrimination.

Difficulty index = no. of students getting the correct answer / total
= 40/80
= 50%, within the range of a “good item”

The discrimination index can be computed similarly:

DU = no. of students in the upper 25% with the correct response / no. of students in the upper 25%
= 15/20 = .75 or 75%

DL = no. of students in the lower 25% with the correct response / no. of students in the lower 25%
= 5/20 = .25 or 25%

Discrimination index = DU – DL
= .75 – .25
= .50 or 50%

Thus, the item also has good discriminating power.
It is also instructive to note that distracter A
is not an effective distracter, since it was never
selected by the students. Distracters C and D
appear to have good appeal as distracters.
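The whole analysis of this item can be reproduced from the per-option counts; a small Python sketch (function and variable names are illustrative). Note that the tabulated counts total 80 students, so the upper and lower groups of 20 each correspond to 25% of the class:

```python
def analyze_item(counts_total, counts_upper, counts_lower, key):
    """Item analysis from per-option response counts.

    Each counts_* dict maps an option letter to the number of students
    choosing it; key is the correct option."""
    n_total = sum(counts_total.values())
    n_upper = sum(counts_upper.values())
    n_lower = sum(counts_lower.values())
    difficulty = counts_total[key] / n_total
    discrimination = counts_upper[key] / n_upper - counts_lower[key] / n_lower
    # A distracter chosen by no one contributes nothing to the item.
    dead_distracters = [opt for opt, n in counts_total.items()
                        if opt != key and n == 0]
    return difficulty, discrimination, dead_distracters

# Data from the worked example (correct response: B)
total = {"A": 0, "B": 40, "C": 20, "D": 20}
upper = {"A": 0, "B": 15, "C": 5, "D": 0}
lower = {"A": 0, "B": 5, "C": 10, "D": 5}
diff, disc, dead = analyze_item(total, upper, lower, "B")
print(diff, disc, dead)  # 0.5 0.5 ['A']
```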
Basic Item Analysis
Statistics
The Michigan State University
Measurement and Evaluation Department
reports a number of item statistics which aid in
evaluating the effectiveness of an item.
Index of Difficulty – the proportion of the
total group who got the item wrong. Thus a
high index indicates a difficult item and a low
index indicates an easy item.
Index of Discrimination – is the difference
between the proportion of the upper group who
got an item right and the proportion of the lower
group who got the item right.
More Sophisticated
Discrimination Index
 Item Discrimination refers to the ability of an
item to differentiate among students on the
basis of how well they know the material
being tested.
 A good item is one that has good
discriminating ability and has a sufficient
level of difficulty (not too difficult nor too
easy).
 At the end of the item analysis report, test items
are listed according to their degrees of difficulty
(easy, medium, hard) and discrimination (good,
fair, poor). These distributions provide a quick
overview of the test and can be used to identify
items which are not performing well and which
can perhaps be improved or discarded.
The item-analysis procedure for norm-referenced
tests provides the following information:
1. The difficulty of an item
2. The discriminating power of an item
3. The effectiveness of each alternative
Benefits derived from Item Analysis
1. It provides useful information for class
discussion of the test.
2. It provides data which helps students improve
their learning.
3. It provides insights and skills that lead to the
preparation of better tests in the future.
Index of Difficulty
0.00 – 0.20 = very difficult
0.21 – 0.80 = moderately difficult
0.81 – 1.00 = very easy

Index of Item Discriminating
Power
The discriminating power of an item is reported as
a decimal fraction; maximum discriminating power
is indicated by an index of 1.00.
Maximum discrimination is usually found at the 50
per cent level of difficulty.
Validation
 After performing the item analysis and
revising the items which need revision, the
next step is to validate the instrument.
 The purpose of validation is to determine the
characteristics of the whole test itself,
namely, the validity and reliability of the test.
 Validation is the process of collecting and
analysing evidence to support the
meaningfulness and usefulness of the test.
Validity
 is the extent to which a test measures what it
purports to measure; it refers to the
appropriateness, correctness, meaningfulness,
and usefulness of the specific decisions a
teacher makes based on the test results.
There are three main types of
evidence that may be
collected:
1. Content-related evidence of validity
2. Criterion-related evidence of validity
3. Construct-related evidence of validity
Content-related evidence of
validity
 refers to the content and format of the
instrument.
 How appropriate is the content?
 How comprehensive?
 Does it logically get at the intended variable?
 How adequately does the sample of items or
questions represent the content to be assessed?
Criterion-related evidence of
validity
 refers to the relationship between scores
obtained using the instrument and scores
obtained using one or more other tests (often
called the criterion).
 How strong is this relationship?
 How well do such scores estimate present or
predict future performance of a certain type?
Construct-related evidence of
validity
 refers to the nature of the psychological
construct or characteristic being measured by
the test.
 How well does a measure of the construct explain
differences in the behaviour of the individuals or
their performance on a certain task?
Usual procedure for
determining content validity
 The teacher writes out objectives based on the TOS.
 The teacher gives the objectives and TOS to two
experts, along with a description of the test takers.
 The experts look at the objectives, read over
the items in the test, and place a check mark
in front of each question or item that they
feel does NOT measure one or more
objectives.
 They also place a check mark in front of each
objective NOT assessed by any item in the
test.
 The teacher then rewrites any item so
checked and resubmits to experts and/or
writes new items to cover those objectives
not heretofore covered by the existing test.
 This continues until the experts approve all
items and agree that all of the objectives are
sufficiently covered by the test.
Obtaining Evidence for
criterion-related Validity
 The teacher usually compares scores on the
test in question with scores on some
other independent criterion test which
presumably already has high validity
(concurrent validity).
 Another type of validity is called
predictive validity, wherein the test scores on
the instrument are correlated with scores on
some later performance measure.
Gronlund's Expectancy Table

            Grade Point Average
Test Score   Very Good   Good   Needs Improvement
High         20          10     5
Average      10          25     5
Low          1           10     14
 The expectancy table shows that 20 students
obtained high test scores and were subsequently
rated very good in terms of their final grades;
 and 14 students obtained low test
scores and were later graded as needing
improvement.
 The evidence for this particular test tends to
indicate that students getting high scores on it
would later be rated very good; students with
average scores would later be rated good; and
students getting low scores on the test would
later be graded as needing improvement.
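An expectancy table like Gronlund's is simply a cross-tabulation of test-score levels against later grade levels. A minimal sketch (the records list is hypothetical data that reproduces the table above):

```python
from collections import Counter

def expectancy_table(pairs, score_levels, grade_levels):
    """Cross-tabulate (test-score level, later-grade level) pairs."""
    counts = Counter(pairs)
    return {s: {g: counts[(s, g)] for g in grade_levels} for s in score_levels}

# Hypothetical records: one (test-score level, final-grade level) per student
records = ([("High", "Very Good")] * 20 + [("High", "Good")] * 10 +
           [("High", "Needs Improvement")] * 5 +
           [("Average", "Very Good")] * 10 + [("Average", "Good")] * 25 +
           [("Average", "Needs Improvement")] * 5 +
           [("Low", "Very Good")] * 1 + [("Low", "Good")] * 10 +
           [("Low", "Needs Improvement")] * 14)

table = expectancy_table(records, ["High", "Average", "Low"],
                         ["Very Good", "Good", "Needs Improvement"])
print(table["Low"]["Needs Improvement"])  # 14
```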
Reliability
 Refers to the consistency of the scores
obtained – how consistent they are for each
individual from one administration of an
instrument to another and from one set of
items to another.
 We already have the formulas for computing
the reliability of a test; for internal
consistency, for instance, we could use the
split-half method or the Kuder-Richardson
formulae:
KR-20 or KR-21
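KR-20 is computed as (k/(k−1))(1 − Σpq/σ²), where k is the number of items, p is the proportion answering each item correctly, q = 1 − p, and σ² is the variance of total scores. A sketch in Python from a matrix of 0/1 item scores (the sample matrix is illustrative):

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 from a 0/1 score matrix
    (rows = students, columns = items)."""
    n_students = len(item_matrix)
    k = len(item_matrix[0])
    # p = proportion correct per item; q = 1 - p
    p = [sum(row[j] for row in item_matrix) / n_students for j in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    # variance of the students' total scores
    totals = [sum(row) for row in item_matrix]
    mean = sum(totals) / n_students
    variance = sum((t - mean) ** 2 for t in totals) / n_students
    return (k / (k - 1)) * (1 - pq_sum / variance)

# Illustrative data: 4 students x 4 items, scored 1 (correct) / 0 (wrong)
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))  # 0.87
```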
 Reliability and validity are related concepts. If
an instrument is unreliable, it cannot yield valid
outcomes.
 As reliability improves, validity may
improve (or may not).
 However, if an instrument is shown
scientifically to be valid then it is almost
certain that it is also reliable.
 The following table is a standard followed
almost universally in educational tests and
measurement:
Reliability      Interpretation
.90 and above    Excellent reliability; at the level of the best standardized tests.
.80 – .90        Very good for a classroom test.
.70 – .80        Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 – .70        Somewhat low. This test should be supplemented by other measures (e.g., more tests) for grading.
.50 – .60        Suggests need for revision of the test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
Below .50        Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.

Weitere ähnliche Inhalte

Was ist angesagt?

Assessment of Learning Outcomes in the k to 12 program
Assessment of Learning Outcomes in the k to 12 programAssessment of Learning Outcomes in the k to 12 program
Assessment of Learning Outcomes in the k to 12 programKerwin Palpal
 
Grading and reporting
Grading and reportingGrading and reporting
Grading and reportingReynel Dan
 
Types of test questions
Types of test questionsTypes of test questions
Types of test questionsMa Tamonte
 
Curriculum models (Philippines' Curriculum Models)
Curriculum models (Philippines' Curriculum Models)Curriculum models (Philippines' Curriculum Models)
Curriculum models (Philippines' Curriculum Models)TeacherAdora
 
Interpretation of Assessment Results
Interpretation of Assessment ResultsInterpretation of Assessment Results
Interpretation of Assessment ResultsRica Joy Pontilar
 
Ed8 Assessment of Learning 2
Ed8 Assessment of Learning 2 Ed8 Assessment of Learning 2
Ed8 Assessment of Learning 2 Eddie Abug
 
Norm referenced grading system
Norm referenced grading systemNorm referenced grading system
Norm referenced grading systemobemrosalia
 
The Nature of Performance-Based Assessment (Assessment of Learning 2)
The Nature of Performance-Based Assessment (Assessment of Learning 2)The Nature of Performance-Based Assessment (Assessment of Learning 2)
The Nature of Performance-Based Assessment (Assessment of Learning 2)iamina
 
Assessment of Learning - Multiple Choice Test
Assessment of Learning - Multiple Choice TestAssessment of Learning - Multiple Choice Test
Assessment of Learning - Multiple Choice TestXiTian Miran
 
Constructing Test Questions and the Table of Specifications (TOS)
Constructing Test Questions and the Table of Specifications (TOS)Constructing Test Questions and the Table of Specifications (TOS)
Constructing Test Questions and the Table of Specifications (TOS)Mr. Ronald Quileste, PhD
 
Chapter 2 types of assesment
Chapter 2 types of assesmentChapter 2 types of assesment
Chapter 2 types of assesmentMaritesMarasigan1
 
Laws related Education
Laws related EducationLaws related Education
Laws related EducationMariz Encabo
 
Norm or criterion referenced grading
Norm or criterion referenced gradingNorm or criterion referenced grading
Norm or criterion referenced gradingArmilyn Nadora
 
Alternative-Response Test
Alternative-Response TestAlternative-Response Test
Alternative-Response TestMD Pits
 

Was ist angesagt? (20)

Assessment of Learning Outcomes in the k to 12 program
Assessment of Learning Outcomes in the k to 12 programAssessment of Learning Outcomes in the k to 12 program
Assessment of Learning Outcomes in the k to 12 program
 
Item analysis
Item analysisItem analysis
Item analysis
 
Grading and reporting
Grading and reportingGrading and reporting
Grading and reporting
 
Types of test questions
Types of test questionsTypes of test questions
Types of test questions
 
Curriculum models (Philippines' Curriculum Models)
Curriculum models (Philippines' Curriculum Models)Curriculum models (Philippines' Curriculum Models)
Curriculum models (Philippines' Curriculum Models)
 
Interpretation of Assessment Results
Interpretation of Assessment ResultsInterpretation of Assessment Results
Interpretation of Assessment Results
 
Ed8 Assessment of Learning 2
Ed8 Assessment of Learning 2 Ed8 Assessment of Learning 2
Ed8 Assessment of Learning 2
 
preparing a TOS
preparing a TOSpreparing a TOS
preparing a TOS
 
Norm referenced grading system
Norm referenced grading systemNorm referenced grading system
Norm referenced grading system
 
Item analysis2
Item analysis2Item analysis2
Item analysis2
 
The Nature of Performance-Based Assessment (Assessment of Learning 2)
The Nature of Performance-Based Assessment (Assessment of Learning 2)The Nature of Performance-Based Assessment (Assessment of Learning 2)
The Nature of Performance-Based Assessment (Assessment of Learning 2)
 
Assessment of learning1
Assessment of learning1Assessment of learning1
Assessment of learning1
 
Assessment of Learning - Multiple Choice Test
Assessment of Learning - Multiple Choice TestAssessment of Learning - Multiple Choice Test
Assessment of Learning - Multiple Choice Test
 
Types of test
Types of testTypes of test
Types of test
 
Constructing Test Questions and the Table of Specifications (TOS)
Constructing Test Questions and the Table of Specifications (TOS)Constructing Test Questions and the Table of Specifications (TOS)
Constructing Test Questions and the Table of Specifications (TOS)
 
Chapter 2 types of assesment
Chapter 2 types of assesmentChapter 2 types of assesment
Chapter 2 types of assesment
 
Laws related Education
Laws related EducationLaws related Education
Laws related Education
 
Norm or criterion referenced grading
Norm or criterion referenced gradingNorm or criterion referenced grading
Norm or criterion referenced grading
 
Alternative-Response Test
Alternative-Response TestAlternative-Response Test
Alternative-Response Test
 
Field study 1 episode 1
Field study 1 episode 1Field study 1 episode 1
Field study 1 episode 1
 

Andere mochten auch

Andere mochten auch (10)

Item analysis
Item analysis Item analysis
Item analysis
 
Item and Distracter Analysis
Item and Distracter AnalysisItem and Distracter Analysis
Item and Distracter Analysis
 
Item analysis ppt
Item analysis pptItem analysis ppt
Item analysis ppt
 
Item Analysis
Item AnalysisItem Analysis
Item Analysis
 
Item analysis with spss software
Item analysis with spss softwareItem analysis with spss software
Item analysis with spss software
 
Test appraisal
Test appraisalTest appraisal
Test appraisal
 
T est item analysis
T est item analysisT est item analysis
T est item analysis
 
Preparing The Table of Specification
Preparing The Table of SpecificationPreparing The Table of Specification
Preparing The Table of Specification
 
Table of Specifications (TOS) and Test Construction Review
Table of Specifications (TOS) and Test Construction ReviewTable of Specifications (TOS) and Test Construction Review
Table of Specifications (TOS) and Test Construction Review
 
Field study 5 assessment
Field study 5 assessmentField study 5 assessment
Field study 5 assessment
 

Ähnlich wie Item analysis and validation

CHAPTER 6 Assessment of Learning 1
CHAPTER 6 Assessment of Learning 1CHAPTER 6 Assessment of Learning 1
CHAPTER 6 Assessment of Learning 1FriasKentOmer
 
430660906-Item-Analysis.pptx
430660906-Item-Analysis.pptx430660906-Item-Analysis.pptx
430660906-Item-Analysis.pptxSultanRitoAnthony
 
item analysis.pptx education pnc item analysis
item analysis.pptx education pnc item analysisitem analysis.pptx education pnc item analysis
item analysis.pptx education pnc item analysisswatisheth8
 
ITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptx
ITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptxITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptx
ITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptxAnielofTandog
 
Development of pyschologica test construction
Development of pyschologica test constructionDevelopment of pyschologica test construction
Development of pyschologica test constructionKiran Dammani
 
Item analysis and validation
Item analysis and validation Item analysis and validation
Item analysis and validation Hazel Roquid
 
Topic 8b Item Analysis
Topic 8b Item AnalysisTopic 8b Item Analysis
Topic 8b Item AnalysisYee Bee Choo
 
evaluations Item Analysis for teachers.pdf
evaluations  Item Analysis for teachers.pdfevaluations  Item Analysis for teachers.pdf
evaluations Item Analysis for teachers.pdfBatMan752678
 
Analyzingandusingtestitemdata 101012035435-phpapp02
Analyzingandusingtestitemdata 101012035435-phpapp02Analyzingandusingtestitemdata 101012035435-phpapp02
Analyzingandusingtestitemdata 101012035435-phpapp02cezz gonzaga
 
Test standardization
Test standardizationTest standardization
Test standardizationKaye Batica
 
TEST ITEM ANALYSIS PRESENTATION 2022.ppt
TEST ITEM ANALYSIS PRESENTATION 2022.pptTEST ITEM ANALYSIS PRESENTATION 2022.ppt
TEST ITEM ANALYSIS PRESENTATION 2022.pptJennilynDescargar
 
Item analysis in education
Item analysis  in educationItem analysis  in education
Item analysis in educationmunsif123
 

Ähnlich wie Item analysis and validation (20)

CHAPTER 6 Assessment of Learning 1
CHAPTER 6 Assessment of Learning 1CHAPTER 6 Assessment of Learning 1
CHAPTER 6 Assessment of Learning 1
 
430660906-Item-Analysis.pptx
430660906-Item-Analysis.pptx430660906-Item-Analysis.pptx
430660906-Item-Analysis.pptx
 
item analysis.pptx education pnc item analysis
item analysis.pptx education pnc item analysisitem analysis.pptx education pnc item analysis
item analysis.pptx education pnc item analysis
 
Item
ItemItem
Item
 
ITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptx
ITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptxITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptx
ITEM-ANALYSIS-AND-VALIDATION-in-assessment-in-learning.pptx
 
Item analysis
Item analysisItem analysis
Item analysis
 
Item Analysis
Item AnalysisItem Analysis
Item Analysis
 
Analyzing and using test item data
Analyzing and using test item dataAnalyzing and using test item data
Analyzing and using test item data
 
Analyzing and using test item data
Analyzing and using test item dataAnalyzing and using test item data
Analyzing and using test item data
 
Analyzing and using test item data
Analyzing and using test item dataAnalyzing and using test item data
Analyzing and using test item data
 
Development of pyschologica test construction
Development of pyschologica test constructionDevelopment of pyschologica test construction
Development of pyschologica test construction
 
Item analysis and validation
Item analysis and validation Item analysis and validation
Item analysis and validation
 
Topic 8b Item Analysis
Topic 8b Item AnalysisTopic 8b Item Analysis
Topic 8b Item Analysis
 
evaluations Item Analysis for teachers.pdf
evaluations  Item Analysis for teachers.pdfevaluations  Item Analysis for teachers.pdf
evaluations Item Analysis for teachers.pdf
 
Analyzingandusingtestitemdata 101012035435-phpapp02
Analyzingandusingtestitemdata 101012035435-phpapp02Analyzingandusingtestitemdata 101012035435-phpapp02
Analyzingandusingtestitemdata 101012035435-phpapp02
 
Item analysis.pptx du
Item analysis.pptx duItem analysis.pptx du
Item analysis.pptx du
 
Test standardization
Test standardizationTest standardization
Test standardization
 
New item analysis
New item analysisNew item analysis
New item analysis
 
TEST ITEM ANALYSIS PRESENTATION 2022.ppt
TEST ITEM ANALYSIS PRESENTATION 2022.pptTEST ITEM ANALYSIS PRESENTATION 2022.ppt
TEST ITEM ANALYSIS PRESENTATION 2022.ppt
 
Item analysis in education
Item analysis  in educationItem analysis  in education
Item analysis in education
 

Kürzlich hochgeladen

GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxMaryGraceBautista27
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 

Kürzlich hochgeladen (20)

GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 

Item analysis and validation

  • 1. ITEM ANALYSIS AND VALIDATION Mark Leonard Tan VerenaGonzales AnnCreiaTupasi Ramil Cabañesas
  • 2. Introduction The teacher normally prepares a draft of the test. Such a draft is subjected to item analysis and validation to ensure that the final version of the test would be useful and functional.
  • 3. Phases of preparing a test  Try-out phase  Item analysis phase  Item revision phase
  • 4. Item Analysis  There are two important characteristics of an item that will be of interest to the teacher:  Item Difficulty  Discrimination Index
  • 5.  Item Difficulty, or the difficulty of an item, is defined as the number of students who are able to answer the item correctly divided by the total number of students. Thus:
     Item difficulty = (number of students with the correct answer) / (total number of students)
     The item difficulty is usually expressed as a percentage.
  • 6. Example: What is the item difficulty index of an item if 25 students are unable to answer it correctly while 75 answered it correctly? Here the total number of students is 100, hence, the item difficulty index is 75/100 or 75%.
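The computation above can be sketched in Python; the function name `difficulty_index` is our own illustrative choice, not from the slides:

```python
def difficulty_index(num_correct: int, num_students: int) -> float:
    """Proportion of students who answered the item correctly."""
    return num_correct / num_students

# 75 of 100 students answered the item correctly, as in the example above.
print(difficulty_index(75, 100))  # 0.75, i.e. 75%
```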
  • 7. One problem with this type of difficulty index is that it may not actually indicate that the item is difficult or easy. A student who does not know the subject matter will naturally be unable to answer the item correctly even if the question is easy. How do we decide on the basis of this index whether the item is too difficult or too easy?
  • 8.
     Range of difficulty index   Interpretation     Action
     0 – 0.25                    Difficult          Revise or discard
     0.26 – 0.75                 Right difficulty   Retain
     0.76 and above              Easy               Revise or discard
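The rule of thumb in the table can be written as a small helper; the function name and the tuple return value are illustrative choices, not from the slides:

```python
def interpret_difficulty(p: float) -> tuple[str, str]:
    """Map a difficulty index (0..1) to the interpretation and action above."""
    if p <= 0.25:
        return ("Difficult", "Revise or discard")
    elif p <= 0.75:
        return ("Right difficulty", "Retain")
    else:
        return ("Easy", "Revise or discard")

print(interpret_difficulty(0.75))  # ('Right difficulty', 'Retain')
```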
  • 9.  Difficult items tend to discriminate between those who know and those who do not know the answer.  Easy items cannot discriminate between these two groups of students.  We are therefore interested in deriving a measure that will tell us whether an item can discriminate between these two groups of students. Such a measure is called an index of discrimination.
  • 10. An easy way to derive such a measure is to measure how difficult an item is with respect to those in the upper 25% of the class and how difficult it is with respect to those in the lower 25% of the class. If the upper 25% of the class found the item easy yet the lower 25% found it difficult, then the item can discriminate properly between these two groups. Thus:
  • 11. Index of discrimination = DU – DL, where DU and DL are the difficulty indices of the upper 25% and the lower 25% of the class, respectively. Example: Obtain the index of discrimination of an item if the upper 25% of the class had a difficulty index of 0.60 (i.e., 60% of the upper 25% got the correct answer) while the lower 25% of the class had a difficulty index of 0.20.
  • 12. DU = 0.60 while DL = 0.20, thus index of discrimination = .60 - .20 = .40.
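The same arithmetic in Python (the function name is ours); the `round` guards against floating-point noise when subtracting decimals:

```python
def discrimination_index(du: float, dl: float) -> float:
    """Index of discrimination: difficulty index of the upper 25%
    minus that of the lower 25%."""
    return du - dl

# DU = 0.60 and DL = 0.20, as in the example above.
print(round(discrimination_index(0.60, 0.20), 2))  # 0.4
```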
  • 13.  Theoretically, the index of discrimination can range from -1.0 (when DU = 0 and DL = 1) to 1.0 (when DU = 1 and DL = 0)  When the index of discrimination is equal to -1, then this means that all of the lower 25% of the students got the correct answer while all of the upper 25% got the wrong answer. In a sense, such an index discriminates correctly between the two groups but the item itself is highly questionable.
  • 14.  On the other hand, if the index of discrimination is 1.0, then this means that all of the lower 25% failed to get the correct answer while all of the upper 25% got the correct answer. This is a perfectly discriminating item and is the ideal item that should be included in the test.  As in the case of the difficulty index, we have the following rule of thumb:
  • 15.
     Index Range    Interpretation                                  Action
     -1.0 to -.50   Can discriminate but the item is questionable   Discarded
     -.55 to .45    Non-discriminating                              Revised
     .46 to 1.0     Discriminating item                             Include
  • 16. Example: Consider a multiple-choice item for which the ff. data were obtained:
     Item 1       A    B*   C    D
     Total        0    40   20   20
     Upper 25%    0    15   5    0
     Lower 25%    0    5    10   5
     The correct response is B. Let us compute the difficulty index and index of discrimination.
  • 17. Difficulty index = (no. of students getting the correct answer) / total = 40/100 = 40%, which is within the range of a “good item”.
  • 18. The discrimination index can similarly be computed:
     DU = (no. of students in the upper 25% with correct response) / (no. of students in the upper 25%) = 15/20 = .75 or 75%
     DL = (no. of students in the lower 25% with correct response) / (no. of students in the lower 25%) = 5/20 = .25 or 25%
     Discrimination index = DU – DL = .75 – .25 = .50 or 50%
     Thus, the item also has “good discriminating power”.
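The whole worked example, including a check for ineffective distracters, can be redone in Python from the option counts; the variable names are ours:

```python
# Option counts for item 1, taken from the table above; B is the keyed answer.
total   = {"A": 0, "B": 40, "C": 20, "D": 20}
upper25 = {"A": 0, "B": 15, "C": 5,  "D": 0}
lower25 = {"A": 0, "B": 5,  "C": 10, "D": 5}
key = "B"
n_students = 100  # total examinees, per the slide

difficulty = total[key] / n_students       # 40/100 = 0.40
du = upper25[key] / sum(upper25.values())  # 15/20 = 0.75
dl = lower25[key] / sum(lower25.values())  # 5/20  = 0.25
discrimination = du - dl                   # 0.50

# A distracter that no one selects (A here) is not doing its job.
ineffective = [opt for opt, n in total.items() if opt != key and n == 0]
print(difficulty, discrimination, ineffective)  # 0.4 0.5 ['A']
```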
  • 19. It is also instructive to note that distracter A is not an effective distracter, since it was never selected by the students. Distracters C and D appear to have good appeal as distracters.
  • 20. Basic Item Analysis Statistics The Michigan State University Measurement and Evaluation Department reports a number of item statistics which aid in evaluating the effectiveness of an item. Index of Difficulty – the proportion of the total group who got the item wrong. Thus a high index indicates a difficult item and a low index indicates an easy item.
  • 21. Index of Discrimination – is the difference between the proportion of the upper group who got an item right and the proportion of the lower group who got the item right.
  • 22. More Sophisticated Discrimination Index  Item Discrimination refers to the ability of an item to differentiate among students on the basis of how well they know the material being tested.  A good item is one that has good discriminating ability and has a sufficient level of difficulty (neither too difficult nor too easy).
  • 23.  At the end of the item analysis report, test items are listed according to their degrees of difficulty (easy, medium, hard) and discrimination (good, fair, poor). These distributions provide a quick overview of the test and can be used to identify items which are not performing well and which can perhaps be improved or discarded.
  • 24. The item-analysis procedure for norm-referenced tests provides the following information: 1. The difficulty of an item 2. The discriminating power of an item 3. The effectiveness of each alternative
  • 25. Benefits derived from Item Analysis 1. It provides useful information for class discussion of the test. 2. It provides data which helps students improve their learning. 3. It provides insights and skills that lead to the preparation of better tests in the future.
  • 27. Index of Item Discriminating Power
  • 29. The discriminating power of an item is reported as a decimal fraction; maximum discriminating power is indicated by an index of 1.00. Maximum discrimination is usually found at the 50 per cent level of difficulty.
     0.00 – 0.20 = very difficult
     0.21 – 0.80 = moderately difficult
     0.81 – 1.00 = very easy
  • 30. Validation  After performing the item analysis and revising the items which need revision, the next step is to validate the instrument.  The purpose of validation is to determine the characteristics of the whole test itself, namely, the validity and reliability of the test.  Validation is the process of collecting and analysing evidence to support the meaningfulness and usefulness of the test.
  • 31. Validity  is the extent to which a test measures what it purports to measure; it refers to the appropriateness, correctness, meaningfulness, and usefulness of the specific decisions a teacher makes based on the test results.
  • 32. There are three main types of evidence that may be collected: 1. Content-related evidence of validity 2. Criterion-related evidence of validity 3. Construct-related evidence of validity
  • 33. Content-related evidence of validity  refers to the content and format of the instrument.  How appropriate is the content?  How comprehensive?  Does it logically get at the intended variable?  How adequately does the sample of items or questions represent the content to be assessed?
  • 34. Criterion-related evidence of validity  refers to the relationship between scores obtained using the instrument and scores obtained using one or more other tests (often called the criterion).  How strong is this relationship?  How well do such scores estimate present or predict future performance of a certain type?
  • 35. Construct-related evidence of validity  refers to the nature of the psychological construct or characteristic being measured by the test.  How well does a measure of the construct explain differences in the behaviour of the individuals or their performance on a certain task?
  • 36. Usual procedure for determining content validity  The teacher writes out the objectives based on the TOS (table of specifications)  Gives the objectives and TOS to 2 experts along with a description of the test takers.  The experts look at the objectives, read over the items in the test, and place a check mark in front of each question or item that they feel does NOT measure one or more objectives.
  • 37. Usual procedure for determining content validity  They also place a check mark in front of each objective NOT assessed by any item in the test.  The teacher then rewrites any item so checked and resubmits to experts and/or writes new items to cover those objectives not heretofore covered by the existing test.
  • 38. Usual procedure for determining content validity  This continues until the experts approve all items and agree that all of the objectives are sufficiently covered by the test.
  • 39. Obtaining Evidence for criterion-related Validity  The teacher usually compares scores on the test in question with scores on some other independent criterion test which presumably already has high validity (concurrent validity).  Another type of validity is called predictive validity, wherein the test scores on the instrument are correlated with scores on some later performance.
  • 40. Gronlund's Expectancy Table
                   Grade Point Average
     Test Score    Very Good   Good   Needs Improvement
     High          20          10     5
     Average       10          25     5
     Low           1           10     14
  • 41.  The expectancy table shows that there were 20 students who got high test scores and were subsequently rated very good in terms of their final grades;  and, finally, 14 students obtained low test scores and were later graded as needing improvement.
  • 42.  The evidence for this particular test tends to indicate that students getting high scores on it would later be graded very good; those with average scores would later be rated good; and students getting low scores on the test would later be graded as needing improvement.
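An expectancy table like Gronlund's above is just a cross-tabulation of test-score level against the later grade. A minimal sketch, with the counts copied from the table:

```python
from collections import Counter

# (test-score level, later grade) pairs reproducing the counts above.
data = (
    [("High", "Very Good")] * 20 + [("High", "Good")] * 10
    + [("High", "Needs Improvement")] * 5
    + [("Average", "Very Good")] * 10 + [("Average", "Good")] * 25
    + [("Average", "Needs Improvement")] * 5
    + [("Low", "Very Good")] * 1 + [("Low", "Good")] * 10
    + [("Low", "Needs Improvement")] * 14
)
table = Counter(data)  # each cell of the expectancy table is one count
print(table[("High", "Very Good")], table[("Low", "Needs Improvement")])  # 20 14
```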
  • 43. Reliability  Refers to the consistency of the scores obtained – how consistent they are for each individual from one administration of an instrument to another and from one set of items to another.
  • 44.  We already have the formulas for computing the reliability of a test; for internal consistency, for instance, we could use the split-half method or the Kuder-Richardson formulae: KR-20 or KR-21
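As a hedged sketch (the slides mention but do not print the formula), KR-20 for dichotomously scored items can be computed as follows; `kr20` is our own helper name, and we use the population variance of total scores, a common convention:

```python
def kr20(scores: list[list[int]]) -> float:
    """KR-20 reliability for a matrix of 0/1 item scores.

    `scores` holds one row per examinee, one 0/1 entry per item.
    """
    k = len(scores[0])                               # number of items
    n = len(scores)                                  # number of examinees
    totals = [sum(row) for row in scores]
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n   # variance of total scores
    pq = 0.0
    for i in range(k):                               # sum of p*q over items
        p = sum(row[i] for row in scores) / n        # proportion correct on item i
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var)

# Perfectly consistent response patterns give the maximum reliability of 1.0.
print(kr20([[1, 1], [0, 0], [1, 1], [0, 0]]))  # 1.0
```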
  • 45.  Reliability and validity are related concepts. If an instrument is unreliable, it cannot yield valid outcomes.  As reliability improves, validity may improve (or may not).  However, if an instrument is shown scientifically to be valid then it is almost certain that it is also reliable.
  • 46.  The ff. table is a standard followed almost universally in educational tests and measurement:
     Reliability      Interpretation
     .90 and above    Excellent reliability; at the level of the best standardized tests.
     .80 – .90        Very good for a classroom test.
     .70 – .80        Good for a classroom test; in the range of most. There are probably a few items which could be improved.
     .60 – .70        Somewhat low. This test should be supplemented by other measures (e.g., more tests) for grading.
     .50 – .60        Suggests need for revision of the test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
     .50 or below     Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.

Editor's Notes

  1. Why should the bright ones get the wrong answer while the poor ones get the right answer?