2. Our Test Statistics
Chi-square
Gamma
Z and t Statistics
Correlation coefficient and the R2 value
Significance and the p-value
1-sided and 2-sided tests
2
3. Variable Types
• Nominal, Ordinal, and Interval
Nominal Example: Type of Anesthetic (e.g. Local, General,
and Regional)
Ordinal Example: Hospital Trauma Levels (e.g. Level-1, Level-
2, Level-3)
Interval Example: Drug Dosage (e.g. mg of substance)
• There are multiple kinds of tests for each type of variable
depending on assumptions being made and what is being
tested for (e.g. means, variances, association, etc.)
Our focus is on ordinal variables
and testing for associations
3
4. c 2 Test of Independence
Does not measure strength of an association, but, rather, it
indicates how much evidence there is that the variables are or are
not independent
The c2 statistic ranges from 0 to ∞ (always ≥ 0)
The larger the c2 value, the more confident we can be that the
variables are not independence
Example: Assume we have two ordinal variables (Hospital Size
and the Relative Cost of the Laryngoscope being used):
H0: There is no association between variables (c2 value is near zero)
H1: There is an association between variables
4
6. The c 2 Test Statistic
Sample 2
53 47 18
77 69 26
149 132 49
(45%) (40%) (15%)
“Observed Value”
“Expected Value”
Remember: This number does not reflect the strength of an association.
It only provides evidence that an association actually exists.
6
7. The c 2 Distribution
Okay…so what does c2 = 83 tell us?
With 4 degrees of freedom (d.f.) and an a = 0.10, our
benchmark (critical value) for dependence is 7.78
0.2
0.18
0.16
0.14
0.12
0.1
0.08
Area = a = 0.10
0.06
0.04
0.02
0
0 5 7.78 10 15 20 25 30 35 40
d.f. = 4
7
8. The c 2 Distribution
H0: There is no association between variables (c2 value is near zero)
H1: There is an association between variables
We reject H0 if our c2 value is > our “benchmark” value of 7.78
0.2
0.18
0.16
0.14
0.12
0.1
0.08
Area = a = 0.10
0.06
0.04
0.02
0
0 5 7.78 10 15 20 25 30 35 40
d.f. = 4
8
9. The c2 Distribution
0.18
3x3
0.16
0.14
0.12 Increasing d.f.
0.1
0.08
Increasing d.f. 0.06
0.04
0.02
0
5x5
-0.02
0 10 20 30 40
d.f. = 5 d.f. = 10 d.f. = 20
Degrees of Freedom = (# of rows – 1) x (# of columns – 1)
9
10. So, how do we determine the strength and
direction of an established association
between variables?
10
13. The Gamma Value (g)
Appropriate for ordinal-by-ordinal data
Generally more powerful test statistic than the c2 test
statistic (i.e. we are able to detect significant
differences with greater ease)
Because ordinal variables can be ordered, we can talk
about the direction of association between them (i.e.
positive or negative)
13
14. Measure of Association
C = Concordant Pairs
D = Discordant Pairs
lies between -1 and 1, where a value of 0
means that there is no relation between the two
ordinal variables and | | = 1 represents the
strongest associations.
14
15. Testing Independence with g
H0: There is no association between variables (g value is near zero)
H1: There is an association between variables
g follows a normal distribution such that our z statistic
becomes:
z=
s.e.
Comparing our z statistic with a, we can determine if
the variables are statistically independent
Standard Error (s.e.) = sample standard deviation / √ n
15
16. Testing Independence with g
Assuming an a of 0.10 then our benchmark is z0 = 1.285
0.045
0.04
0.035
0.03
0.025 Area = a = 0.10
0.02
0.015
0.01
0.005
0
-4 -3 -2 -1 0 1 1.2852 3 4
If z > z0 = 1.285 then reject H0
Rejection of the H0 means that the variables are not independent
(see previous slide)
Using g to test for independence does not rely on the degrees of freedom
16
17. Why do we not just use g and
forget about c2 since g is a more
powerful statistic?
17
18. c 2 and g
An ordinal measure of association (i.e. g ) may equal 0
when the variables are actually statistically dependent
c2 = 437, g = 0
18
20. Association vs. Causation
Causation can often be obscure or counterintuitive so
how do we tell the difference from Association?
Performing controlled comparisons
Increasing the variable resolutions (i.e. the number
of data points)
Sequence Analysis (time becomes a variable)
20