p-values
• The calculated probability values corresponding to a particular test statistic given some set of circumstances
– Sample size
– Degrees of freedom
– Etc
• Calculated for us by SPSS
• P<0.05 generally considered significant
– Corresponds to alpha level
χ² and Other Non-Parametric Tests
Parametric vs Non-Parametric Tests
• Parametric tests
– Used when making a generalization about at least one parameter (population measure)
– Generally require
• A normal distribution
• Data measured at least at the interval level
– Or data that “approaches” interval level
– More powerful than non-parametric tests
Parametric vs Non-Parametric Tests
• Non-Parametric Tests
– Don’t test hypotheses about population parameters
– Don’t require normal distributions
– May be used for all levels of measurement
• Nominal to ratio
Chi-squared (χ²) Test
• The basic non-parametric test
• A test of independence
– Are two categorical variables independent?
– Or: is there a relationship between two categorical variables?
Chi-squared (χ²) Hypotheses
• H0: there is no relationship between categorical variables
• H1: there is a relationship between categorical variables
χ² Assumptions and Requirements
• May be used when:
– Both IV and DV are nominal
• Ordinal sometimes if few categories
• Interval or ratio data is sometimes grouped to form nominal or ordinal variables
– Age in years into {0-15, 16-25, 26-35}
• Assumes random and independent sampling
• Each subject must qualify for ONE cell
• No assumptions made about distribution shape
• No assumptions about homogeneity
• Expected frequency for each cell must be >0
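The age grouping mentioned above (years into {0-15, 16-25, 26-35}) can be sketched in a few lines of Python. The function name `age_group` is hypothetical. Returning `None` for ages outside the three listed bands is an assumption, since the slide does not say how older subjects are handled.

```python
def age_group(age_years):
    """Map an age in whole years to one of the bands listed on the slide.

    Assumption: ages outside 0-35 are not covered by the slide,
    so this sketch returns None for them.
    """
    if 0 <= age_years <= 15:
        return "0-15"
    if 16 <= age_years <= 25:
        return "16-25"
    if 26 <= age_years <= 35:
        return "26-35"
    return None


# Each subject lands in exactly one band, as the χ² test requires.
ages = [4, 15, 16, 30, 35]
print([age_group(a) for a in ages])
```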
χ² Contingency Table
In SPSS: Crosstabs
subject has diabetes * subject had stroke Crosstabulation
Count
                          subject had stroke
subject has diabetes      no     yes    Total
no                        67     10      77
yes                       15      8      23
Total                     82     18     100
The 2×2 table is the simplest but… can go to n1×n2
Calculating χ²
$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \qquad \text{with } df = (r-1)(c-1)$$
Where,
$O_{ij}$ = Observed cell frequencies
$E_{ij}$ = Expected cell frequencies:
$$E_{ij} = \frac{\text{Row Count}_i \times \text{Column Count}_j}{\text{Total Count } (N)}$$
r = number of rows, c = number of columns
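As a minimal sketch of the formula above, the following Python computes the expected counts and the χ² statistic for a table of any size. The function names `expected_counts` and `chi_square` are hypothetical, and the observed counts are the diabetes-by-stroke crosstab from these slides. This is an illustration, not how SPSS computes it internally.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: subject has diabetes (no, yes); columns: subject had stroke (no, yes).
observed = [[67, 10], [15, 8]]
stat, df = chi_square(observed)
print(round(stat, 3), df)
```

With unrounded expected counts the statistic comes out at about 5.70 with df = 1.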
χ² Distribution
• One-tailed
• Skewed to the right
• Similar to F distribution
Calculating χ²
subject has diabetes * subject had stroke Crosstabulation
Count
                          subject had stroke
subject has diabetes      no     yes    Total
no                        67     10      77
yes                       15      8      23
Total                     82     18     100
Expected counts:
E11 = (77 × 82)/100 = 63.1
E12 = (77 × 18)/100 = 13.9
E21 = (23 × 82)/100 = 18.9
E22 = (23 × 18)/100 = 4.1
*Remember: This can be expanded to almost any reasonably sized table.
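In Our Example…
$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(67 - 63.1)^2}{63.1} + \frac{(10 - 13.9)^2}{13.9} + \frac{(15 - 18.9)^2}{18.9} + \frac{(8 - 4.1)^2}{4.1} \approx 5.8$$
with df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1
Note: this hand calculation uses expected counts rounded to one decimal place. With unrounded expected counts the result is 5.70, the value SPSS reports.
General rule of thumb for 2×2 table with 1 df: χ² > 4 is significant (the exact critical value at alpha = 0.05 is 3.84)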
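The link between a 1-df χ² statistic and its p-value can be checked with the Python standard library alone. The sketch below relies on the fact that a χ² variable with 1 df is the square of a standard normal variable, so its upper-tail probability is erfc(√(χ²/2)). The function name `chi_square_p_value_df1` is hypothetical, and the shortcut is valid only for df = 1.

```python
from math import erfc, sqrt


def chi_square_p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 1 df.

    Uses the identity P(Z**2 > x) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return erfc(sqrt(stat / 2.0))


ALPHA = 0.05  # conventional significance level from the slides

for stat in (3.841, 4.0, 5.70):
    p = chi_square_p_value_df1(stat)
    print(f"chi2 = {stat:.3f}  p = {p:.4f}  p < alpha: {p < ALPHA}")
```

A statistic of 4 gives a p-value a little below 0.05, which is why the rule of thumb works for 2×2 tables.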
SPSS Chi-Square Results
Chi-Square Tests
                                Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              5.700(b)   1    .017
Continuity Correction(a)        4.319      1    .038
Likelihood Ratio                5.093      1    .024
Fisher's Exact Test                                           .028         .023
Linear-by-Linear Association    5.643      1    .018
N of Valid Cases                100
a. Computed only for a 2x2 table
b. 1 cells (25.0%) have expected count less than 5. The minimum expected count is 4.14.
Fisher’s Exact Test
• Used when an expected value is <5
– Takes into account small sample sizes
• Can be used only with 2×2 table
– Sometimes necessary to collapse cells if χ² cannot be adequately calculated
– If cells are not collapsed, χ² can provide an estimate but NOT actual significance
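For illustration, Fisher's exact test for a 2×2 table can be sketched from the hypergeometric distribution using only the standard library. The function name `fisher_exact_2x2` is hypothetical. The two-sided p-value here sums the probabilities of all tables no more likely than the observed one. That is one common convention and is assumed here, not taken from the slides.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Return (two-sided p, one-sided p) for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution; probabilities are computed exactly with math.comb.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    # Range of values the top-left cell can take given the fixed margins.
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {k: comb(row1, k) * comb(row2, col1 - k) / denom
             for k in range(lo, hi + 1)}

    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    lower = sum(p for k, p in probs.items() if k <= a)
    upper = sum(p for k, p in probs.items() if k >= a)
    return two_sided, min(lower, upper)


# Diabetes-by-stroke counts from the crosstab in these slides.
two_sided, one_sided = fisher_exact_2x2(67, 10, 15, 8)
print(f"two-sided p = {two_sided:.3f}, one-sided p = {one_sided:.3f}")
```

For the example table this should land close to the exact significance values in the SPSS output (.028 two-sided, .023 one-sided).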
Yates’ Correction
• Yates’ correction for continuity
• Used in 2×2 tables, generally when any expected cell frequency is <10
– Do not apply when expected frequencies are small
• Some disagreement about its use
– Reduces power
• Provides more conservative estimate
– Sometimes desirable, particularly with small numbers
Yates’ Correction
$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
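A minimal Python sketch of the continuity-corrected statistic, applied to the diabetes-by-stroke table from these slides. The function name `yates_chi_square` is hypothetical, and the 2×2 check reflects the slide's statement that the correction is used in 2×2 tables.

```python
def yates_chi_square(table):
    """Chi-square with Yates' continuity correction for a 2x2 table."""
    if len(table) != 2 or any(len(row) != 2 for row in table):
        raise ValueError("Yates' correction is applied to 2x2 tables only")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Shrink each |O - E| by 0.5 before squaring.
            stat += (abs(observed - expected) - 0.5) ** 2 / expected
    return stat


corrected = yates_chi_square([[67, 10], [15, 8]])
print(round(corrected, 3))
```

The corrected value (about 4.32) is smaller than the uncorrected 5.70, which shows why the correction is described as more conservative.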
Other Non-Parametric Tests
• Used when DV is nominal or ordinal OR
• When the assumptions for more