p-values

• The calculated probability of a test statistic at least as extreme as the one observed, assuming the null hypothesis is true, given some set of circumstances

– Sample size

– Degrees of freedom

– Etc

• Calculated for us by SPSS

• p < 0.05 is generally considered significant

– Corresponds to alpha level

χ² and Other Non-Parametric Tests

Parametric vs Non-Parametric Tests

• Parametric tests

– used when making a generalization about at least one parameter (population measure)

– Generally requires

• Normal distribution

• Data measured at least at the interval level

– Or that “approaches” interval level

– More powerful than non-parametric tests

Parametric vs Non-Parametric Tests

• Non-Parametric Tests

– Don’t test hypotheses about population parameters

– Don’t require normal distributions

– May be used for all levels of measurement

• Nominal to ratio

Chi-squared (χ²) Test

• The basic non-parametric test

• A test of independence

– Are two categorical variables independent? or

– Is there a relationship between two categorical variables?

Chi-squared (χ²) Hypotheses

• H0: there is no relationship between categorical variables

• H1: there is a relationship between categorical variables

χ² Assumptions and Requirements

• May be used when:

– Both IV and DV are nominal

• Ordinal sometimes if few categories

• Interval or ratio data is sometimes grouped to form nominal or ordinal variables

– Age in years into {0-15, 16-25, 26-35}

• Assumes random and independent sampling

• Each subject must qualify for ONE cell

• No assumptions made about distribution shape

• No assumptions about homogeneity

• Expected frequency for each cell must be >0
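The grouping of interval or ratio data into categories, as in the age example on this slide, could be sketched in Python as follows. The helper name `age_group` and the sample ages are assumptions made for illustration; only the three bins shown on the slide are covered. Because the bins do not overlap, each subject lands in exactly one category.

```python
from collections import Counter

# Age bins taken from the slide; the labels are the slide's own ranges.
AGE_BINS = [(0, 15, "0-15"), (16, 25, "16-25"), (26, 35, "26-35")]


def age_group(age_years):
    """Return the category label for an age in whole years (hypothetical helper)."""
    for low, high, label in AGE_BINS:
        if low <= age_years <= high:
            return label
    raise ValueError(f"age {age_years} falls outside the defined bins")


# Made-up ages, for illustration only.
ages = [3, 15, 16, 22, 30, 35]
group_counts = Counter(age_group(a) for a in ages)
print(group_counts)
```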

χ² Contingency Table

In SPSS: Crosstabs

subject has diabetes * subject had stroke Crosstabulation

Count

                           subject had stroke
                           no      yes     Total
subject has diabetes  no   67      10      77
                      yes  15      8       23
Total                      82      18      100

The 2×2 table is the simplest, but tables can extend to n1×n2

Calculating χ²

$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \quad \text{with } df = (r-1)(c-1)$$

Where,

$O_{ij}$ = Observed cell frequencies

$E_{ij}$ = Expected cell frequencies:

$$E_{ij} = \frac{\text{Row Count} \times \text{Column Count}}{\text{Total Count } (N)}$$
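The formula above can be sketched in plain Python for any r×c table. The function name `chi_squared` and its return shape are assumptions made for illustration; the counts are the diabetes/stroke crosstab from the contingency table slide. This is not how SPSS computes its output internally, only a direct transcription of the formula.

```python
def chi_squared(observed):
    """Pearson chi-squared for an r x c table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # E_ij = (row total x column total) / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Rows: subject has diabetes (no, yes); columns: subject had stroke (no, yes).
table = [[67, 10], [15, 8]]
chi2, df, expected = chi_squared(table)
print(round(chi2, 3), df)
```

Because the expected counts are not rounded before summing, the result should agree with the Pearson Chi-Square value in the SPSS output to three decimals.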

χ² Distribution

• One-tailed

• Skewed to the right

• Similar to F distribution

Calculating χ²

subject has diabetes * subject had stroke Crosstabulation

Count

                           subject had stroke
                           no      yes     Total
subject has diabetes  no   67      10      77
                      yes  15      8       23
Total                      82      18      100

E11 = (77 × 82)/100 = 63.1

E12 = (77 × 18)/100 = 13.9

E21 = (23 × 82)/100 = 18.9

E22 = (23 × 18)/100 = 4.1

*Remember: This can be expanded to almost any reasonably sized table.

In Our Example…

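$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(67 - 63.1)^2}{63.1} + \frac{(10 - 13.9)^2}{13.9} + \frac{(15 - 18.9)^2}{18.9} + \frac{(8 - 4.1)^2}{4.1} \approx 5.8$$

with df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1

Note: the 5.8 comes from expected counts rounded to one decimal. With unrounded expected counts the sum is 5.70, the Pearson Chi-Square value SPSS reports.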

General rule of thumb for 2×2 table with 1 df: χ² > 3.84 (roughly 4) is significant at α = 0.05
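The rule of thumb can be checked with a short sketch using only the standard library. It relies on the fact that a χ² variable with 1 df is a squared standard normal, so the shortcut below is valid only for df = 1. The function name `chi2_p_value_df1` is an assumption made for illustration.

```python
import math


def chi2_p_value_df1(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    For 1 df the statistic is a squared standard normal, so
    P(X > chi2) = erfc(sqrt(chi2 / 2)). Not valid for other df.
    """
    return math.erfc(math.sqrt(chi2 / 2))


p_at_rule_of_thumb = chi2_p_value_df1(4.0)  # the rule-of-thumb cutoff
p_at_critical = chi2_p_value_df1(3.841)     # tabulated 0.05 critical value
p_example = chi2_p_value_df1(5.700)         # Pearson value from the SPSS output
print(p_at_rule_of_thumb, p_at_critical, p_example)
```

A statistic of 4 corresponds to a p-value a little under 0.05, which is why "greater than 4" works as a quick check for a 2×2 table.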

SPSS Chi-Square Results

Chi-Square Tests

                               Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             5.700 b   1    .017
Continuity Correction a        4.319     1    .038
Likelihood Ratio               5.093     1    .024
Fisher's Exact Test                                         .028         .023
Linear-by-Linear Association   5.643     1    .018
N of Valid Cases               100

a. Computed only for a 2x2 table

b. 1 cells (25.0%) have expected count less than 5. The minimum expected count is 4.14.

Fisher’s Exact Test

• Used when an expected value is <5

– Takes into account small sample sizes

• Can be used only with 2×2 table

– Sometimes necessary to collapse cells if χ² cannot be adequately calculated

– If cells are not collapsed, χ² can provide estimate but NOT actual significance
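As a rough illustration of what Fisher's exact test computes for a 2×2 table, the sketch below enumerates every table with the same margins using the hypergeometric distribution. The function name `fisher_exact_2x2` is an assumption, and the two-sided rule used here (sum all tables no more probable than the observed one) is one common convention, not necessarily the only one. The one-sided value is the upper tail, which suits this example because the observed count in the last cell exceeds its expected count.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (one_sided, two_sided). With the margins held fixed, the count in
    the last cell follows a hypergeometric distribution. The one-sided value is
    P(count >= d); the two-sided value sums every table that is no more
    probable than the observed one.
    """
    row2, col2, n = c + d, b + d, a + b + c + d
    lowest = max(0, row2 + col2 - n)
    highest = min(row2, col2)

    def prob(k):
        # Probability that the last cell holds k, given the fixed margins.
        return comb(col2, k) * comb(n - col2, row2 - k) / comb(n, row2)

    p_observed = prob(d)
    one_sided = sum(prob(k) for k in range(d, highest + 1))
    two_sided = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # tolerance for float rounding
    )
    return one_sided, two_sided


# Diabetes/stroke counts from the crosstab used in these slides.
p_one, p_two = fisher_exact_2x2(67, 10, 15, 8)
print(round(p_one, 3), round(p_two, 3))
```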

Yates’ Correction

• Yates’ correction for continuity

• Used in 2X2 tables, generally when any expected cell frequency is <10

– Do not apply when expected frequencies are small

• Some disagreement about its use

– Reduces power

• Provides more conservative estimate

– Sometimes desirable, particularly with small numbers

Yates’ Correction

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
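A minimal sketch of the corrected statistic, applied to the diabetes/stroke table from the earlier slides. The function name `yates_chi_squared` is an assumption made for illustration; the only change from the uncorrected formula is subtracting 0.5 from each absolute difference before squaring.

```python
def yates_chi_squared(observed):
    """Yates-corrected chi-squared for a 2x2 table given as [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for the cell
            total += (abs(o - e) - 0.5) ** 2 / e
    return total


yates = yates_chi_squared([[67, 10], [15, 8]])
print(round(yates, 3))
```

The result should match the Continuity Correction row of the SPSS output and is smaller than the uncorrected Pearson value, which is the sense in which the correction is more conservative.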

Other Non-Parametric Tests

• Used when DV is nominal or ordinal OR

• When the assumptions for more…