11. Chi-Square

Chi-square (pronounced “kai-square”) tests are used to analyze categorical data. The hypothesis-testing process remains the same, but chi-square tests examine frequencies or counts rather than means.

We will cover the following chi-square tests:

| Test | Variables and Design | Research Question |
| --- | --- | --- |
| Chi-square goodness-of-fit | One categorical variable | Do the observed proportions match specified expected proportions? |
| Chi-square test of independence | Two categorical variables with independent observations | Are the two variables associated? |
| Fisher’s exact test | Two categorical variables in a 2 × 2 table with small expected frequencies | Are the two variables associated when the chi-square approximation is inappropriate? |
| McNemar’s test | Two paired binary variables | Does a binary response change across two related measurements? |

The key distinctions are the number of variables, whether observations are independent or paired, and whether the expected frequencies are sufficiently large.
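As a rough illustration, the three distinctions can be written as a small decision helper. This is only a sketch: the function name `choose_test` and its parameters are hypothetical and are not part of any statistics package. It simply restates the summary of the four tests in code.

```python
def choose_test(n_variables: int, paired: bool = False,
                small_expected: bool = False) -> str:
    """Suggest a test from the chapter's three distinctions.

    Hypothetical helper for illustration only. It assumes the data are
    categorical, that paired data are binary, and that the small-expected
    case refers to a 2 x 2 table.
    """
    if n_variables == 1:
        # One categorical variable compared with expected proportions.
        return "Chi-square goodness-of-fit"
    if n_variables == 2:
        if paired:
            # Two related binary measurements on the same cases.
            return "McNemar's test"
        if small_expected:
            # Independent observations, but expected counts are too small.
            return "Fisher's exact test"
        return "Chi-square test of independence"
    raise ValueError("This chapter covers one or two categorical variables.")


print(choose_test(1))
print(choose_test(2, small_expected=True))
print(choose_test(2, paired=True))
```

The order of the checks matters: pairing is checked before expected frequencies because a paired design rules out the tests that require independent observations.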

Because these tests analyze categorical frequencies, they do not assume that a continuous variable follows a normal distribution, which is why they are generally classified as nonparametric tests. They do, however, have assumptions about how the observations were collected and, for some tests, how large the expected frequencies are.
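To make the idea of expected frequencies concrete, the sketch below computes them for a two-way table of counts as (row total × column total) / grand total. The counts are invented for illustration, and the cutoff of 5 is a commonly cited rule of thumb rather than a threshold stated in this section, so treat it as an assumption.

```python
def expected_counts(observed):
    """Expected cell counts under independence for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Each expected count is (row total * column total) / grand total.
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed counts (not real data).
observed = [[12, 8],
            [2, 6]]

expected = expected_counts(observed)
smallest = min(min(row) for row in expected)

# Assumed rule of thumb: every expected count should be at least 5.
meets_rule = smallest >= 5

print(expected)
print(smallest, meets_rule)
```

In this made-up table the smallest expected count falls below the assumed cutoff, which is the kind of situation where Fisher’s exact test would be considered instead of the chi-square test of independence.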

Each test in this chapter follows the same four-step hypothesis-testing process introduced in Chapter 7: look at the data, check assumptions, perform the test, and interpret the results. What changes from test to test is the type of categorical data being analyzed, the assumptions that must be checked, and the specific output reported.

Data Setup for Chi-Square Tests

Chi-square data can be entered in two formats.

In the raw-data format, each row represents one participant or observation, and each categorical variable has its own column.

In the frequency-table format, each row represents a category or combination of categories, with a separate variable containing the number of observations. When using frequency-table data, move the frequency variable into the Counts box in the goodness-of-fit or test-of-independence analysis. The Small S Scientist has an example of running a chi-square with a frequency table.
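The relationship between the two formats can be sketched in a few lines of Python. The variable names and the data here are hypothetical; the point is only that a frequency table is the raw data with identical rows counted up.

```python
from collections import Counter

# Raw-data format: one tuple per participant (hypothetical data).
raw_rows = [
    ("control", "yes"),
    ("control", "no"),
    ("treatment", "yes"),
    ("treatment", "yes"),
    ("control", "no"),
]

# Frequency-table format: one entry per combination of categories,
# with the number of observations stored as the count.
frequency_table = Counter(raw_rows)

for (group, response), count in sorted(frequency_table.items()):
    print(group, response, count)
```

A combination that never occurs in the raw data, such as ("treatment", "no") here, does not get a row automatically, so it has to be added with a count of 0 if the analysis needs every cell.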