Statistics is full of long, fancy terms, but many of them are just rococo ways of presenting a simple idea. “Cross tabulation” is one such term. It means converting a list of categorical entries into a table of counts. As a statistical method, cross tabulation is easy and useful, but it can be time-intensive when done by hand.

What It Is

Cross tabulation is a statistical method of boiling down data into a form that is easier to analyze. When you “cross tabulate” research or a data set, you take a list of categorical (non-numerical) data and condense it into a table. Each cell of the table holds a count of how often a particular combination of categories appears in the list. Because statistics works with numbers, these counts give you something to analyze: cross tabulation is a way of turning non-numerical data into numerical data.

How to Cross Tabulate

Cross tabulating is as easy as counting. Start with your list of entries, each of which records one value for each of two categorical variables. You might see the entry “single, unemployed” at the top. If you look further down, your list might contain entries such as “married, employed.” Find every value that each variable takes, then draw a table with the values of one variable as rows and the values of the other as columns. The table here might have three rows and two columns: “single,” “married,” and “divorced” for rows; “unemployed” and “employed” for columns. In each cell, write the number of times that combination appears in your list. For example, you might count 11 people in your list who are divorced and employed. You would then write the number 11 in the cell where the row for “divorced” meets the column for “employed.” If your data set is large, counting by hand takes too long, and you may need a computer to cross tabulate for you.
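
The counting procedure described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the `records` list is made-up sample data, and the names `records`, `counts`, and `table` are assumptions chosen for this example.

```python
from collections import Counter

# Hypothetical list of entries: (marital status, employment status).
records = [
    ("single", "unemployed"),
    ("married", "employed"),
    ("divorced", "employed"),
    ("single", "employed"),
    ("married", "employed"),
    ("divorced", "employed"),
]

# Count how many times each combination appears in the list.
counts = Counter(records)

rows = ["single", "married", "divorced"]
columns = ["unemployed", "employed"]

# Build the table: one row per marital status, one column per
# employment status. A Counter gives 0 for combinations that never occur.
table = {r: {c: counts[(r, c)] for c in columns} for r in rows}

# Look up the cell where the "divorced" row meets the "employed" column.
print(table["divorced"]["employed"])
```

With this sample list, the cell for “divorced” and “employed” holds 2, because that combination appears twice.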

A Simple Example

An easy way to picture cross tabulation is to imagine a list of eye colors and blood types. Each data point is an eye color with a blood type next to it. For example, this list could represent a section of a hospital’s patient database. One entry might be “Blue, B.” Cross tabulation would convert this long, seemingly unmanageable list into a manageable form. It would make a 3 by 4 table, with rows “Brown, Blue, Green” and columns “A, B, O, AB.” The cells hold numbers representing how many times each combination appears in the original list. For example, you might follow the “Green” row to the “O” column and find the number 727. This would mean that 727 people in the hospital database have both green eyes and blood type “O.”
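
As a rough sketch of the 3 by 4 layout described above, the following Python code arranges the counts in a grid with a fixed row and column order and prints it as a plain-text table. The `patients` list is a tiny made-up sample, not real hospital data, and the function names `cross_tabulate` and `format_table` are hypothetical helpers written for this illustration.

```python
from collections import Counter

EYE_COLORS = ["Brown", "Blue", "Green"]   # rows
BLOOD_TYPES = ["A", "B", "O", "AB"]       # columns


def cross_tabulate(entries):
    """Return a 3 by 4 grid of counts, ordered by EYE_COLORS and BLOOD_TYPES."""
    counts = Counter(entries)
    return [[counts[(eye, blood)] for blood in BLOOD_TYPES]
            for eye in EYE_COLORS]


def format_table(grid):
    """Render the grid as aligned plain text with row and column labels."""
    lines = [" " * 8 + "".join(f"{blood:>6}" for blood in BLOOD_TYPES)]
    for eye, row in zip(EYE_COLORS, grid):
        lines.append(f"{eye:<8}" + "".join(f"{n:>6}" for n in row))
    return "\n".join(lines)


# Hypothetical sample of (eye color, blood type) entries.
patients = [
    ("Blue", "B"),
    ("Green", "O"),
    ("Brown", "AB"),
    ("Green", "O"),
    ("Brown", "A"),
]

grid = cross_tabulate(patients)
print(format_table(grid))
```

Fixing the row and column order up front means that combinations missing from the list still appear in the table, with a count of zero.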

Answering Questions

Turning a list of categorical data into a table of numerical data allows researchers to answer questions about the data. For example, if a hospital finds that it has 727 people with green eyes and blood type “O,” and also finds that it has 767 people with brown eyes and blood type “AB,” it might question whether this 40-person difference is due to random chance. It’s possible that a non-random reason makes people with brown eyes and blood type “AB” more likely to be in that hospital list, such as an illness that more easily affects this group, or a difference in this group’s size in the population. Specifically, cross tabulation allows researchers to ask questions such as “Is the difference between these two groups significant or random?”
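
One common way to approach a question like this is a chi-square test, which the text above does not name; it is swapped in here purely as an illustration. The sketch below compares the two counts from the example under a strong simplifying assumption: that both groups would be equally common if only chance were at work. As noted above, the groups may differ in size in the population, so this assumption may not hold, and a fuller analysis would look at the whole table rather than two cells.

```python
import math

# The two cell counts from the example.
green_o = 727
brown_ab = 767

# ASSUMPTION: by chance alone, both groups would be the same size,
# so each expected count is the average of the two observed counts.
expected = (green_o + brown_ab) / 2

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((obs - expected) ** 2 / expected for obs in (green_o, brown_ab))

# With two categories there is 1 degree of freedom, and the p-value
# for 1 degree of freedom can be written with the complementary
# error function from the standard library.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 3), round(p_value, 3))
```

Under that assumption, the statistic comes out near 1.07, below the usual 5 percent cutoff of 3.84 for 1 degree of freedom, so a 40-person gap of this kind could plausibly be random.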