Which statistic is commonly used for testing the relationship between categorical variables?

Published on January 28, 2020 by Rebecca Bevans. Revised on July 6, 2022.

Statistical tests are used in hypothesis testing. They can be used to:

determine whether a predictor variable has a statistically significant relationship with an outcome variable.
estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.

If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.

Statistical tests flowchart

What does a statistical test do?

Statistical tests work by calculating a test statistic, a number that describes how much the relationship between variables in your data differs from the null hypothesis of no relationship.

The test then calculates a p-value (probability value). The p-value estimates how likely it is that you would see a difference at least as large as the one described by the test statistic if the null hypothesis of no relationship were true.

If the value of the test statistic is more extreme than the critical value expected under the null hypothesis (equivalently, if the p-value falls below your chosen significance threshold), then you can infer a statistically significant relationship between the predictor and outcome variables.

If the value of the test statistic is less extreme than that critical value, then you cannot infer a statistically significant relationship between the predictor and outcome variables.
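
To make the link between a test statistic and its p-value concrete, here is a minimal sketch of an exact permutation test in Python. The permutation test is swapped in here because it needs no distribution tables; it is not a method named in this article. The data values and the function names (mean_difference, permutation_p_value) are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical measurements for two groups (illustrative values only).
group_a = [12.0, 15.0, 14.0, 16.0]
group_b = [9.0, 11.0, 10.0, 13.0]

def mean_difference(a, b):
    """Test statistic: difference between the two group means."""
    return mean(a) - mean(b)

def permutation_p_value(a, b):
    """Two-sided exact permutation p-value for the difference in means."""
    observed = abs(mean_difference(a, b))
    pooled = a + b
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Under the null hypothesis the group labels are interchangeable,
    # so every way of splitting the pooled data is equally likely.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        new_a = [pooled[i] for i in chosen_set]
        new_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean_difference(new_a, new_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

observed_statistic = mean_difference(group_a, group_b)
p_value = permutation_p_value(group_a, group_b)
print(f"statistic = {observed_statistic}, p = {p_value:.4f}")
```

The p-value here is simply the share of relabelled splits whose difference in means is at least as large as the observed one, which mirrors the definition given above.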

When to perform a statistical test

You can perform statistical tests on data that have been collected in a statistically valid manner – either through an experiment, or through observations made using probability sampling methods.

For a statistical test to be valid, your sample size needs to be large enough to approximate the true distribution of the population being studied.

To determine which statistical test to use, you need to know:

whether your data meet certain assumptions.
the types of variables that you’re dealing with.

Statistical assumptions

Statistical tests make some common assumptions about the data they are testing:

Independence of observations (a.k.a. no autocorrelation): The observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).
Homogeneity of variance: the variance within each group being compared is similar among all groups. If one group has much more variation than others, it will limit the test’s effectiveness.
Normality of data: the data follow a normal distribution (a.k.a. a bell curve). This assumption applies only to quantitative data.

If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform a nonparametric statistical test, which allows you to make comparisons without assuming a particular data distribution.

If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts for structure in your data (repeated-measures tests or tests that include blocking variables).
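
As a rough first look at the homogeneity of variance assumption described above, you can compare the sample variances of the groups directly. The sketch below uses only Python's standard library; the group names and scores are hypothetical, and the ratio is an informal screening aid rather than a formal test (a formal alternative such as Levene's test is available in statistics libraries).

```python
from statistics import variance

# Hypothetical scores for three groups (illustrative values only).
groups = {
    "control": [4.0, 5.0, 6.0, 5.0],
    "treatment_1": [7.0, 9.0, 8.0, 8.0],
    "treatment_2": [2.0, 10.0, 6.0, 14.0],
}

# Sample variance (n - 1 in the denominator) for each group.
sample_variances = {name: variance(values) for name, values in groups.items()}

# Ratio of the largest to the smallest variance; values far above 1
# suggest the groups do not share a similar spread.
variance_ratio = max(sample_variances.values()) / min(sample_variances.values())

for name, value in sample_variances.items():
    print(f"{name}: variance = {value:.2f}")
print(f"largest / smallest variance = {variance_ratio:.1f}")
```

In this made-up data the third group varies far more than the other two, which is the situation where a nonparametric alternative may be worth considering.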

Types of variables

The types of variables you have usually determine what type of statistical test you can use.

Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables include:

Continuous (a.k.a. ratio variables): represent measures and can usually be divided into units smaller than one (e.g. 0.75 grams).
Discrete (a.k.a. integer variables): represent counts and usually can’t be divided into units smaller than one (e.g. 1 tree).

Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical variables include:

Ordinal: represent data with an order (e.g. rankings).
Nominal: represent group names (e.g. brands or species names).
Binary: represent data with a yes/no or 1/0 outcome (e.g. win or lose).

Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing an experiment, these are the independent and dependent variables). Consult the tables below to see which test best matches your variables.

Choosing a parametric test: regression, comparison, or correlation

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

The most common types of parametric test include regression tests, comparison tests, and correlation tests.

Regression tests

Regression tests look for cause-and-effect relationships. They can be used to estimate the effect of one or more continuous variables on another variable.

Test	Predictor variable	Outcome variable	Research question example
Simple linear regression	Continuous (1 predictor)	Continuous (1 outcome)	What is the effect of income on longevity?
Multiple linear regression	Continuous (2 or more predictors)	Continuous (1 outcome)	What is the effect of income and minutes of exercise per day on longevity?
Logistic regression	Continuous	Binary	What is the effect of drug dosage on the survival of a test subject?
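
As an illustration of the simplest case in the table, the sketch below fits a simple linear regression by ordinary least squares using only Python's standard library. The income and longevity values are hypothetical, and fit_line is an illustrative helper name, not part of any library.

```python
from statistics import mean

# Hypothetical data (illustrative values only).
income = [20, 30, 40, 50, 60]       # thousands per year
longevity = [72, 74, 75, 74, 75]    # years

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

slope, intercept = fit_line(income, longevity)
print(f"longevity = {intercept:.2f} + {slope:.3f} * income")
```

The slope estimates how much the outcome changes per unit of the predictor; a regression test then asks whether that slope differs significantly from zero.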

Comparison tests

Comparison tests look for differences among group means. They can be used to test the effect of a categorical variable on the mean value of some other characteristic.

T-tests are used when comparing the means of precisely two groups (e.g. the average heights of men and women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g. the average heights of children, teenagers, and adults).

Test	Predictor variable	Outcome variable	Research question example
Paired t-test	Categorical (1 predictor)	Quantitative (groups come from the same population)	What is the effect of two different test prep programs on the average exam scores for students from the same class?
Independent t-test	Categorical (1 predictor)	Quantitative (groups come from different populations)	What is the difference in average exam scores for students from two different schools?
ANOVA	Categorical (1 or more predictors)	Quantitative (1 outcome)	What is the difference in average pain levels among post-surgical patients given three different painkillers?
MANOVA	Categorical (1 or more predictors)	Quantitative (2 or more outcomes)	What is the effect of flower species on petal length, petal width, and stem length?
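
For the independent t-test row, the test statistic can be computed by hand as in the sketch below. It assumes the classic pooled-variance (Student's) form, which relies on the homogeneity of variance assumption; the exam scores and the helper name pooled_t_statistic are hypothetical. The p-value is not computed here because it requires the t distribution, which Python's standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical exam scores from two different schools (illustrative values only).
school_a = [78, 82, 85, 79, 86]
school_b = [72, 75, 70, 74, 69]

def pooled_t_statistic(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_variance = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, dof

t_statistic, degrees_of_freedom = pooled_t_statistic(school_a, school_b)
print(f"t = {t_statistic:.3f} with {degrees_of_freedom} degrees of freedom")
```

The statistic is the difference in group means divided by its standard error, so larger absolute values indicate a difference that is harder to attribute to chance.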

Correlation tests

Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.

These can be used to test whether two variables you want to use in (for example) a multiple regression test are correlated with each other.

Test	Variables	Research question example
Pearson’s r	2 continuous variables	How are latitude and temperature related?
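
The following sketch computes Pearson's r directly from its definition, using only Python's standard library. The latitude and temperature values are hypothetical and pearson_r is an illustrative helper name.

```python
from math import sqrt
from statistics import mean

# Hypothetical observations (illustrative values only).
latitude = [10, 20, 30, 40, 50]       # degrees north
temperature = [28, 25, 20, 14, 8]     # mean annual temperature, degrees Celsius

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)

r = pearson_r(latitude, temperature)
print(f"r = {r:.3f}")
```

A value near -1 or 1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.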

Choosing a nonparametric test

Nonparametric tests don’t make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric tests.

Test	Predictor variable	Outcome variable	Use in place of…
Spearman’s r	Quantitative	Quantitative	Pearson’s r
Chi square test of independence	Categorical	Categorical	Pearson’s r
Sign test	Categorical	Quantitative	One-sample t-test
Kruskal–Wallis H	Categorical (3 or more groups)	Quantitative	ANOVA
ANOSIM	Categorical (3 or more groups)	Quantitative (2 or more outcome variables)	MANOVA
Wilcoxon Rank-Sum test	Categorical (2 groups)	Quantitative (groups come from different populations)	Independent t-test
Wilcoxon Signed-rank test	Categorical (2 groups)	Quantitative (groups come from the same population)	Paired t-test
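
To show how a nonparametric test sidesteps distributional assumptions, here is a minimal sketch of Spearman's rank correlation, which replaces the raw values with their ranks. It uses the shortcut formula that is valid only when there are no tied values, which is an assumption of this sketch; the data and the helper names ranks and spearman_rho are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical paired observations (illustrative values only).
hours_studied = [10, 20, 30, 40, 50]
exam_score = [51, 63, 58, 90, 77]

rho = spearman_rho(hours_studied, exam_score)
print(f"rho = {rho:.2f}")
```

Because only the ordering of the values matters, any monotonic relationship (for example x against x cubed) gives a coefficient of exactly 1, even when the relationship is far from linear.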

Flowchart: choosing a statistical test

This flowchart helps you choose among parametric tests. For nonparametric alternatives, check the table above.

Frequently asked questions about statistical tests

What is statistical significance?

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test. Significance is usually denoted by a p-value, or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that data at least as extreme as those observed would be expected less than 5% of the time under the null hypothesis.

When the p-value falls below the chosen alpha value, then we say the result of the test is statistically significant.

What is the difference between quantitative and categorical variables?

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.

Sources in this article

We strongly encourage students to use sources in their work. You can cite our article (APA Style) or take a deep dive into the articles below.

This Scribbr article

Bevans, R. (July 6, 2022). Choosing the Right Statistical Test | Types & Examples. Scribbr. Retrieved October 27, 2022, from https://www.scribbr.com/statistics/statistical-tests/

Which test is commonly used for the relationship between categorical variables?

A chi-square test is used when you want to see if there is a relationship between two categorical variables.
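
A chi-square test of independence can be sketched in a few lines of standard-library Python. The contingency table below is hypothetical, chi_square_independence is an illustrative helper name, and no continuity correction is applied. The closed-form p-value used here holds only for 1 degree of freedom (a 2x2 table); larger tables need the chi-square distribution from a statistics library.

```python
from math import erfc, sqrt

def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof

# Hypothetical 2x2 table: rows = two groups, columns = two outcomes.
table = [[30, 10],
         [20, 40]]

chi2, dof = chi_square_independence(table)
# For 1 degree of freedom only, the upper-tail probability has a closed form.
p_value = erfc(sqrt(chi2 / 2))
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.6f}")
```

The statistic grows as the observed counts move away from the counts expected under independence, so a large value with a small p-value points to a relationship between the two categorical variables.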

What is used to measure the relationship between two categorical variables?

Categorical variables arise commonly in many applications and the best-known association measure between two categorical variables is probably the chi-square measure, also introduced by Karl Pearson. Like the product-moment correlation coefficient, this association measure is symmetric, but it is not normalized.
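
The remark that the chi-square measure is not normalized can be seen numerically: doubling every count in a table doubles the statistic even though the strength of association is unchanged. The sketch below demonstrates this and also computes Cramér's V, one common normalized alternative that this article does not itself mention. The table values and helper names are hypothetical.

```python
from math import sqrt

def chi_square_statistic(observed):
    """Return (chi-square statistic, total count) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = sum(
        (count - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, count in enumerate(row)
    )
    return chi2, n

def cramers_v(observed):
    """Cramér's V: chi-square rescaled to lie between 0 and 1."""
    chi2, n = chi_square_statistic(observed)
    smaller_dim = min(len(observed), len(observed[0])) - 1
    return sqrt(chi2 / (n * smaller_dim))

# Hypothetical table and the same table with every count doubled.
table = [[30, 10], [20, 40]]
doubled = [[2 * count for count in row] for row in table]

chi2_original, _ = chi_square_statistic(table)
chi2_doubled, _ = chi_square_statistic(doubled)
v_original = cramers_v(table)
v_doubled = cramers_v(doubled)
print(f"chi2: {chi2_original:.2f} vs {chi2_doubled:.2f}")
print(f"V:    {v_original:.3f} vs {v_doubled:.3f}")
```

The chi-square value depends on sample size, while the rescaled measure stays the same for both tables.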

What statistics do you use for categorical data?

Frequency tables, pie charts, and bar charts are the most appropriate graphical displays for categorical variables. Below are a frequency table, a pie chart, and a bar graph for data concerning Mental Health Admission numbers.
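
A frequency table of the kind described above can be built with Python's standard library. The category values below are hypothetical and are not the admission data referred to in the text.

```python
from collections import Counter

# Hypothetical categorical observations (illustrative values only).
responses = ["car", "bus", "car", "bike", "car", "bus"]

counts = Counter(responses)
total = sum(counts.values())

# Print each category with its count and its share of all observations.
for category, count in counts.most_common():
    print(f"{category}\t{count}\t{count / total:.1%}")
```

The same counts are what a bar chart or pie chart would display, and a two-way version of this table is the input to a chi-square test.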

Is chi

This test is used to determine if two categorical variables are independent or if they are in fact related to one another. If two categorical variables are independent, then the value of one variable does not change the probability distribution of the other.
