In statistics, researchers perform various tests to analyze their data, and each test produces a test statistic. A test statistic measures how closely your sample data match the distribution predicted under the null hypothesis. Which statistic you get depends on the analysis method you use, and its value indicates how consistent your data are with the predicted distribution for that specific test.
The distribution of data accounts for the frequency at which observations occur when performing statistical tests and shows the central tendency and variation. Because there are different statistical tests you can use to analyze data distribution, the central tendency and variance measures differ based on the type of hypothesis you predict for a certain test. To better understand different test statistics, it’s important to distinguish between what a null and an alternative hypothesis are:
- Null hypothesis: This hypothesis proposes that the means of two distinct sample groups are equal. When performing statistical tests, the goal is either to reject the null hypothesis or to fail to reject it; a statistical test can't prove the null hypothesis correct.
- Alternative hypothesis: Alternative hypotheses propose that there is a significant difference between two samples and that the variations between the groups result in unequal means. If the evidence supports the alternative hypothesis during statistical analysis, you reject the null hypothesis.
Types of test statistics
The following test statistics are some of the common applications data professionals use when performing statistical analysis.
1. T-value
The t-value is one type of test statistic that results from performing either t-tests or regression tests. Evaluating the t-value involves testing a null hypothesis where the means of both test samples are equal. If you perform a t-test or regression test and find the means are not equal, you reject the null hypothesis in favor of the alternative hypothesis. You can calculate a t-value for a one-sample t-test with the formula: t = (X‾ – μ0) / (s / √n) where X‾ is the sample mean, μ0 represents the hypothesized population mean, s is the standard deviation of the sample and n stands for the size of the sample.
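As a sketch, the formula above can be computed directly in Python. The sample values here are hypothetical, chosen only to illustrate the calculation:

```python
import math

def t_statistic(sample_mean, pop_mean, sample_sd, n):
    """One-sample t-statistic: t = (x-bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - pop_mean) / (sample_sd / math.sqrt(n))

# Hypothetical sample: mean 5.2, hypothesized population mean 5.0,
# sample standard deviation 0.8, sample size 25
t = t_statistic(5.2, 5.0, 0.8, 25)  # t = 0.2 / (0.8 / 5) = 1.25
```

A t-value this close to zero would typically not be enough to reject the null hypothesis at conventional alpha levels.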
2. Z-value
The z-value is another common test statistic where the null hypothesis suggests the means of two populations are equal. Unlike the t-test, which estimates the standard deviation from a sample, the z-test applies when the population standard deviation is known. The z-score is also important for calculating the probability of a data value appearing within the standard normal distribution. This allows for the comparison of two z-values from different sample groups with varying standard deviations and mean values. To get the z-value, you can use the formula: z = (X – μ) / σ where X represents the raw data or score, μ is the mean of the population and σ is the standard deviation for the population.
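The z-score formula is a one-line calculation; this minimal sketch uses made-up population values:

```python
def z_score(x, mu, sigma):
    """z = (X - mu) / sigma: distance from the mean in standard deviations."""
    return (x - mu) / sigma

# Hypothetical population: mean 60, standard deviation 10, raw score 75
z = z_score(75, 60, 10)  # z = 1.5, i.e. 1.5 standard deviations above the mean
```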
3. P-value
The p-value in a statistical test helps you determine whether to reject the null hypothesis. It represents the probability of observing a test statistic at least as extreme as the one you measured, assuming the null hypothesis is true, and you evaluate it against the alpha value and critical value. A smaller p-value suggests rejecting the null hypothesis, whereas a higher p-value indicates insufficient evidence to reject it.
The p-value is a probability measure that depends on the degrees of freedom and the alpha value of a t-test. For a one-sample t-test, take the sample size n and subtract one to get the degrees of freedom (n – 1). Comparing the t-value against the t distribution with those degrees of freedom, at your chosen alpha level, gives you the p-value. It's important to note that p-values depend on the results t-tests give you and change with different t-statistics.
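Computing an exact p-value from a t-statistic requires the Student's t distribution, but for large samples that distribution is close to the standard normal, which the Python standard library can evaluate via the complementary error function. This is a sketch of that normal approximation, not an exact t-test p-value:

```python
import math

def two_sided_p(z):
    """Two-sided tail probability of |z| under the standard normal
    (a large-sample approximation to the t distribution)."""
    return math.erfc(abs(z) / math.sqrt(2))

p = two_sided_p(1.96)  # about 0.05, the conventional alpha threshold
```

For small samples, statistical software evaluates the t distribution with the appropriate degrees of freedom instead.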
4. F-value
An f-value is a test statistic that you can get from an analysis of variance (ANOVA). This statistical test compares the means of two or more independent samples. The f-value is the ratio of the variation between the group means to the variation within the groups. If the f-value is close to one, the between-group variation is no larger than you would expect by chance, and the null hypothesis of equal means holds. If the f-value is much larger than one, the variation between groups outweighs the variation within them, which supports rejecting the null hypothesis. Calculating the f-value involves several sums of squares, which many data scientists compute with statistical software.
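The between-group versus within-group ratio can be sketched by hand for a one-way ANOVA; the three groups below are hypothetical:

```python
from statistics import mean

def f_value(*groups):
    """One-way ANOVA F = between-group mean square / within-group mean square."""
    k = len(groups)                           # number of groups
    n = sum(len(g) for g in groups)           # total observations
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical samples from three groups; the third group's mean stands apart
f = f_value([1, 2, 3], [2, 3, 4], [5, 6, 7])  # F = 13.0
```

An F of 13 means the variation between the group means is thirteen times the variation within the groups, which is strong evidence against equal means.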
5. Chi-square (χ²) value
The chi-square (χ²) value comes from non-parametric tests that measure whether two categorical variables are associated. The null hypothesis for a chi-square test of independence is that the variables are independent, so a large χ² value is evidence of a relationship. This makes the test statistic useful when preparing variables for regression analysis, as it can tell you whether two candidate variables already display a relationship.
How to calculate a test statistic
Use the following steps to calculate common test statistics from z-tests and t-tests.
1. Find the raw scores of the populations
Assume you want to perform a z-test to determine whether the means of two populations are equal. To calculate the z-score, find the raw score you're evaluating against the population. As an example, assume the raw score is 95. You would substitute this value in the z-score formula as: z = (95 – μ) / σ
2. Calculate the standard deviation of the population
Find the standard deviation of the population you're evaluating by calculating the square root of the variance. For instance, if a data set has a variance of 256, the standard deviation is 16, because the square root of 256 = 16. When you find the standard deviation, substitute it in the z-score formula. Continuing the previous example, substitute a standard deviation of 16 for the σ variable in the formula: z = (95 – μ) / 16
3. Calculate the population mean
Find the population mean by adding up all data values and dividing this sum by the number of data values in the population. This value replaces the μ variable in the formula. For example, assume you add all data points in your population and get a sum of 18,346. If you have 468 values in the data set, this gives you a mean of about 39.2. Applying this value to the z-score formula with the previous example values results in: z = (95 – 39.2) / 16 = 3.4875
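Putting the steps so far together, the example z-score can be checked in a few lines:

```python
mu = 18346 / 468        # population mean, about 39.2 (step 3)
sigma = 16              # population standard deviation (step 2)
raw_score = 95          # raw score being tested (step 1)

z = (raw_score - mu) / sigma   # about 3.49
```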
4. Evaluate the z-value
Evaluating the z-score tells you where a raw score falls within the distribution of the population you're studying. It shows how many standard deviations the raw score lies from the mean. A z-value of zero indicates the raw score you're testing is equal to the mean. A positive z-value shows the raw score is higher than the average, and a negative z-value shows the raw score is less than the average.
5. Apply the t-test formula
You can use a t-test to evaluate a sample drawn from a larger population. To apply the t-value formula, determine the sample size, the sample mean and the standard deviation of the sample. Assume you have a sample size of 50, a sample mean of 17 and a standard deviation of 3.5 for the sample. If the hypothesized population mean is 78, apply the t-score formula to get: t = (X‾ – μ0) / (s / √n) = (17 – 78) / (3.5 / √50) ≈ -61 / 0.495 ≈ -123.24
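A quick way to check this arithmetic is to compute the standard error and t-score directly; note that the divisor in the formula is s/√n, the standard error of the mean:

```python
import math

sample_mean = 17
pop_mean = 78        # hypothesized population mean
sample_sd = 3.5
n = 50

standard_error = sample_sd / math.sqrt(n)       # about 0.495
t = (sample_mean - pop_mean) / standard_error   # about -123.24
```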
6. Interpret the results
Interpreting the t-value tells you how the sample compares with the null hypothesis. A negative t-value means the sample mean falls below the hypothesized population mean, and the larger the magnitude of the t-value, the stronger the evidence against the null hypothesis. It's also important to determine whether additional statistical tests are necessary for the type of data you're evaluating. For instance, the z-test and t-test can be efficient for analyzing exam scores, while evaluating an f-value requires an analysis of variance across several sample groups.