Z-test vs T-test: the differences and when to use each

Explore statistical significance using Z-test vs T-test, understand their differences, when to use them, and how to decide between T-test or Z-test for your hypothesis testing.

Published in Tech matters | 23 April 2024 | 8 min read

Testing is how you determine effectiveness. Whether you work as a data scientist, statistician, or software developer, to ensure quality, you must measure performance. Without tests, you could deploy flawed code, features, or data points.

With that in mind, the use cases of testing are endless. Machine learning models need statistical tests. Data analysis involves statistical tests to validate assumptions. Optimization of any kind requires evaluation. You even need to test the strength of your hypothesis before you begin an inquiry.

Let's explore two inferential statistical tests: the Z-test and the T-test. That way you can understand their differences, their unique purposes, and when to use each.

What is hypothesis testing?

To start, imagine you have a good idea. At the moment of inception, you have no data to back up your idea. It is an unformed thought. But the idea is an excellent starting point that can launch a full investigation. We consider this starting point a hypothesis.

But what if your hypothesis is off-base? You don’t want to dive into a full-scale search if it is a pointless chase with no reward. That is a waste of resources. You need to determine if you have a workable hypothesis.

Enter hypothesis testing. It is a statistical procedure used to assess the viability of a hypothesis. The method checks whether the data provide sufficient evidence to support your idea. If the result shows next to no statistical significance, your hypothesis is not very plausible.

To assess a hypothesis, you compare it against the status quo (also known as the null hypothesis). Your idea is something new that departs from normal conditions (also known as the alternative hypothesis). The two are mutually exclusive: only one of the null and alternative hypotheses can be true in a given set of parameters.

This comparison is what lets you draw a conclusion. If the data would be very unlikely under the null hypothesis, you reject it in favor of the alternative. Otherwise, you keep the null hypothesis.

What is a Z-test?

A Z-test is a statistical test whose test statistic follows a standard normal distribution when the null hypothesis is true. It works with two quantities: the sample mean and the population mean. It tests whether these two quantities are different.

With a Z-test, you know the population standard deviation. That lets you calculate exactly how much a sample mean is expected to vary, which keeps the comparison between the sample mean and the population mean accurate. In addition, a defining characteristic of a Z-test is that it works with large sample sizes (typically more than 30, so that the sample mean is approximately normally distributed, as described by the central limit theorem). These are two crucial criteria for using a Z-test.

Within hypothesis testing, your null hypothesis states there is no difference between the two groups your Z-test will compare. Your alternative hypothesis will state there is a difference that your Z-test will expose.

How to perform a Z-test

A Z-test occurs in the following standard format:

Formulate your hypothesis: First, define the parameters of your alternative and null hypothesis.
Choose a critical value: Second, pick a significance level (alpha), the probability of rejecting the null hypothesis when it is actually true (a Type I error). Common levels are 0.01 (1%) or 0.05 (5%), conventional choices that trade off Type I against Type II errors. The significance level fixes the critical value, the threshold the test statistic must pass before you can reject the null hypothesis.
Collect samples: Obtain the needed data. The sample must be large enough and random.
Calculate a Z-score: Input your data into the standard Z-test formula, Z = (x̄ - μ) / (σ / √n), where Z = test statistic, x̄ = sample mean, μ = population mean, σ = population standard deviation, and n = sample size.

Compare: If the test statistic is more extreme than the critical value, you have achieved statistical significance. The sample mean is so different from the population mean that you can reject the null hypothesis. Your alternative hypothesis (something other than the status quo) is at work, and that's worth investigating.
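The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: it assumes a right-tailed test (the alternative is that the true mean is higher), the function name `one_sample_z_test` is made up for this example, and the numbers passed in at the end are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Right-tailed one-sample Z-test (assumed alternative: true mean > pop_mean)."""
    standard_error = pop_sd / sqrt(n)                # how much a sample mean varies
    z = (sample_mean - pop_mean) / standard_error    # the Z-score
    critical = NormalDist().inv_cdf(1 - alpha)       # threshold set by alpha
    p_value = 1 - NormalDist().cdf(z)                # chance of a Z this large under the null
    return z, critical, p_value, z > critical


# Hypothetical numbers, chosen only to exercise the function.
z, critical, p_value, reject = one_sample_z_test(
    sample_mean=52.0, pop_mean=50.0, pop_sd=8.0, n=64
)
print(f"Z = {z:.2f}, critical value = {critical:.3f}, reject null: {reject}")
```

With these made-up inputs the standard error is 8 / √64 = 1, so the Z-score is 2.0, which is above the 5% right-tailed critical value of roughly 1.645.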

Examples of a Z-test

There are different variations of a Z-test. Let's explore examples of one-sample and two-sample Z-tests.

One-sample Z-test

A one-sample Z-test looks for either an increase or a decrease. There is one sample group involved, taken from a population. We want to see if there is a difference between those two means.

For example, consider a school principal who believes their students' IQ is higher than the state average. The state average is 100 points (the population mean), with a standard deviation of 20 (the population standard deviation). To test this hypothesis, the principal samples 50 students (the sample size) and records their IQ scores. To the principal's delight, the students average 110.

But is the difference statistically significant? The principal plugs the numbers into a Z-test. A Z-score greater than the critical value would indicate statistical significance, supporting the claim that the students have an above-average IQ.
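To make the principal's calculation concrete, here is a short sketch using the figures from the example (population mean 100, population standard deviation 20, 50 students, sample mean 110). The 5% significance level and the right-tailed setup are assumptions added for illustration; the example itself does not specify them.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the principal's example.
pop_mean, pop_sd = 100, 20
n, sample_mean = 50, 110

alpha = 0.05  # assumed significance level (not stated in the example)

z_score = (sample_mean - pop_mean) / (pop_sd / sqrt(n))
critical_value = NormalDist().inv_cdf(1 - alpha)  # right-tailed threshold
p_value = 1 - NormalDist().cdf(z_score)

print(f"Z-score: {z_score:.2f}")
print(f"Critical value: {critical_value:.3f}")
print(f"Reject the null hypothesis: {z_score > critical_value}")
```

The Z-score comes out at about 3.54, well above the critical value of roughly 1.645, so under these assumptions the principal would reject the null hypothesis.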

Two-sample Z-test

A two-sample Z-test compares the means of two sample groups with each other. Its purpose is to determine whether there is a difference between the populations that two independent samples come from.

For example, our principal wants to compare their students' IQ scores to those of the school across the street. They believe their students' average IQ is higher. They don’t need to know the exact numerical increase or decrease. All they want is proof that their students' average scores are higher than the other group's.

To test this hypothesis, the principal will search for statistical significance. They can take a sample of 50 students from their school and a sample of 50 students from the rival school. Now in possession of both sample groups' average IQ (and standard deviations), they hope to find that the two averages are not equal. And they need them to be unequal by a significant amount.

If the test statistic comes in below the critical value, the difference is negligible. There is not enough evidence to say the hypothesis is worth exploring, so the null hypothesis is maintained. The principal would not have enough proof that the IQ levels of the two schools are different.
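A two-sample Z-test can be sketched as follows. The example above gives only the sample sizes (50 students per school), so the means and standard deviations below are hypothetical values chosen for illustration, as are the function name `two_sample_z_test` and the 5% right-tailed setup.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean_a, mean_b, sd_a, sd_b, n_a, n_b, alpha=0.05):
    """Right-tailed two-sample Z-test (assumed alternative: mean of A > mean of B)."""
    # Standard error of the difference between two independent sample means.
    standard_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = (mean_a - mean_b) / standard_error
    critical = NormalDist().inv_cdf(1 - alpha)
    return z, critical, z > critical


# Hypothetical results: our school averages 110, the rival school 104,
# both with a standard deviation of 20 and 50 students sampled.
z, critical, reject = two_sample_z_test(110, 104, 20, 20, 50, 50)
print(f"Z = {z:.2f}, critical value = {critical:.3f}, reject null: {reject}")
```

With these made-up numbers the Z-score is 1.5, which falls short of the critical value of about 1.645. That is the situation described above: the six-point gap is not enough evidence to reject the null hypothesis.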

What is a T-test?

A T-test performs the same crucial function as a Z-test: determining whether there is a difference between the means of two groups. If there is a significant difference, you have statistical support for your hypothesis.

However, a T-test involves a different set of factors. Most importantly, a T-test applies when you do not know the population variance and must estimate it from your sample. Instead of the standard normal distribution, you use the T-distribution, a generalization with heavier tails that accounts for the extra uncertainty. Plus, there is an expectation that you do not possess all the data in a given scenario.

These conditions better match reality, as it is often hard to collect data from entire populations or always obtain a standard normal distribution. That is why T-tests are more widely applicable than Z-tests, though they operate with less precision.
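One way to see the cost in precision is to compare critical values. The sketch below sets the two-sided 5% critical value of the normal distribution next to T-distribution critical values. The T values are copied from a standard t-table rather than computed, because Python's standard library has no T-distribution.

```python
from statistics import NormalDist

# Two-sided 5% critical value for the standard normal distribution.
z_critical = NormalDist().inv_cdf(0.975)

# Two-sided 5% critical values from a standard t-table, keyed by degrees of freedom.
t_critical_by_df = {5: 2.571, 10: 2.228, 20: 2.086, 30: 2.042, 100: 1.984}

print(f"Z critical value: {z_critical:.3f}")
for df, t_critical in t_critical_by_df.items():
    excess = t_critical - z_critical
    print(f"df = {df:>3}: t critical value = {t_critical:.3f} (+{excess:.3f} over Z)")
```

The T critical values are always larger than the Z value, so a T-test demands stronger evidence from small samples. The gap shrinks as the degrees of freedom grow, which is why the two tests give nearly the same answer for large samples.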

How to perform a T-test

A T-test occurs in the following standard format:

Formulate your hypothesis: First, define the parameters of your null and alternative hypothesis.
Choose a critical value: As with a Z-test, pick a significance level, which sets the threshold the test statistic must pass before you can reject the null hypothesis.
Collect data: Obtain the needed data. One of the key differences of a T-test is that it depends on degrees of freedom, which come from the sample size (n - 1 for a single sample), so record the number of observations, the typical values, and the range of values in each group.
Calculate your T-score: Input your data into the T-test formula you chose. Here is a one-sample formula: t = (x̄ - μ) / (s / √n), where x̄ = sample mean, μ = population mean under the null hypothesis, s = sample standard deviation, and n = sample size.

Compare: If the test statistic is more extreme than the critical value, you have achieved statistical significance. The sample mean is so far from the population mean that you likely have a useful hypothesis.
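The same steps can be sketched for a one-sample T-test computed from raw data. Everything specific here is an assumption for illustration: the function name `one_sample_t_statistic`, the eight measurements, the hypothesized mean of 12.0, and the right-tailed 5% setup. The critical value is read from a standard t-table because Python's standard library has no T-distribution.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t_statistic(sample, pop_mean):
    """Return the T-score and degrees of freedom for a one-sample T-test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev() is the sample standard deviation
    t = (mean(sample) - pop_mean) / standard_error
    return t, n - 1


# Hypothetical measurements and hypothesized population mean.
sample = [12.1, 11.6, 12.8, 12.4, 11.9, 12.6, 12.2, 12.0]
t_score, df = one_sample_t_statistic(sample, pop_mean=12.0)

# Right-tailed critical value for alpha = 0.05 and df = 7, from a standard t-table.
critical_value = 1.895
reject_null = t_score > critical_value

print(f"T-score = {t_score:.2f}, df = {df}, reject null: {reject_null}")
```

Note that the standard deviation is estimated from the sample itself, which is the defining difference from the Z-test sketch of the same calculation.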

Examples of a T-test

There are several different kinds of T-tests as well. Let's go through the standard one-sample and two-sample T-tests.

One-sample T-test

A one-sample T-test looks for an increase or decrease compared to a population mean.

For example, your company just went through sales training. Now, the manager wants to know if the training helped improve sales.

Previous company sales data shows an average of $100 on each transaction from all workers. The null hypothesis would be no change. The alternative hypothesis (which you hope the data will support) is that there is an improvement.

To test if there is significance, you take the sales average of 20 salesmen. That is the only available data, and you have no other data from nationwide stores. The average of that sample of salesmen in the past month is $130. We will also assume that the standard deviation is approximately equal.

With this set of factors, you can calculate your T-score with a T-test. You compare the T-score to the critical value, which depends on the number of degrees of freedom (here, 20 - 1 = 19). Since we know that smaller samples carry greater uncertainty, the T-distribution sets a higher bar than a Z-test would, allowing more room for our data to vary.

After comparing, we may find a significant result. That means the data is strong enough to support our hypothesis that sales training likely impacted sales. Of course, this is an estimate, as we only assessed one factor with a small group. Sales could have risen for numerous other reasons. But with our set of assumptions, our hypothesis is supported.
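Here is a sketch of the sales calculation. The historical average ($100), the sample size (20), and the sample average ($130) come from the example. The sample standard deviation of $50 is a purely hypothetical value, since the example does not give one, and the right-tailed 5% critical value for 19 degrees of freedom is taken from a standard t-table.

```python
from math import sqrt

# Figures from the sales training example.
pop_mean = 100      # historical average per transaction
n = 20              # number of salespeople sampled
sample_mean = 130   # their average over the past month

# Hypothetical: the example does not state the sample standard deviation.
sample_sd = 50

degrees_of_freedom = n - 1
t_score = (sample_mean - pop_mean) / (sample_sd / sqrt(n))

# Right-tailed critical value for alpha = 0.05 and df = 19, from a standard t-table.
critical_value = 1.729

print(f"T-score = {t_score:.2f} with {degrees_of_freedom} degrees of freedom")
print(f"Reject the null hypothesis: {t_score > critical_value}")
```

Under the assumed standard deviation, the T-score is about 2.68, above the critical value of 1.729, so the improvement would count as significant. A larger spread in the sales figures would shrink the T-score and could reverse that conclusion.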

Two-sample T-test

A two-sample T-test works the same way as a two-sample Z-test: it tests whether the means of two independent groups are equal, using standard deviations estimated from the samples.

For example, consider native English speakers and non-native speakers. We want to see the effect of maternal language on test scores inside a country. To do that, we will offer both groups a reading test and compare the two groups' average scores.

Of course, the mean of an entire population of language speakers is impossible to procure. Still, we can make some assumptions and work with smaller samples. We take 15 native English speakers and 15 non-native speakers and collect their results. We also choose a significance level, which sets the critical value for the test. If the difference between the two groups' average scores does not push the T-score past that critical value, we fail to reject the null hypothesis. There is no significant difference between the groups, so the impact of maternal language is not worth investigating.
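The reading test comparison can be sketched as a pooled two-sample T-test. This version assumes both groups have roughly equal variance, and the 15 scores per group are invented for illustration. The function name `pooled_t_statistic` is also made up for this example, and the two-sided 5% critical value for 28 degrees of freedom comes from a standard t-table.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """T-score and degrees of freedom for a two-sample T-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances (assumes equal population variances).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical reading test scores, 15 per group.
native = [78, 85, 82, 90, 74, 88, 81, 79, 86, 83, 77, 84, 80, 87, 82]
non_native = [75, 80, 79, 85, 72, 83, 78, 76, 81, 77, 74, 82, 79, 80, 76]

t_score, df = pooled_t_statistic(native, non_native)

# Two-sided critical value for alpha = 0.05 and df = 28, from a standard t-table.
critical_value = 2.048
print(f"T-score = {t_score:.2f}, df = {df}")
print(f"Significant difference: {abs(t_score) > critical_value}")
```

The test is two-sided here because the question is whether the groups differ at all, not whether one specific group scores higher.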

How to know when to use Z-test vs T-test

Both a Z-test and a T-test validate a hypothesis. Both are parametric tests that rely on assumptions. The key difference between Z-test and T-test is in their assumptions (e.g. population variance).

These differences in the data each test requires lead to different applications. You want to use the appropriate tool. Otherwise, you won’t draw valid conclusions from your data.

So when should you use a Z-test vs a T-test? Here are some factors to consider:

Sample size: If the available sample size is small, opt for a T-test. Small samples are more variable, so the wider spread of the T-distribution, which accounts for the error in estimating the standard deviation, is a better fit.
Knowledge of the population standard deviation: Z-tests are more precise and often simpler to execute. So if you know the population standard deviation, use a Z-test.
Test purpose: If you are assessing the validity of a mean, a T-test is the best choice. If you are working with a hypothesized population proportion, go for a Z-test.
Assumption of normality: A Z-test assumes the test statistic follows a standard normal distribution, which requires a known population standard deviation and either normal data or a large sample. This does not apply to all real-world scenarios. If those conditions are not clearly met, opt for a T-test instead.
Type of data: You can only work within the constraints of the available data. The more information the better, but that is often not possible given testing and collecting conditions. If you have limited data describing means between groups, opt for a T-test. If you have large data sets comparing means between populations, you can use a Z-test.
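As a rough summary, the factors above can be folded into a small helper. This is only a rule-of-thumb sketch of this article's guidance, not a statistical standard: the function name `choose_test`, its parameters, and the cutoff of 30 are illustrative assumptions.

```python
def choose_test(sample_size, population_sd_known, testing_proportion=False):
    """Suggest a test using the rules of thumb discussed in this article."""
    if testing_proportion:
        # Hypothesized population proportions are handled with a Z-test.
        return "Z-test"
    if population_sd_known and sample_size >= 30:
        # Both Z-test criteria are met: known standard deviation and a large sample.
        return "Z-test"
    # Small sample or unknown population standard deviation.
    return "T-test"


print(choose_test(sample_size=50, population_sd_known=True))    # Z-test
print(choose_test(sample_size=20, population_sd_known=False))   # T-test
print(choose_test(sample_size=200, population_sd_known=False))  # T-test
```

The last call shows a design choice: when the population standard deviation is unknown, the helper suggests a T-test even for a large sample, since the two tests give nearly identical results at that size.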

Difference between Z-test and T-test: a comparative table

Knowing the key differences with each statistical test makes selecting the right tool far easier. Here is a table that can help you compare:

	T-test	Z-test
1. Purpose	Compare means of small samples (n < 30)	Compare means of large samples (n ≥ 30)
2. Assumptions	Approximately normally distributed data, unknown population standard deviation	Normally distributed data (or a large sample), known population standard deviation
3. Population standard deviation	Unknown	Known
4. Sample size	Small (n < 30)	Large (n ≥ 30)
5. Reference distribution	T-distribution	Standard normal distribution (Z-distribution)
6. Degrees of freedom	n - 1 (one-sample) or n1 + n2 - 2 (two-sample)	Not applicable
7. Use case	Small sample analysis, comparing means between groups	Large sample analysis, population mean comparisons
8. One-sample vs. two-sample	Both	Both
9. Data requirement	Raw data	Raw data
10. Complexity	Relatively more complex	Relatively simpler

Conclusion

Statistical testing lets you determine the validity of a hypothesis. You discover validity by determining if there is a significant difference between your hypothesis and the status quo. If there is, you have a possible idea worth exploring.

That process has numerous applications in data science. You might want to determine the performance of an app with an A/B test. Or you might need to test if an application fits within the defined limits and compare performance metrics. Z-tests and T-tests can show whether there is significant evidence in each of these scenarios. With that information, you can take the appropriate measures to fix bugs or optimize processes.

Z-test and T-test are helpful tools, especially for hypothesis testing. For data engineers of the future, knowledge of statistical testing will only help your work and overall career trajectory.
