By Stephen Nelson
The chi-square distribution is commonly used to make inferences about a population variance.
If a population follows the normal distribution, you can draw a sample of size N from this distribution and form the sum of the squared standardized scores (chi-square). This random variable chi-square follows the chi-square probability distribution with n degrees of freedom (df ), where n is a positive integer equal to N-1. The degrees of freedom parameter determines the shape of the distribution. With more degrees of freedom, the skew is less.
The CHIDIST function returns the area in the upper tail of the chi-square distribution. You use the CHIDIST function the same way you would use a chi-square distribution table. The CHIDIST function uses the following syntax:
=CHIDIST (x, df)
For example, if you pull a random sample of 16 from a population and want to find the probability of a sample chi-square value (x) 25 or larger, you would enter:
The function returns the value 0.049943, meaning that a value of 25 or more should in the long run occur about five times in a hundred.
You can use the CHIINV function to create confidence interval estimates of a population variance. That is, you use the CHIDIST function if you know x and want to find the probability, and you use the CHIINV function if you have a probability and want to find x. For example, if you’re creating a product and weigh a sample of 18 units to find a sample variance of 0.36, you may want to construct a 90% confidence interval estimate of the population variance for the product. With a sample size of 18, you have 17 degrees of freedom.
To find the upper limit, enter:
To find the lower limit, enter:
These formulas return the values 8.67175 and 27.5871. Multiply the sample variance of 0.36 by the degrees of freedom and divide this product by each of the values returned from the CHIINV function to find the lower and upper limits of the confidence interval. You can take the square root of these values to establish interval estimates of the population standard deviation.
The chi-square test is used to test independence of two variables. You can use the chi-square test to determine whether there is a significant difference between observed and expected frequencies. For example, if you want to find out whether soft drink preference differs between male and female drinkers, you can construct a null hypothesis that soft drink preference is independent of the gender of the drinker, and create a worksheet range, or table, of expected results based on a sample of 93 male drinkers and 85 female drinkers. You can then create a table of the results of the actual study findings.
TIP: You can use the Microsoft Excel Fisher’s test function instead of the chi-square test for analyzing contingency tables with two rows and two columns. Fisher’s test always returns the exact P value, whereas the chi-square test returns only an approximate p value. Definitely avoid the chi-square test when the numbers in the contingency table are very small (in the single digits).
The CHITEST formula uses the following syntax:
=CHITEST (actual range, expected range)
where actual range is the data in the actual sample results table and expected range is the data from the expected results table.
The formula returns the p-value. You reject the null hypothesis if this value is less than your level of significance alpha. So if your level of significance is .05, you would reject it, but not if your level of significance is .025 or .01. The test for independence is a one-tailed test, so a level of significance of .05 corresponds with a 95% confidence level.
About the author: Seattle CPA Stephen L. Nelson wrote the bestselling book, MBA’s Guide to Microsoft Excel, from which this short article is adapted. Nelson also writes and edits the S Corporations Explained and LLCs Explained websites.
Want to read more about MS Excel tips and tutorials? Visit Hot Excel.