## Chi Square Test

The chi-square test is a 'goodness-of-fit' test. In other words, how closely do the observations of an event match the expected results? The chi-square statistic ($\chi^2$) is calculated by taking, for each category, the squared difference between the observed result (obs) and the expected result (exp), dividing it by the expected result, and summing these values over all categories.
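The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name `chi_square_statistic` and the example counts are hypothetical and chosen only to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Sum of (obs - exp)^2 / exp over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 100 observations in two categories,
# with 50 expected in each category.
observed_counts = [45, 55]
expected_counts = [50, 50]
print(chi_square_statistic(observed_counts, expected_counts))  # prints 1.0
```

Each category contributes (5 × 5) / 50 = 0.5 in this example, so the statistic is 1.0. The larger the gap between observed and expected counts, the larger the resulting value.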

D.N.A. Box 19.1 What does it mean to be 'statistically significant'?

Connected to the process of hypothesis testing is the concept of a statistically significant result, which involves a probability value or 'p-value'. A p-value reflects the probability that a variable being measured would assume a value greater than or equal to the observed value strictly by chance. In mathematical terms, this can be described as $P(z \geq z_{\text{observed}})$. The threshold at which a p-value is considered significant is set by an 'alpha value' ($\alpha$). With the commonly used 95% confidence limit, $\alpha = 0.05$ (since 100% − 95% = 5%, or 0.05).

A variety of alpha values are used in different fields, but probably the most common is 0.05, corresponding to a 95% confidence interval around a measurement. Thus, if a p-value is greater than 0.05, the test statistic and comparison are considered 'not significant'. If the p-value is between 1% and 5% (0.01 to 0.05), it is generally considered 'significant', in which case the value can be denoted with an asterisk (e.g., 0.0435*). When the computed p-value is less than 1% (0.01), it is considered 'highly significant' and can be marked with a double asterisk (e.g., 0.00273**).

Thus, in cases where the p-value, which is the probability of obtaining the observed result or a more extreme one, is less than the conventional 0.05, we conclude that there is a 'significant relationship' between the two classification factors. However, it is important to keep in mind that the outcome of significance testing depends heavily on how the question is framed as part of the hypothesis testing.
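The labeling convention described in this box can be expressed as a small Python helper. This is only a sketch: the function name `significance_label` and its parameter names are hypothetical, and the treatment of values falling exactly on 0.01 or 0.05 is an assumption based on the wording above (greater than 0.05 is 'not significant', less than 0.01 is 'highly significant').

```python
def significance_label(p_value, alpha=0.05, strict_alpha=0.01):
    """Return (label, marker) for a p-value using the 5% / 1% convention."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < strict_alpha:
        return ("highly significant", "**")
    if p_value <= alpha:
        # Assumption: boundary values (0.01 and 0.05) count as 'significant'.
        return ("significant", "*")
    return ("not significant", "")


# The two example values quoted in the text, plus one above the threshold.
for p in (0.0435, 0.00273, 0.2):
    label, marker = significance_label(p)
    print(f"{p}{marker}: {label}")
```

Fields that use a different alpha value can pass it through the `alpha` argument instead of relying on the 0.05 default.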

The resulting chi-square value is then compared against a table of chi-square values to see whether there is a significant deviation from the 'normal' values expected. High chi-square values indicate discrepancies between observed and expected results. Different 'degrees of freedom' may apply to the data depending on the situation. Paul Lewis at the University of Connecticut has created a freeware program that quickly converts user-entered chi-square values and degrees of freedom to the corresponding p-value. This program is available at: http://lewis.eeb.uconn.edu/lewishome/software.html.
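As an alternative to a printed table, the conversion from a chi-square value and degrees of freedom to a p-value can be sketched with the Python standard library alone. This is not the method used by the program mentioned above; it is an illustrative sketch that assumes whole-number degrees of freedom and uses the standard recurrence for the upper tail of the chi-square distribution. The function name `chi_square_p_value` is hypothetical.

```python
import math


def chi_square_p_value(chi2, df):
    """Upper-tail probability P(X >= chi2) for a chi-square
    distribution with a whole number of degrees of freedom."""
    if chi2 < 0:
        raise ValueError("the chi-square value must be non-negative")
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive whole number")
    y = chi2 / 2.0
    if df % 2 == 0:
        a = 1.0                          # closed form for df = 2
        q = math.exp(-y)
    else:
        a = 0.5                          # closed form for df = 1
        q = math.erfc(math.sqrt(y))
    # Each pass adds one term, raising the degrees of freedom by 2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# 3.841 is the familiar 5% critical value for one degree of freedom.
print(chi_square_p_value(3.841, 1))
```

A chi-square value of 3.841 with one degree of freedom returns a p-value very close to 0.05, which matches the tabulated critical value at the 95% confidence level. Larger chi-square values at the same degrees of freedom give smaller p-values.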