# Statistics – Goodness of Fit

A goodness-of-fit test checks whether sample data are consistent with a hypothesized population distribution, such as a normal or a Weibull distribution. In simple words, it indicates whether the sample looks like what we would expect to draw from the actual population. Statisticians generally use the following tests:

·        Chi-square

·        Kolmogorov-Smirnov

·        Anderson-Darling

·        Shapiro-Wilk

## Chi-square Test

The chi-square test is the most commonly used goodness-of-fit test. It is used for discrete distributions such as the binomial and Poisson distributions, whereas the Kolmogorov-Smirnov and Anderson-Darling goodness-of-fit tests are used for continuous distributions.

## Formula

$$X^2 = \sum \left[ \frac{(O_i - E_i)^2}{E_i} \right]$$

Where:

·        $O_i$ = observed value of the ith level of the variable.

·        $E_i$ = expected value of the ith level of the variable.

·        $X^2$ = chi-squared random variable.
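The formula maps directly onto a few lines of code. The following is a minimal Python sketch, not a reference implementation; the function name `chi_square_statistic` and the coin-toss counts are illustrative assumptions that do not come from the text above.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all levels."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for illustration only: 40 coin tosses,
# with 20 heads and 20 tails expected from a fair coin.
example_stat = chi_square_statistic([18, 22], [20, 20])
print(round(example_stat, 2))
```

Each level contributes its squared deviation scaled by the expected count, so levels with small expected counts weigh more heavily for the same absolute deviation.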

## Example

A toy company builds football player toys. It claims that 30% of the toys are mid-fielders, 60% are defenders, and 10% are forwards. A random sample of 100 toys contains 50 mid-fielders, 45 defenders, and 5 forwards. At the 0.05 level of significance, can you justify the company’s claim?

Solution:

### Determine Hypotheses

·        Null hypothesis $H_0$ – The proportions of mid-fielders, defenders, and forwards are 30%, 60% and 10%, respectively.

·        Alternative hypothesis $H_1$ – At least one of the proportions in the null hypothesis is false.

### Determine Degrees of Freedom

The degrees of freedom (DF) equal the number of levels (k) of the categorical variable minus 1: DF = k − 1. Here there are 3 levels. Thus

$$DF = k - 1 = 3 - 1 = 2$$
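The test statistic also needs an expected count for each level, which is the sample size multiplied by the proportion claimed under the null hypothesis: 100 × 0.30 = 30 mid-fielders, 100 × 0.60 = 60 defenders, and 100 × 0.10 = 10 forwards. A short Python sketch of this step and of the degrees of freedom follows; the variable names are illustrative.

```python
# Proportions claimed by the company (the null hypothesis).
claimed = {"mid-fielders": 0.30, "defenders": 0.60, "forwards": 0.10}
sample_size = 100

# Expected count per level = sample size * claimed proportion.
expected = {level: sample_size * p for level, p in claimed.items()}

# Degrees of freedom = number of levels minus one.
degrees_of_freedom = len(claimed) - 1

for level, count in expected.items():
    print(f"{level}: {count:.0f}")
print("DF =", degrees_of_freedom)
```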

### Determine chi-square test statistic

$$X^2 = \sum \left[ \frac{(O_i - E_i)^2}{E_i} \right] = \left[ \frac{(50 - 30)^2}{30} \right] + \left[ \frac{(45 - 60)^2}{60} \right] + \left[ \frac{(5 - 10)^2}{10} \right] = \frac{400}{30} + \frac{225}{60} + \frac{25}{10} = 13.33 + 3.75 + 2.50 = 19.58$$
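The arithmetic can be checked term by term with a small Python sketch. The observed counts come from the sample; the expected counts are the sample size of 100 multiplied by the claimed proportions. Variable names are illustrative.

```python
observed = [50, 45, 5]    # mid-fielders, defenders, forwards in the sample
expected = [30, 60, 10]   # 100 toys times the claimed 30%, 60%, 10%

# One term per level: (O_i - E_i)^2 / E_i
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(terms)

print([round(t, 2) for t in terms])
print(round(chi_square, 2))
```

The mid-fielder term dominates the total, which shows that the largest disagreement with the claim lies in that category.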

### Determine p-value

The P-value is the probability that a chi-square statistic $X^2$ with 2 degrees of freedom is more extreme than 19.58. Use the Chi-Square Distribution Calculator to find $P(X^2 > 19.58) = 0.0001$.
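If no calculator is at hand, the value can be cross-checked in code. For exactly 2 degrees of freedom, the chi-square distribution coincides with an exponential distribution with mean 2, so its upper-tail probability has the closed form exp(-x/2). The sketch below relies on that special case only; other degrees of freedom would need a statistics library. The function name is an illustrative assumption.

```python
import math


def chi_square_upper_tail_df2(x):
    """P(X^2 > x) for a chi-square variable with exactly 2 degrees of freedom."""
    if x < 0:
        return 1.0  # the statistic is never negative, so the whole mass lies above x
    return math.exp(-x / 2)


p_value = chi_square_upper_tail_df2(19.58)
print(f"{p_value:.6f}")
```

The result is roughly 0.000056, which rounds to 0.0001 at four decimal places.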

### Interpret results

As the P-value (0.0001) is far smaller than the significance level (0.05), the null hypothesis is rejected. Thus the sample data do not support the company’s claim.
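The decision rule amounts to a single comparison. A minimal Python sketch, with the hypothetical helper name `reject_null`:

```python
def reject_null(p_value, alpha):
    """Return True when the p-value falls below the significance level."""
    return p_value < alpha


alpha = 0.05       # significance level given in the example
p_value = 0.0001   # p-value reported in the example

print("reject H0" if reject_null(p_value, alpha) else "fail to reject H0")
```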