Goodness of Fit Test for Normal and Poisson Distributions

Meaning of Goodness of fit test:

A goodness-of-fit test checks whether the sample data are consistent with a hypothesized distribution. Here the check is done with the chi-square distribution (Snedecor and Cochran, 1989).

How to apply:
There are 4 steps to follow:

State the hypotheses: the null hypothesis says the data follow the specified distribution; the alternative says they do not.
Criteria to reject null hypothesis: if Χ² > Χ²(k,1-α) then reject the null hypothesis.
Analyze sample data: compute the chi-square value using the formula below:

Χ² = ∑(Oi - Ei)²/Ei, where Oi is the observed frequency and Ei is the expected frequency

Interpret the results: state the conclusion after comparing Χ² with Χ²(k,1-α), where k is the degrees of freedom and α is the significance level.
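The statistic and the rejection rule can be sketched in a few lines of Python. This is only an illustration: the function names `chi_square_statistic` and `reject_null` are chosen here, the counts are made-up numbers, and the critical value 3.84 is the standard table entry for 1 degree of freedom at α = 0.05.

```python
def chi_square_statistic(observed, expected):
    """Return sum((O_i - E_i)^2 / E_i) over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def reject_null(statistic, critical_value):
    """Rejection rule: reject H0 when the statistic exceeds the table value."""
    return statistic > critical_value


# Made-up counts, purely to show the mechanics (2 categories, k = 1).
observed = [10, 20]
expected = [15, 15]

stat = chi_square_statistic(observed, expected)
print(round(stat, 3), reject_null(stat, critical_value=3.84))
```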

Degrees of Freedom:
k = n - 1 - m
n: number of categories (classes) being compared.
m: number of parameters of the distribution. For the normal distribution m = 2 (μ, σ), and for the Poisson distribution m = 1 (λ).
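As a small sketch of the degrees-of-freedom rule, the helper below (the name `degrees_of_freedom` and the `PARAMS` lookup are invented for this illustration) applies k = n - 1 - m to the category counts used in the two worked examples of this post.

```python
def degrees_of_freedom(num_categories, num_params):
    """k = n - 1 - m, with n categories and m distribution parameters."""
    k = num_categories - 1 - num_params
    if k < 1:
        raise ValueError("too few categories for this many parameters")
    return k


# Parameter counts quoted in the text: normal has 2, Poisson has 1.
PARAMS = {"normal": 2, "poisson": 1}

k_normal = degrees_of_freedom(12, PARAMS["normal"])    # 12 yearly categories
k_poisson = degrees_of_freedom(9, PARAMS["poisson"])   # arrivals 0 to 8
print(k_normal, k_poisson)
```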

Example 1: Goodness of fit test for Normal Distribution

Year-wise data on the number of car accidents are given. Test whether the data follow a normal distribution at α = 5%. (Only the first two columns of the table will be given in the question.) Sample size = 12

Answer:

Step 1: Stating Hypothesis

Null Hypothesis(H0): Data follows normal distribution
Alternative Hypothesis(Ha): Data do not follow normal distribution

Step 2: Criteria to reject null hypothesis:
if Χ² > Χ²(k,1-α) then reject null hypothesis.

Step 3: Analyze sample data:
Compute the last 4 columns of the given table.

YEAR	Oi	Ei	Oi - Ei	(Oi − Ei)^2	(Oi − Ei)^2/Ei
1978	164	146.4	17.6	309.76	2.116
1979	142	146.4	-4.4	19.36	0.132
1980	153	146.4	6.6	43.56	0.298
1981	171	146.4	24.6	605.16	4.134
1982	171	146.4	24.6	605.16	4.134
1983	148	146.4	1.6	2.56	0.017
1984	136	146.4	-10.4	108.16	0.739
1985	133	146.4	-13.4	179.56	1.227
1986	138	146.4	-8.4	70.56	0.482
1987	132	146.4	-14.4	207.36	1.416
1988	145	146.4	-1.4	1.96	0.013
1989	124	146.4	-22.4	501.76	3.427

The remaining columns are computed as follows:
Ei = ΣOi / sample size = 1757 / 12 = 146.4 (rounded to one decimal) for every year; the other columns follow directly from Oi and Ei.
Sum of the last column (Χ²) = 18.135
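The table can be reproduced with a short script. This is a sketch that follows the post's own choice of using the mean, rounded to one decimal, as every Ei; the variable names are chosen here for illustration.

```python
observed = [164, 142, 153, 171, 171, 148, 136, 133, 138, 132, 145, 124]

# Follow the table: the mean is rounded to one decimal and used as every Ei.
expected_each = round(sum(observed) / len(observed), 1)

# One (Oi - Ei)^2 / Ei contribution per year.
contributions = [(o - expected_each) ** 2 / expected_each for o in observed]
chi_square = sum(contributions)

for year, o, c in zip(range(1978, 1990), observed, contributions):
    print(year, o, expected_each, round(o - expected_each, 1), round(c, 3))
print("chi-square =", round(chi_square, 3))
```

Summing the unrounded contributions can differ from the table total in the last digit, because the table adds values that were already rounded to three decimals.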

Now find the value of Χ²(k,1-α) from the chi-square table, where k = 12 - 1 - 2 = 9 and 1 - α = 0.95. The value is 16.92 (the table will be provided in the exam).

Step 4: Interpret the results
Since Χ² = 18.135 > Χ²(k,1-α) = 16.92, we reject the null hypothesis and conclude that the given sample data do not follow the normal distribution.
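The table lookup can be cross-checked numerically. The sketch below is an addition to the post, not part of its method: it evaluates the chi-square CDF through the series expansion of the regularized lower incomplete gamma function, using only the standard library. The function name `chi_square_cdf` and its `max_terms` and `tol` parameters are assumptions made for this illustration.

```python
import math


def chi_square_cdf(x, k, max_terms=500, tol=1e-12):
    """P(X <= x) for a chi-square variable with k degrees of freedom.

    Uses the series for the regularized lower incomplete gamma function
    P(a, z) with a = k / 2 and z = x / 2.
    """
    if x <= 0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


k = 12 - 1 - 2          # degrees of freedom from the example
chi_square = 18.135     # statistic computed in Step 3
critical = 16.92        # table value for k = 9, 1 - alpha = 0.95

print("CDF at the table value:", round(chi_square_cdf(critical, k), 3))
p_value = 1.0 - chi_square_cdf(chi_square, k)
print("p-value:", round(p_value, 3), "reject H0:", p_value < 0.05)
```

Comparing the p-value with α is equivalent to comparing Χ² with the table value, so both routes should lead to the same decision.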

Example 2: Goodness of fit test for Poisson Distribution

The number of arrivals per minute was observed at a bank located in the central business district of a city, over 200 one-minute periods during a week. The results are summarized in the table below. Test whether the data follow a Poisson distribution (α = 5%). Every expected frequency should be at least 1.

ARRIVALS	FREQUENCY
0	14
1	31
2	47
3	41
4	29
5	21
6	10
7	5
8	2
9 or more	0

Answer:

Step 1: Stating Hypothesis

H0: The number of arrivals per minute follows a Poisson distribution
H1: The number of arrivals per minute does not follow a Poisson distribution

Step 2: Criteria to reject null hypothesis:
if Χ² > Χ²(k,1-α) then reject null hypothesis.

Step 3: Analyze sample data:
The Poisson distribution has one parameter, its mean λ, which can be estimated from the given data using the formula below:

X-bar = (∑(fi * mi))/∑fi, where mi is the number of arrivals and fi is its frequency

Computing this from the data gives λ = X-bar = 580/200 = 2.90

ARRIVALS	FREQUENCY	fi*mi
0	14	0
1	31	31
2	47	94
3	41	123
4	29	116
5	21	105
6	10	60
7	5	35
8	2	16
9 or more	0	0
Total	200	580
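The same estimate of λ can be computed directly from the frequency table. In this sketch the variable names are chosen for illustration, and the open category "9 or more" is coded as 9, which does not affect the result here because its frequency is 0.

```python
arrivals = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]         # 9 stands for "9 or more"
frequency = [14, 31, 47, 41, 29, 21, 10, 5, 2, 0]

total_periods = sum(frequency)                                     # sum of fi
total_arrivals = sum(f * m for f, m in zip(frequency, arrivals))   # sum of fi*mi

lam = total_arrivals / total_periods                               # X-bar
print(total_periods, total_arrivals, lam)
```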

Using the Poisson probability table (or the formula below), the probability of X arrivals (X = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 or more) can be determined.
The theoretical frequency for each X is obtained by multiplying the corresponding Poisson probability by the sample size n. These results are summarized in the table below (n = 200, given in the example).

For the Poisson distribution use the formula: P(X = x) = (e^-λ * λ^x)/x!

ARRIVALS	FREQUENCY	PROBABILITY P(X) FOR POISSON DISTRIBUTION WITH λ = 2.9	THEORETICAL FREQUENCY = n*P(X)
0	14	0.055	11
1	31	0.1596	31.92
2	47	0.2314	46.28
3	41	0.2237	44.74
4	29	0.1622	32.44
5	21	0.094	18.8
6	10	0.0455	9.1
7	5	0.0188	3.76
8	2	0.0068	1.36
9 or more	0	0.003	0.6
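The probabilities and theoretical frequencies in the table can be recomputed from the Poisson formula rather than read from a printed table. This is a minimal sketch: `poisson_pmf` is a name chosen here, and the probability of "9 or more" is taken as whatever remains after X = 0 to 8.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam, n = 2.9, 200

probs = [poisson_pmf(x, lam) for x in range(9)]   # X = 0 .. 8
probs.append(1.0 - sum(probs))                    # "9 or more" is the remaining tail

for label, p in zip(list(range(9)) + ["9 or more"], probs):
    print(label, round(p, 4), round(n * p, 2))
```

The frequencies printed here use unrounded probabilities, so they can differ slightly from the table, which multiplies probabilities already rounded to three or four decimals.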

Observe from the table above that the theoretical frequency of 9 or more arrivals is less than 1.0.
So that every category has a theoretical frequency of 1.0 or greater, the category 9 or more is combined with the category of 8 arrivals. The combined category has probability 0.0068 + 0.003 = 0.0098 and theoretical frequency 1.36 + 0.6 = 1.96, as below:

ARRIVALS	FREQUENCY	PROBABILITY P(X) FOR POISSON DISTRIBUTION WITH λ = 2.9	THEORETICAL FREQUENCY = n*P(X)
0	14	0.055	11
1	31	0.1596	31.92
2	47	0.2314	46.28
3	41	0.2237	44.74
4	29	0.1622	32.44
5	21	0.094	18.8
6	10	0.0455	9.1
7	5	0.0188	3.76
8 or more	2	0.0098	1.96
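Combining a sparse last category with its neighbour can be sketched as a small helper. The function name `merge_small_tail` and its `minimum` parameter are invented for this illustration, and the helper only handles the upper tail, which is the case that arises in this example.

```python
def merge_small_tail(observed, expected, minimum=1.0):
    """Fold the last category into its neighbour while its expected
    frequency is below `minimum`. Returns new (observed, expected) lists."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        last_o, last_e = obs.pop(), exp.pop()
        obs[-1] += last_o
        exp[-1] += last_e
    return obs, exp


# Values from the unmerged table (arrivals 0 to 8, then "9 or more").
observed = [14, 31, 47, 41, 29, 21, 10, 5, 2, 0]
expected = [11, 31.92, 46.28, 44.74, 32.44, 18.8, 9.1, 3.76, 1.36, 0.6]

obs_merged, exp_merged = merge_small_tail(observed, expected)
print(len(obs_merged), obs_merged[-1], round(exp_merged[-1], 2))
```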

We now apply the chi-square test to determine whether the data follow a Poisson distribution. The statistic is computed using the formula below:

∑(Oi - Ei)²/Ei : Oi is observed frequency and Ei is expected frequency

k (degrees of freedom) = n - 1 - m = 9 - 1 - 1 = 7
n is 9 because there are 9 categories of arrivals (0 to 8): the category 9 or more was combined with 8 so that all theoretical frequencies are at least 1.
m is 1 because the Poisson distribution has only one parameter, λ.

ARRIVALS	FREQUENCY (Oi)	PROBABILITY P(X) FOR POISSON DISTRIBUTION WITH λ = 2.9	THEORETICAL FREQUENCY (Ei) = n*P(X)	Oi - Ei	(Oi - Ei)^2	(Oi - Ei)^2/Ei
0	14	0.055	11	3.00	9	0.818181818
1	31	0.1596	31.92	-0.92	0.8464	0.026516291
2	47	0.2314	46.28	0.72	0.5184	0.011201383
3	41	0.2237	44.74	-3.74	13.9876	0.312641931
4	29	0.1622	32.44	-3.44	11.8336	0.364784217
5	21	0.094	18.8	2.20	4.84	0.257446809
6	10	0.0455	9.1	0.90	0.81	0.089010989
7	5	0.0188	3.76	1.24	1.5376	0.40893617
8 or more	2	0.0098	1.96	0.04	0.0016	0.000816327
					Total	2.28954

Now find the value of Χ²(k,1-α) from the chi-square table, where k = 9 - 1 - 1 = 7 and 1 - α = 0.95. The value is 14.07 (the table will be provided in the exam).

Step 4: Interpret the results

Since Χ² = 2.28954 < 14.07, we do not reject H0.

There is insufficient evidence to conclude that the arrivals per minute do not follow a Poisson distribution.
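Steps 3 and 4 of this example can be reproduced with a short script. This is a sketch that takes the observed and theoretical frequencies from the combined table and the critical value 14.07 from the chi-square table as given; the variable names are chosen here.

```python
# Observed and theoretical frequencies after combining "9 or more" with 8.
observed = [14, 31, 47, 41, 29, 21, 10, 5, 2]
expected = [11, 31.92, 46.28, 44.74, 32.44, 18.8, 9.1, 3.76, 1.96]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

k = len(observed) - 1 - 1    # 9 categories, 1 parameter (lambda)
critical = 14.07             # table value for k = 7, 1 - alpha = 0.95

reject = chi_square > critical
print(round(chi_square, 5), k, "reject H0:", reject)
```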

Example 3:
The manager of a computer network has collected data on the number of times that service has been interrupted on each day over the past 500 days. The results are as follows:

INTERRUPTIONS PER DAY	NUMBER OF DAYS
0	160
1	175
2	86
3	41
4	18
5	12
6	8
Total	500

Does the distribution of service interruptions follow a Poisson distribution? (Use the 0.01 level of significance.)

Example 4:
A random sample of 500 long-distance telephone calls revealed the following distribution of call length (in minutes):

Length in Minutes	Frequency
0–under 5	48
5–under 10	84
10–under 15	164
15–under 20	126
20–under 25	50
25–under 30	28
Total	500

At the 0.05 level of significance, does call length follow a normal distribution?
