Meaning of Goodness of fit test:
A goodness-of-fit test checks how well a hypothesized distribution fits the sample data. The test is carried out using the chi-square distribution (Snedecor and Cochran, 1989).
How to apply:
There are 4 steps to follow:
- State the hypotheses: H0: the data follow the specified distribution; Ha: the data do not follow it.
- Criterion to reject the null hypothesis: if χ² > χ²(k, 1−α), reject the null hypothesis.
- Analyze the sample data: compute the chi-square value using the formula below:
- χ² = ∑(Oi − Ei)²/Ei, where Oi is the observed frequency and Ei is the expected frequency
- Interpret the results: compare χ² with χ²(k, 1−α), where k is the degrees of freedom and α is the significance level, and state the conclusion.
Degrees of Freedom:
k = n − 1 − m
n: number of categories being compared. m: number of parameters of the distribution estimated from the data. For a normal distribution m = 2 (μ, σ), and for a Poisson distribution m = 1 (λ).
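The four steps and the degrees-of-freedom rule can be sketched in a few lines of Python. This is only an illustration: the function names and the counts are hypothetical, and the critical value 7.815 is the standard chi-square table entry for k = 3 and 1 − α = 0.95, not a number taken from the examples that follow.

```python
def chi_square_statistic(observed, expected):
    """Sum of (Oi - Ei)^2 / Ei over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(n_categories, n_params):
    """k = n - 1 - m."""
    return n_categories - 1 - n_params


def reject_null(statistic, critical_value):
    """Rejection criterion: reject H0 when the statistic exceeds the table value."""
    return statistic > critical_value


# Hypothetical counts, for illustration only.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]   # fully specified in advance, so m = 0

stat = chi_square_statistic(observed, expected)
k = degrees_of_freedom(len(observed), n_params=0)
critical = 7.815              # table value for k = 3, 1 - alpha = 0.95
print(stat, k, reject_null(stat, critical))
```

The critical value is passed in by hand because, as in the exam setting described here, it is read from a chi-square table rather than computed.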
Example 1: Goodness of fit test for Normal Distribution
Year-wise data on the number of car accidents are given. Test whether the data follow a normal distribution at α = 5%. (In the question, only the first two columns will be given.) Sample size = 12.
Answer:
Step 1: Stating Hypothesis
Null Hypothesis (H0): The data follow a normal distribution
Alternative Hypothesis (Ha): The data do not follow a normal distribution
Step 2: Criteria to reject null hypothesis:
If χ² > χ²(k, 1−α), reject the null hypothesis.
Step 3: Analyze sample data:
Compute the last 4 columns of the given table.
YEAR | Oi | Ei | Oi - Ei | (Oi − Ei)^2 | (Oi − Ei)^2/Ei |
---|---|---|---|---|---|
1978 | 164 | 146.4 | 17.6 | 309.76 | 2.116 |
1979 | 142 | 146.4 | -4.4 | 19.36 | 0.132 |
1980 | 153 | 146.4 | 6.6 | 43.56 | 0.298 |
1981 | 171 | 146.4 | 24.6 | 605.16 | 4.134 |
1982 | 171 | 146.4 | 24.6 | 605.16 | 4.134 |
1983 | 148 | 146.4 | 1.6 | 2.56 | 0.017 |
1984 | 136 | 146.4 | -10.4 | 108.16 | 0.739 |
1985 | 133 | 146.4 | -13.4 | 179.56 | 1.227 |
1986 | 138 | 146.4 | -8.4 | 70.56 | 0.482 |
1987 | 132 | 146.4 | -14.4 | 207.36 | 1.416 |
1988 | 145 | 146.4 | -1.4 | 1.96 | 0.013 |
1989 | 124 | 146.4 | -22.4 | 501.76 | 3.427 |
The remaining columns are computed as follows:
Ei = Total (ΣOi)/sample size = 1757/12 = 146.4; the other columns follow directly from Oi and Ei.
Sum of the last column (χ²) = 18.135
Now look up χ²(k, 1−α) in the chi-square table, where k = 12 − 1 − 2 = 9 and 1 − α = 0.95. The value is 16.92 (the table will be provided in the exam).
Step 4: Interpret the results
Since χ² = 18.135 > χ²(9, 0.95) = 16.92, we reject the null hypothesis and conclude that the given sample data do not follow a normal distribution.
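The arithmetic of Example 1 can be checked with a short Python sketch. It follows the table as given: every year gets the same expected frequency (the mean of the observed counts, rounded to one decimal place), and the critical value 16.92 is the table value quoted above. The variable names are illustrative.

```python
observed = [164, 142, 153, 171, 171, 148, 136, 133, 138, 132, 145, 124]

# Same expected frequency for every year, rounded to one decimal as in the table.
expected_each = round(sum(observed) / len(observed), 1)   # 1757 / 12

# Chi-square statistic: sum of (Oi - Ei)^2 / Ei.
chi_square = sum((o - expected_each) ** 2 / expected_each for o in observed)

k = len(observed) - 1 - 2        # n - 1 - m with m = 2
critical = 16.92                 # table value quoted in the text for k = 9, 0.95

print(round(chi_square, 3), k, chi_square > critical)
```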
Example 2: Goodness of fit test for Poisson Distribution
The number of arrivals per minute was recorded at a bank located in the central business district of a city. The actual arrivals per minute were observed in 200 one-minute periods over the course of a week. The results are summarized in the table below. Test whether the data follow a Poisson distribution at α = 5%. Every expected frequency should be at least 1.
ARRIVALS | FREQUENCY |
---|---|
0 | 14 |
1 | 31 |
2 | 47 |
3 | 41 |
4 | 29 |
5 | 21 |
6 | 10 |
7 | 5 |
8 | 2 |
9 or more | 0 |
Answer:
Step 1: Stating Hypothesis
H0: The number of arrivals per minute follows a Poisson distribution
H1: The number of arrivals per minute does not follow a Poisson distribution
Step 2: Criteria to reject null hypothesis:
If χ² > χ²(k, 1−α), reject the null hypothesis.
Step 3: Analyze sample data:
The Poisson distribution has one parameter, its mean λ, which can be estimated from the given data using the formula below:
X-bar = ∑(fi * mi)/∑fi, where mi is the number of arrivals and fi is its observed frequency
Computing this value from the data gives λ = X-bar = 580/200 = 2.90
ARRIVALS | FREQUENCY | fi*mi |
---|---|---|
0 | 14 | 0 |
1 | 31 | 31 |
2 | 47 | 94 |
3 | 41 | 123 |
4 | 29 | 116 |
5 | 21 | 105 |
6 | 10 | 60 |
7 | 5 | 35 |
8 | 2 | 16 |
9 or more | 0 | 0 |
Total | 200 | 580 |
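The estimate of λ is just a frequency-weighted mean, which a small Python sketch makes explicit. The list names are illustrative; the "9 or more" class is entered as 9 with frequency 0, so it does not affect the result.

```python
arrivals = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]          # mi ("9 or more" entered as 9)
frequency = [14, 31, 47, 41, 29, 21, 10, 5, 2, 0]  # fi

total_periods = sum(frequency)                                   # sum of fi
total_arrivals = sum(m * f for m, f in zip(arrivals, frequency)) # sum of fi * mi

lam = total_arrivals / total_periods   # X-bar, used as the estimate of lambda
print(total_periods, total_arrivals, lam)
```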
Using λ = 2.9, find the probability of X arrivals (X = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 or more) from the Poisson distribution table, or compute it with the formula:
P(X = x) = (e^(−λ) · λ^x)/x!
The theoretical frequency for each category is obtained by multiplying the corresponding Poisson probability by the sample size n = 200 (given in the example). These results are summarized in the table below:
ARRIVALS | FREQUENCY | PROBABILITY P(X) FOR POISSON DISTRIBUTION WITH λ = 2.9 | THEORETICAL FREQUENCY = n*P(X) |
---|---|---|---|
0 | 14 | 0.055 | 11 |
1 | 31 | 0.1596 | 31.92 |
2 | 47 | 0.2314 | 46.28 |
3 | 41 | 0.2237 | 44.74 |
4 | 29 | 0.1622 | 32.44 |
5 | 21 | 0.094 | 18.8 |
6 | 10 | 0.0455 | 9.1 |
7 | 5 | 0.0188 | 3.76 |
8 | 2 | 0.0068 | 1.36 |
9 or more | 0 | 0.003 | 0.6 |
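The probabilities in this table can be reproduced without a printed Poisson table. The sketch below uses only Python's standard `math` module. The helper name `poisson_pmf` is illustrative, and the "9 or more" probability is taken as whatever remains after the first nine classes. Because the table rounds probabilities to four decimals before multiplying by n, the frequencies printed here can differ slightly in the last digits.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam, n = 2.9, 200

probs = [poisson_pmf(x, lam) for x in range(9)]   # X = 0 .. 8
probs.append(1.0 - sum(probs))                    # "9 or more": remaining tail

expected = [n * p for p in probs]                 # theoretical frequencies

labels = [str(x) for x in range(9)] + ["9 or more"]
for label, p, e in zip(labels, probs, expected):
    print(f"{label:>9} | {p:.4f} | {e:.2f}")
```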
Observe from the table above that the theoretical frequency of 9 or more arrivals is less than 1.0.
So that every category has a theoretical frequency of 1.0 or greater, the category 9 or more is combined with the category of 8 arrivals, as below:
ARRIVALS | FREQUENCY | PROBABILITY P(X) FOR POISSON DISTRIBUTION WITH λ = 2.9 | THEORETICAL FREQUENCY = n*P(X) |
---|---|---|---|
0 | 14 | 0.055 | 11 |
1 | 31 | 0.1596 | 31.92 |
2 | 47 | 0.2314 | 46.28 |
3 | 41 | 0.2237 | 44.74 |
4 | 29 | 0.1622 | 32.44 |
5 | 21 | 0.094 | 18.8 |
6 | 10 | 0.0455 | 9.1 |
7 | 5 | 0.0188 | 3.76 |
8 or more | 2 | 0.0098 | 1.96 |
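Combining the sparse tail class with its neighbour can be written as a small helper. This is a sketch under the rule used here (every theoretical frequency at least 1.0). The function name `pool_small_tail` is hypothetical, and the inputs are the rounded values from the table before pooling.

```python
def pool_small_tail(observed, expected, minimum=1.0):
    """Merge the last class into the one before it while its expected count is too small."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        last_obs, last_exp = obs.pop(), exp.pop()
        obs[-1] += last_obs
        exp[-1] += last_exp
    return obs, exp


observed = [14, 31, 47, 41, 29, 21, 10, 5, 2, 0]
expected = [11, 31.92, 46.28, 44.74, 32.44, 18.8, 9.1, 3.76, 1.36, 0.6]

pooled_obs, pooled_exp = pool_small_tail(observed, expected)
print(len(pooled_exp), pooled_obs[-1], round(pooled_exp[-1], 2))
```

After pooling, the last class stands for "8 or more" arrivals, with observed frequency 2 + 0 and theoretical frequency 1.36 + 0.6.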
Now we apply the chi-square test to determine whether the data follow a Poisson distribution. The statistic is computed using the formula below:
χ² = ∑(Oi − Ei)²/Ei, where Oi is the observed frequency and Ei is the expected frequency
k (degrees of freedom) = n − 1 − m = 9 − 1 − 1 = 7
Here n is 9 because there are 9 categories of arrivals (0 to 8) after combining 9 or more with 8 so that all theoretical frequencies are at least 1.
And m is 1 because the Poisson distribution has only one parameter, λ.
ARRIVALS | FREQUENCY (Observed) | PROBABILITY P(X) FOR POISSON DISTRIBUTION WITH λ = 2.9 | THEORETICAL FREQUENCY = n*P(X) | Oi - Ei | (Oi - Ei)^2 | (Oi - Ei)^2/Ei |
---|---|---|---|---|---|---|
0 | 14 | 0.055 | 11 | 3.00 | 9 | 0.818181818 |
1 | 31 | 0.1596 | 31.92 | -0.92 | 0.8464 | 0.026516291 |
2 | 47 | 0.2314 | 46.28 | 0.72 | 0.5184 | 0.011201383 |
3 | 41 | 0.2237 | 44.74 | -3.74 | 13.9876 | 0.312641931 |
4 | 29 | 0.1622 | 32.44 | -3.44 | 11.8336 | 0.364784217 |
5 | 21 | 0.094 | 18.8 | 2.20 | 4.84 | 0.257446809 |
6 | 10 | 0.0455 | 9.1 | 0.90 | 0.81 | 0.089010989 |
7 | 5 | 0.0188 | 3.76 | 1.24 | 1.5376 | 0.40893617 |
8 or more | 2 | 0.0098 | 1.96 | 0.04 | 0.0016 | 0.000816327 |
Total | | | | | | 2.28954 |
Now look up χ²(k, 1−α) in the chi-square table, where k = 9 − 1 − 1 = 7 and 1 − α = 0.95. The value is 14.07 (the table will be provided in the exam).
Step 4: Interpret the results
Since χ² = 2.28954 < 14.07, we do not reject H0.
There is insufficient evidence to conclude that the arrivals per minute do not follow a Poisson distribution.
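The final computation for Example 2 can be verified with the pooled table values. This is a minimal sketch: the theoretical frequencies are the rounded figures from the table (the last class being 8 and "9 or more" combined), and 14.07 is the table value quoted above.

```python
observed = [14, 31, 47, 41, 29, 21, 10, 5, 2]
expected = [11, 31.92, 46.28, 44.74, 32.44, 18.8, 9.1, 3.76, 1.96]

# Chi-square statistic: sum of (Oi - Ei)^2 / Ei over the 9 classes.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

k = len(observed) - 1 - 1     # n - 1 - m with m = 1 (lambda)
critical = 14.07              # table value quoted in the text for k = 7, 0.95

print(round(chi_square, 5), k, chi_square > critical)
```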
Example 3:
The manager of a computer network has collected data on the number of times that service has been interrupted on each day over the past 500 days. The results are as follows:
INTERRUPTIONS PER DAY | NUMBER OF DAYS |
---|---|
0 | 160 |
1 | 175 |
2 | 86 |
3 | 41 |
4 | 18 |
5 | 12 |
6 | 8 |
Total | 500 |
Does the distribution of service interruptions follow a Poisson distribution? (Use the 0.01 level of significance.)
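One way to set up Example 3 is to repeat the procedure of Example 2. The sketch below is not an official answer and rests on stated assumptions: the last class is treated as "6 or more" so that the probabilities sum to 1, classes are pooled only if a theoretical frequency falls below 1, and the critical value 15.086 is the standard chi-square table entry for k = 5 and 1 − α = 0.99, which the text does not supply.

```python
import math

interruptions = [0, 1, 2, 3, 4, 5, 6]
days = [160, 175, 86, 41, 18, 12, 8]

n = sum(days)
lam = sum(x * f for x, f in zip(interruptions, days)) / n   # estimate of lambda

# Poisson probabilities for 0..5, with the last class taken as "6 or more" (assumption).
probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(6)]
probs.append(1.0 - sum(probs))
expected = [n * p for p in probs]

chi_square = sum((o - e) ** 2 / e for o, e in zip(days, expected))

k = len(days) - 1 - 1          # n - 1 - m with m = 1
critical = 15.086              # standard table value, k = 5, 1 - alpha = 0.99

print(lam, [round(e, 2) for e in expected])
print(round(chi_square, 2), k, chi_square > critical)
```

Under these assumptions every theoretical frequency stays above 1, so no pooling is needed, and the statistic comes out well above the table value, which points towards rejecting H0. A different pooling rule would change the number of classes and the degrees of freedom.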
Example 4:
A random sample of 500 long distance telephone calls revealed the following distribution of call length (in minutes):
Length in Minutes | Frequency |
---|---|
0–under 5 | 48 |
5–under 10 | 84 |
10–under 15 | 164 |
15–under 20 | 126 |
20–under 25 | 50 |
25–under 30 | 28 |
Total | 500 |
At the 0.05 level of significance, does call length follow a normal distribution?
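For Example 4 the expected frequencies have to come from a fitted normal curve. The sketch below shows one possible setup, not a definitive solution. It assumes that μ and σ are estimated from the class midpoints, that the sample standard deviation (divisor n − 1) is used, and that the first and last classes are treated as open-ended so the class probabilities sum to 1. The critical value 7.815 is the standard chi-square table entry for k = 3 and 1 − α = 0.95, and is not given in the text.

```python
import math

upper_limits = [5, 10, 15, 20, 25, 30]
midpoints = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5]
freq = [48, 84, 164, 126, 50, 28]

n = sum(freq)
mean = sum(f * m for f, m in zip(freq, midpoints)) / n
variance = sum(f * (m - mean) ** 2 for f, m in zip(freq, midpoints)) / (n - 1)
sd = math.sqrt(variance)


def normal_cdf(x, mu, sigma):
    """Cumulative normal probability, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Cumulative probabilities at the interior class boundaries 5, 10, 15, 20, 25.
cdf = [normal_cdf(u, mean, sd) for u in upper_limits[:-1]]

# First class covers everything below 5, last class everything from 25 up (assumption).
probs = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))] + [1.0 - cdf[-1]]
expected = [n * p for p in probs]

chi_square = sum((o - e) ** 2 / e for o, e in zip(freq, expected))

k = len(freq) - 1 - 2          # n - 1 - m with m = 2 (mu, sigma)
critical = 7.815               # standard table value, k = 3, 1 - alpha = 0.95

print(round(mean, 2), round(sd, 2))
print([round(e, 1) for e in expected])
print(round(chi_square, 2), k, chi_square > critical)
```

The outcome is sensitive to these choices, so the printed comparison should be read together with the assumptions listed above.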