Consider random counts that correspond to cells, so that every cell gets some count. For example, the counts could be numbers of failures or successes, and a cell might correspond to failures at a particular plant, or successes of a particular system, or failures during some time period. For more details, see the explanations for testing poolability when estimating $p$ or estimating $\lambda$. Let $n_i$ denote the count for cell $i$. A hypothesis, $H_0$, is to be tested, such as the hypothesis that all the failures on demand have the same probability or that all the events in time occur with the same frequency. These two example hypotheses each involve an unknown parameter, the failure probability and the event frequency, respectively. In general, the hypothesis to be tested typically involves one or more unknown parameters.
Estimate the unknown parameter(s) from the count data, and calculate the expected count for each cell if the hypothesis is true. Denote the expected count for cell $i$ by $e_i$. The difference between the observed count and the expected count for any cell is $n_i - e_i$. There are many cells, and therefore many ways of combining the differences to yield an overall number. One useful way is to construct

$$X^2 = \sum_i \frac{(n_i - e_i)^2}{e_i},$$

called the chi-squared statistic, or sometimes the Pearson chi-squared statistic after its inventor, Karl Pearson. If the observed counts $n_i$ are far from the expected values $e_i$, the evidence against $H_0$ is strong. Therefore, $H_0$ should be rejected if $X^2$ is large, and not rejected if $X^2$ is small.

How large or small must $X^2$ be? As the counts become large, $X^2$ has approximately a chi-squared distribution. To calculate the degrees of freedom, treat any demand counts as nonrandom. The degrees of freedom is the number of independent counts, minus the number of estimated parameters under $H_0$. Reject $H_0$ if $X^2$ is in the right tail of the chi-squared distribution, for example beyond the 95th percentile.
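As a concrete illustration, the sketch below (Python with NumPy and SciPy) computes the chi-squared statistic from a set of hypothetical observed and expected counts and compares it with a chi-squared percentile. The counts, the single estimated parameter, and the 95th-percentile cutoff are illustrative assumptions, not RADS output.

```python
# Sketch: Pearson chi-squared statistic for a set of cells (hypothetical data).
import numpy as np
from scipy.stats import chi2

observed = np.array([12, 7, 15, 9], dtype=float)    # n_i, one count per cell
expected = np.array([10.75, 10.75, 10.75, 10.75])   # e_i under H0 (here, a common mean)

# X^2 = sum over cells of (n_i - e_i)^2 / e_i
x2 = np.sum((observed - expected) ** 2 / expected)

# Degrees of freedom: independent counts minus parameters estimated under H0.
# Here one parameter (the common mean) was estimated from 4 counts.
dof = len(observed) - 1

cutoff = chi2.ppf(0.95, dof)   # 95th percentile of the chi-squared distribution
p_value = chi2.sf(x2, dof)     # right-tail probability of the observed X^2

print(f"X^2 = {x2:.2f}, 95th percentile = {cutoff:.2f}, p-value = {p_value:.3f}")
```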
The chi-squared approximation is valid if the data set is large. RADS gives a mild warning if any of the expected values $e_i$ are less than 1.0, and a stronger warning if any of the $e_i$ values are less than 0.5. These warnings agree with standard published rules of thumb.
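These rules of thumb translate directly into a simple check. The function below is an illustrative sketch; its name and messages are assumptions, not the actual RADS warnings.

```python
# Sketch: flag small expected counts using the stated rules of thumb
# (mild warning below 1.0, stronger warning below 0.5).
def check_expected_counts(expected):
    if min(expected) < 0.5:
        return "strong warning: some expected counts are below 0.5"
    if min(expected) < 1.0:
        return "mild warning: some expected counts are below 1.0"
    return "no warning: chi-squared approximation should be adequate"

print(check_expected_counts([10.75, 10.75, 10.75, 10.75]))
print(check_expected_counts([0.8, 3.0, 5.2]))
```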
This test for poolability does not use any ordering of the cells. Thus, tests that are specifically designed to detect trends may be better suited to analyzing differences between time periods.
In the typical RADS application, two attributes of any event are (a) whether it is a failure or success and (b) the source of the data: the plant, the system, the year, or the type of demand. RADS constructs a table with two columns and $R$ rows. The columns correspond to failures and successes, respectively, and the $R$ rows correspond to the $R$ sources of data.
Denote the count in the $i$th row and $j$th column by $n_{ij}$, for $i$ any number from 1 to $R$ and $j$ equal to 1 or 2. The number of failures for row $i$ is $n_{i1}$ and the number of successes is $n_{i2}$. The number of demands is $n_{i+} = n_{i1} + n_{i2}$. Similarly, let $n_{+1}$ be the total number of failures in all rows, let $n_{+2}$ be the total number of successes, and let $n_{++}$ be the grand total number of demands.

The hypothesis of poolability is that all the rows correspond to the same probability of failure on demand. The natural estimate of this probability is $\hat{p} = n_{+1}/n_{++}$. If the data sources can be pooled, the expected number of failures for row $i$ is $e_{i1} = n_{i+}\hat{p}$, and the expected number of successes is $e_{i2} = n_{i+}(1 - \hat{p})$. The Pearson chi-squared statistic is defined as

$$X^2 = \sum_{i=1}^{R} \sum_{j=1}^{2} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}.$$
If the observed counts are far from the expected counts, the evidence against poolability is strong. Observed counts that are far from the expected counts cause $X^2$ to be large. Therefore, poolability is rejected when $X^2$ is large. When the sample size is large, $X^2$ has approximately a chi-squared distribution with $R - 1$ degrees of freedom. So poolability is rejected with p-value 0.05 if $X^2$ is larger than the 95th percentile of the chi-squared distribution, with p-value 0.01 if $X^2$ is larger than the 99th percentile, and so forth.
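A minimal sketch of this calculation, assuming hypothetical failure and demand counts for $R = 3$ data sources, is shown below. It follows the formulas above and is not the RADS implementation.

```python
# Sketch: chi-squared test of poolability for failures on demand.
# failures[i] and demands[i] are hypothetical counts for data source i.
import numpy as np
from scipy.stats import chi2

failures = np.array([3, 10, 4], dtype=float)
demands = np.array([150, 180, 160], dtype=float)
successes = demands - failures

R = len(failures)
p_hat = failures.sum() / demands.sum()   # pooled estimate of p

# Expected failures and successes for each row under poolability.
exp_fail = demands * p_hat
exp_succ = demands * (1.0 - p_hat)

# X^2 sums (observed - expected)^2 / expected over both columns of every row.
x2 = np.sum((failures - exp_fail) ** 2 / exp_fail
            + (successes - exp_succ) ** 2 / exp_succ)

dof = R - 1
p_value = chi2.sf(x2, dof)               # right-tail p-value
print(f"X^2 = {x2:.2f} on {dof} df, p-value = {p_value:.3f}")
```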
The contribution to $X^2$ from row $i$ is $\sum_{j=1}^{2} (n_{ij} - e_{ij})^2 / e_{ij}$. Inspection of these terms shows which rows deviate most from the overall average, in the sense of the chi-squared test. They can help show in what ways poolability is violated. In RADS the signed square root of this term is shown, with a positive sign if the number of failures is higher than expected and a negative sign if the number of failures is lower than expected. These signed square roots are called the "Pearson residuals."
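The row contributions and their signed square roots can be computed as in the sketch below, which repeats the same hypothetical counts as the previous example so that it runs on its own.

```python
# Sketch: per-row contributions to X^2 and their signed square roots
# ("Pearson residuals"), using hypothetical failure/demand counts.
import numpy as np

failures = np.array([3, 10, 4], dtype=float)
demands = np.array([150, 180, 160], dtype=float)
successes = demands - failures

p_hat = failures.sum() / demands.sum()
exp_fail = demands * p_hat
exp_succ = demands * (1.0 - p_hat)

# Contribution of each row: sum over its two cells of (obs - exp)^2 / exp.
contrib = ((failures - exp_fail) ** 2 / exp_fail
           + (successes - exp_succ) ** 2 / exp_succ)

# Signed square root: positive where failures exceed expectation, negative otherwise.
residuals = np.sign(failures - exp_fail) * np.sqrt(contrib)
for i, r in enumerate(residuals, start=1):
    print(f"row {i}: Pearson residual = {r:+.2f}")
```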
If the data set is small (few failures or few successes), the chi-squared approximation is not very good. RADS gives a mild warning if any of the expected values $e_{ij}$ are less than 1.0, and a stronger warning if any of the $e_{ij}$ values are less than 0.5. These warnings agree with standard published rules of thumb.
For more information, see Atwood (1994) (Ref [1]).
In the typical RADS application, various sources of the data (plants, systems, years, or types of discovery activity) each correspond to a count of events (such as initiating events or system demands) and an exposure time, such as calendar years or critical hours. RADS constructs a table with $C$ cells, each cell corresponding to one source of data. The $i$th cell contains a count, $n_i$, the number of events reported for that source of data during exposure time $t_i$. Denote the total number of events in all the data by $n = \sum_i n_i$, and denote the total exposure time by $t = \sum_i t_i$.
The hypothesis of poolability is that all the cells correspond to the same event frequency. The natural estimate of this frequency is $\hat{\lambda} = n/t$. If the data sources can be pooled, the expected number of events for cell $i$ is $e_i = \hat{\lambda} t_i$. The Pearson chi-squared statistic is defined as

$$X^2 = \sum_{i=1}^{C} \frac{(n_i - e_i)^2}{e_i}.$$
If the observed counts are far from the expected counts, the evidence against poolability is strong. Observed counts that are far from the expected counts cause $X^2$ to be large. Therefore, poolability is rejected when $X^2$ is large. When the sample size is large, $X^2$ has approximately a chi-squared distribution with $C - 1$ degrees of freedom. So poolability is rejected with p-value 0.05 if $X^2$ is larger than the 95th percentile of the chi-squared distribution, with p-value 0.01 if $X^2$ is larger than the 99th percentile, and so forth.
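The sketch below carries out this calculation for hypothetical event counts and exposure times for $C = 4$ cells; it follows the formulas above rather than the RADS code itself.

```python
# Sketch: chi-squared test of poolability for event frequencies.
# events[i] and exposure[i] (e.g., reactor-critical-years) are hypothetical.
import numpy as np
from scipy.stats import chi2

events = np.array([5, 2, 9, 4], dtype=float)
exposure = np.array([3.2, 2.9, 3.5, 3.1], dtype=float)   # exposure times t_i

C = len(events)
lam_hat = events.sum() / exposure.sum()   # pooled frequency estimate, lambda-hat = n / t

expected = lam_hat * exposure             # e_i = lambda-hat * t_i

x2 = np.sum((events - expected) ** 2 / expected)
dof = C - 1
p_value = chi2.sf(x2, dof)
print(f"X^2 = {x2:.2f} on {dof} df, p-value = {p_value:.3f}")
```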
The contribution to $X^2$ from cell $i$ is $(n_i - e_i)^2 / e_i$. Inspection of these terms shows which cells deviate most from the overall average, in the sense of the chi-squared test. They can help show in what ways poolability is violated. RADS shows the signed square root of each term, with a positive sign if the number of events is higher than expected and a negative sign if it is lower. These are called the "Pearson residuals."
If the data set is small (few events), the chi-squared approximation is not very good. RADS gives a mild warning if any of the expected values $e_i$ are less than 1.0, and a stronger warning if any of the $e_i$ values are less than 0.5. These warnings agree with standard published rules of thumb.
For more information, see Engelhardt (1994) (Ref [2]).