Pooling

Consider random counts that correspond to cells, so that every cell gets some count. For example, the counts could be numbers of failures or successes, and a cell might correspond to failures at a particular plant, or successes of a particular system, or failures during some time period. For more details, see the explanations of testing poolability when estimating $p$ or estimating $\lambda$. Let $n_i$ denote the count for cell $i$. A hypothesis, $H_0$, is to be tested, such as the hypothesis that all the failures on demand have the same probability, or that all the events in time occur with the same frequency. These two example hypotheses each involve an unknown parameter: the failure probability and the event frequency, respectively. In general, the hypothesis to be tested typically involves one or more unknown parameters.

Estimate the unknown parameter(s) from the count data, and calculate the expected count for each cell if the hypothesis is true. Denote the expected count for cell $i$ by $e_i$. The difference between the observed count and the expected count for any cell is $n_i - e_i$. There are many cells, and therefore many ways of combining the differences to yield an overall number. One useful way is to construct

$$X^2 = \sum_i \frac{(n_i - e_i)^2}{e_i},$$

called the chi-squared statistic, or sometimes the Pearson chi-squared statistic after its inventor, Karl Pearson. If the observed counts $n_i$ are far from the expected values $e_i$, the evidence against $H_0$ is strong. Therefore, $H_0$ should be rejected if $X^2$ is large, and not rejected if $X^2$ is small.

How large must $X^2$ be before $H_0$ is rejected? As the counts become large, $X^2$ has approximately a chi-squared distribution. To calculate the degrees of freedom, treat any demand counts as nonrandom. The degrees of freedom equal the number of independent counts minus the number of parameters estimated under $H_0$. Reject $H_0$ if $X^2$ falls in the right tail of the chi-squared distribution, for example beyond the 95th percentile.

The chi-squared approximation is valid if the data set is large. RADS gives a mild warning if any of the expected values are less than 1.0, and a stronger warning if any of the values are less than 0.5. These warnings agree with standard published rules of thumb.
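As a concrete illustration, here is a minimal sketch of this general test in Python, with made-up counts and expected values standing in for real data; it is not RADS code, and the warning checks simply restate the rules of thumb above.

```python
import numpy as np
from scipy.stats import chi2

n = np.array([12, 7, 3, 9])         # observed counts n_i (hypothetical)
e = np.array([9.0, 8.0, 6.0, 8.0])  # expected counts e_i under H0 (hypothetical)

x2 = np.sum((n - e) ** 2 / e)       # Pearson chi-squared statistic X^2

# Degrees of freedom: number of independent counts minus the number of
# parameters estimated under H0 (here, 4 counts and 1 estimated parameter).
df = len(n) - 1

p_value = chi2.sf(x2, df)           # right-tail area beyond X^2
reject = x2 > chi2.ppf(0.95, df)    # reject H0 beyond the 95th percentile

# The warnings described above:
if np.any(e < 0.5):
    print("strong warning: an expected count is below 0.5")
elif np.any(e < 1.0):
    print("mild warning: an expected count is below 1.0")
```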

This test for poolability does not use any ordering of the cells. Therefore, when the cells correspond to successive time periods, tests that are specifically designed to detect trends may be better suited to analyzing differences between those periods.

Testing for Homogeneity (Poolability) When Estimating p

In the typical RADS application, two attributes of any event are (a) whether it is a failure or a success and (b) the source of the data, such as the plant, the system, the year, or the type of demand. RADS constructs a table with two columns and $R$ rows. The columns correspond to failures and successes, respectively, and the $R$ rows correspond to the $R$ sources of data. Denote the count in the $i$th row and $j$th column by $n_{ij}$, for $i$ any number from 1 to $R$ and $j$ equal to 1 or 2. The number of failures for row $i$ is $n_{i1}$ and the number of successes is $n_{i2}$. The number of demands is $n_{i\cdot} = n_{i1} + n_{i2}$. Similarly, let $n_{\cdot 1}$ be the total number of failures in all rows, let $n_{\cdot 2}$ be the total number of successes, and let $n_{\cdot\cdot}$ be the grand total number of demands.

The hypothesis of poolability is that all the rows correspond to the same probability of failure on demand. The natural estimate of this probability is $\hat{p} = n_{\cdot 1}/n_{\cdot\cdot}$. If the data sources can be pooled, the expected number of failures for row $i$ is $e_{i1} = n_{i\cdot}\hat{p}$, and the expected number of successes is $e_{i2} = n_{i\cdot}(1 - \hat{p})$. The Pearson chi-squared statistic is defined as

$$X^2 = \sum_{i=1}^{R} \sum_{j=1}^{2} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}.$$

If the observed counts are far from the expected counts, the evidence against poolability is strong, and such counts make $X^2$ large. Therefore, poolability is rejected when $X^2$ is large. When the sample size is large, $X^2$ has approximately a chi-squared distribution with $R - 1$ degrees of freedom. So poolability is rejected with p-value 0.05 if $X^2$ is larger than the 95th percentile of the chi-squared distribution, with p-value 0.01 if $X^2$ is larger than the 99th percentile, and so forth.
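For illustration, a minimal sketch of this computation in Python, using hypothetical failure and success counts for three data sources; it is not RADS code.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical counts for R = 3 data sources.
failures = np.array([4, 1, 7])        # n_i1
successes = np.array([96, 49, 93])    # n_i2
demands = failures + successes        # row totals n_i.

p_hat = failures.sum() / demands.sum()   # pooled estimate of p

e_fail = demands * p_hat                 # expected failures for row i
e_succ = demands * (1.0 - p_hat)         # expected successes for row i

x2 = np.sum((failures - e_fail) ** 2 / e_fail
            + (successes - e_succ) ** 2 / e_succ)

df = len(demands) - 1                 # R - 1 degrees of freedom
p_value = chi2.sf(x2, df)             # small p-value => reject poolability
```

For comparison, `scipy.stats.chi2_contingency` applied to the same two-column table (for example, `np.column_stack([failures, successes])`) should give the same statistic and degrees of freedom.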

The contribution to $X^2$ from row $i$ is

$$\frac{(n_{i1} - e_{i1})^2}{e_{i1}} + \frac{(n_{i2} - e_{i2})^2}{e_{i2}}.$$

Inspection of these terms shows which rows deviate most from the overall average, in the sense of the chi-squared test, and can help show in what ways poolability is violated. RADS shows the signed square root of each term, with a positive sign if the number of failures is higher than expected and a negative sign if it is lower. These signed square roots are called the "Pearson residuals."
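Continuing the sketch above, the per-row contributions and their signed square roots might be computed as follows.

```python
# Per-row contributions to X^2 and the corresponding Pearson residuals.
contrib = ((failures - e_fail) ** 2 / e_fail
           + (successes - e_succ) ** 2 / e_succ)

# Positive when a row has more failures than expected, negative when fewer.
residuals = np.sign(failures - e_fail) * np.sqrt(contrib)
```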

If the data set is small (few failures or few successes) the chi-squared approximation is not very good. RADS gives a mild warning if any of the expected values are less than 1.0, and a stronger warning if any of the values are less than 0.5. These warnings agree with standard published rules of thumb.

For more information, see Atwood (1994) (Ref [1]).

Testing for Homogeneity (Poolability) When Estimating Lambda

In the typical RADS application, various sources of the data, such as plants, systems, years, or types of discovery activity, each correspond to a count of events (such as initiating events or system demands) and an exposure time, such as calendar years or critical hours. RADS constructs a table with $C$ cells, each cell corresponding to one source of data. The $i$th cell contains a count $n_i$, the number of events reported for that source of data during exposure time $t_i$. Denote the total number of events in all the data by $n = \sum_i n_i$, and denote the total exposure time by $t = \sum_i t_i$.

The hypothesis of poolability is that all the cells correspond to the same event frequency. The natural estimate of this frequency is $\hat{\lambda} = n/t$. If the data sources can be pooled, the expected number of events for cell $i$ is $e_i = \hat{\lambda} t_i$. The Pearson chi-squared statistic is defined as

$$X^2 = \sum_{i=1}^{C} \frac{(n_i - e_i)^2}{e_i}.$$

If the observed counts are far from the expected counts, the evidence against poolability is strong, and such counts make $X^2$ large. Therefore, poolability is rejected when $X^2$ is large. When the sample size is large, $X^2$ has approximately a chi-squared distribution with $C - 1$ degrees of freedom. So poolability is rejected with p-value 0.05 if $X^2$ is larger than the 95th percentile of the chi-squared distribution, with p-value 0.01 if $X^2$ is larger than the 99th percentile, and so forth.
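As before, a minimal sketch in Python with hypothetical event counts and exposure times; it is not RADS code.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical data for C = 4 cells (sources of data).
events = np.array([5, 2, 9, 4])             # n_i
exposure = np.array([3.0, 2.5, 4.0, 3.5])   # t_i, e.g. critical-years

lam_hat = events.sum() / exposure.sum()     # pooled frequency estimate
expected = lam_hat * exposure               # e_i = lambda_hat * t_i

x2 = np.sum((events - expected) ** 2 / expected)

df = len(events) - 1                        # C - 1 degrees of freedom
p_value = chi2.sf(x2, df)                   # small p-value => reject poolability
```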

The contribution to $X^2$ from cell $i$ is

$$\frac{(n_i - e_i)^2}{e_i}.$$

Inspection of these terms shows which cells deviate most from the overall average, in the sense of the chi-squared test, and can help show in what ways poolability is violated. RADS shows the signed square root of each term, with a positive sign if the number of events is higher than expected and a negative sign if it is lower. These signed square roots are called the "Pearson residuals."
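Continuing the sketch above, the Pearson residuals for this case reduce to a simple expression.

```python
# Signed square root of each cell's contribution to X^2; for these terms
# this is exactly (n_i - e_i) / sqrt(e_i).
residuals = (events - expected) / np.sqrt(expected)
```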

If the data set is small (few events) the chi-squared approximation is not very good. RADS gives a mild warning if any of the expected values are less than 1.0, and a stronger warning if any of the values are less than 0.5. These warnings agree with standard published rules of thumb.

For more information, see Engelhardt (1994) (Ref [2]).

[1]. Atwood, Corwin L., 1994, Hits per Trial: Basic Analysis of Binomial Data, EGG-RAAM-11041.

[2]. Engelhardt, M. E., 1994, Events in Time: Basic Analysis of Poisson Data, EGG-RAAM-11088.