Empirical Bayes Model for p (Probability)

Suppose that the data for failures on demand have been partitioned into distinct data sources, such as distinct plants or systems. Each data source corresponds to an observed number of demands, say n_i demands for the ith data source. The number of failures for that data source is assumed to be a binomial(n_i, p) random variable. For concreteness, the model is explained here assuming that the distinct data sources are plants.

Suppose that p is believed to vary. For example, the chi-square test may have rejected the hypothesis of a constant p, or engineering considerations may suggest that p is not constant. What model should be used?

One approach is to analyze each plant separately, resulting in a separate estimate of p for each plant. This is reasonable if there are only two plants. It might also be reasonable if one plant is clearly different from all the others. In this case, the one plant could be analyzed separately, and all the other plants might be considered as a single homogeneous source. After this regrouping of the data, only two data sources exist, to be analyzed separately.

Often, however, removal of one outlier leaves a data set that contains another outlier, and removal of that may leave a set with still another outlier. In such cases it is simpler (that is, it involves fewer parameters to estimate) to use the following compound model.

The method models the variation between the plants. It is assumed that p varies from plant to plant, and follows a beta(a, b) distribution. Let p_i denote the value corresponding to the ith plant. At this plant, conditional on the value of p_i, it is assumed that each demand results in a failure with probability p_i. In short, the number of failures in n demands at the ith plant is binomial(n, p_i); the number of failures in n demands at a single random plant has a beta-binomial(n, a, b) distribution. The value of n is known, and the two parameters a and b are unknown.
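As an illustration of the compound model, the beta-binomial probabilities can be computed directly from log-gamma functions. The sketch below is not taken from RADS; the function names and the example values of n, a, and b are assumptions chosen only to show the shape of the calculation.

```python
from math import lgamma, exp

def log_beta(u, v):
    """Logarithm of the beta function B(u, v)."""
    return lgamma(u) + lgamma(v) - lgamma(u + v)

def beta_binomial_pmf(x, n, a, b):
    """P(X = x) when p ~ beta(a, b) and X given p ~ binomial(n, p)."""
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(x + a, n - x + b) - log_beta(a, b))

# Hypothetical values, for illustration only.
n, a, b = 20, 0.5, 9.5
probs = [beta_binomial_pmf(x, n, a, b) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(probs))
```

The probabilities sum to 1, and the mean number of failures equals n times the beta mean a/(a+b).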

RADS uses numerical iteration to find the maximum likelihood estimates of a and b.

The iterative procedure does not always converge. In some cases, the search for the maximum likelihood leads to a/(a+b) stabilizing at a finite value but a+b diverging to infinity. RADS stops the numerical search whenever a+b appears to be larger than the total number of trials in the data set; such a large value for a+b should not be used because it would result in plant-specific intervals shorter than the interval based on simply pooling all the data. In this case RADS states that the empirical Bayes distribution is degenerate, concentrated at a single point.
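The search and its stopping rule can be sketched as follows. This is not the RADS algorithm, whose details are not given here; it is a crude shrinking-grid search over m = a/(a+b) and k = a+b, written only to show how a cap on a+b at the total number of demands leads to a "degenerate" verdict. The function names, grid size, and bounds are all assumptions.

```python
from math import lgamma, log, exp

def log_lik(a, b, data):
    """Beta-binomial log-likelihood for data = [(x_i, n_i), ...].
    Binomial coefficients are omitted; they do not involve a or b."""
    return sum(
        lgamma(x + a) + lgamma(n - x + b) - lgamma(n + a + b)
        - lgamma(a) - lgamma(b) + lgamma(a + b)
        for x, n in data
    )

def fit_beta_binomial(data, rounds=40, grid=9):
    """Return (a, b, degenerate) from a shrinking-grid search.

    The search over k = a + b is capped at the total number of demands.
    Ending at that cap is reported as a degenerate fit.
    """
    n_total = sum(n for _, n in data)
    m_min, m_max = 1e-6, 1.0 - 1e-6
    lk_min, lk_max = log(1e-3), log(n_total)
    m_lo, m_hi, lk_lo, lk_hi = m_min, m_max, lk_min, lk_max
    for _ in range(rounds):
        best = None
        for i in range(grid):
            m = m_lo + (m_hi - m_lo) * i / (grid - 1)
            for j in range(grid):
                lk = lk_lo + (lk_hi - lk_lo) * j / (grid - 1)
                k = exp(lk)
                cand = (log_lik(m * k, (1.0 - m) * k, data), m, lk)
                if best is None or cand > best:
                    best = cand
        _, m, lk = best
        # Shrink the search box around the best point, within the bounds.
        dm, dk = (m_hi - m_lo) / 3.0, (lk_hi - lk_lo) / 3.0
        m_lo, m_hi = max(m_min, m - dm), min(m_max, m + dm)
        lk_lo, lk_hi = max(lk_min, lk - dk), min(lk_max, lk + dk)
    k = exp(lk)
    degenerate = lk >= lk_max - 1e-6
    return m * k, (1.0 - m) * k, degenerate

# Hypothetical data: (failures, demands) per plant.
varied = [(0, 50), (10, 50), (1, 50), (25, 50), (3, 50)]
uniform = [(2, 20)] * 5
a_v, b_v, degenerate_v = fit_beta_binomial(varied)
a_u, b_u, degenerate_u = fit_beta_binomial(uniform)
```

With the widely varying plants the search ends at finite a and b. With identical plants the likelihood keeps increasing as a+b grows, so the search ends at the cap and the fit is flagged as degenerate.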

Suppose now that finite values of a and b have been estimated, giving an estimated overall distribution of p. We now naturally inquire about the value of p_i at the ith plant. The model says that p_i was randomly selected from the beta(a, b) distribution, and the x_i failures in n_i demands were generated from a binomial(n_i, p_i) distribution. Recall that the true values of a and b are unknown and have been replaced by their maximum likelihood estimates. The simplest approach is to act as if the estimates were the true values. For the ith data source, let the prior distribution of p_i be beta(a, b), and find the posterior distribution of p_i by updating the prior with the data; the posterior is beta(a + x_i, b + n_i - x_i). This is the usual Bayes method for estimating p_i, except that the prior parameters are estimated from all the data. This leads to the name “empirical Bayes estimation.”
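The simple plug-in update described above is the standard conjugate beta-binomial calculation and can be sketched in a few lines. The function name and the numerical values are hypothetical, and the sketch does not include the Kass-Steffey adjustment.

```python
def posterior_beta(a, b, x_i, n_i):
    """Update a beta(a, b) prior with x_i failures in n_i demands.

    Returns the posterior parameters and the posterior mean.
    """
    a_post = a + x_i
    b_post = b + (n_i - x_i)
    return a_post, b_post, a_post / (a_post + b_post)

# Hypothetical plant: prior beta(0.5, 9.5), 3 failures in 20 demands.
a_post, b_post, post_mean = posterior_beta(0.5, 9.5, 3, 20)
```

Here the posterior is beta(3.5, 26.5). Its mean, 3.5/30, lies between the prior mean 0.05 and the plant's observed fraction 3/20, with weights proportional to a+b = 10 and n_i = 20.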

More sophisticated methods account for the variability in the estimators a and b. In particular, Kass and Steffey (1989) give a simple first-order correction, which RADS applies to the beta-binomial model. This approach leaves each plant-specific mean unchanged but lengthens the interval somewhat.

RADS gives the overall distribution, beta(a, b), with its mean and a 90% interval corresponding to the 5th and 95th percentiles. RADS also gives the plant-specific means and 90% intervals. The plant-specific distributions are approximated by beta distributions, but after the Kass-Steffey adjustment the plant-specific distribution parameters do not correspond to simple updating of a single prior distribution with the plant-specific data.

Empirical Bayes Model for l (Rate)

Suppose that the data for an event frequency have been partitioned into distinct data sources, such as distinct plants or systems. Each data source corresponds to an observed exposure time, say t_i hours for the ith data source. The number of events for that data source is assumed to be a Poisson(l t_i) random variable. For concreteness, the model is explained here assuming that the distinct data sources are plants.

Suppose that l is believed to vary. For example, the chi-square test may have rejected the hypothesis of a constant l, or engineering considerations may suggest that l is not constant. What model should be used?

One approach is to analyze each plant separately, resulting in a separate estimate of l for each plant. This is reasonable if there are only two plants. It might also be reasonable if one plant is clearly different from all the others. In this case, the one plant could be analyzed separately, and all the other plants might be considered as a single homogeneous source. After this regrouping of the data, only two data sources exist, to be analyzed separately.

The empirical Bayes method models the variation between the plants. It is assumed that l varies from plant to plant, and follows a gamma(a, b) distribution, parameterized so that its mean is a/b. Let l_i denote the value corresponding to the ith plant. At this plant, it is assumed that events occur randomly with frequency l_i. In short, the number of events in time t at the ith plant is Poisson(l_i t); the number of events in time t at a single random plant has a gamma-Poisson(t, a, b) distribution. The value of t is known, and the two parameters a and b are unknown.
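As an illustration, the gamma-Poisson probabilities can be written in closed form (a negative binomial distribution). The sketch below assumes that a is the gamma shape parameter and b is the rate parameter, in units of exposure time. The function name and example values are hypothetical and not taken from RADS.

```python
from math import lgamma, log, exp

def gamma_poisson_pmf(x, t, a, b):
    """P(X = x) when the rate ~ gamma(shape a, rate b) and
    X given the rate ~ Poisson(rate * t)."""
    log_p = (lgamma(a + x) - lgamma(a) - lgamma(x + 1)
             + a * log(b / (b + t)) + x * log(t / (b + t)))
    return exp(log_p)

# Hypothetical values: shape 2, rate parameter 1000 hours, 500 hours of exposure.
t, a, b = 500.0, 2.0, 1000.0
probs = [gamma_poisson_pmf(x, t, a, b) for x in range(200)]
mean = sum(x * p for x, p in enumerate(probs))
```

The mean number of events is t times the gamma mean a/b, which is 1.0 for these values.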

RADS uses numerical iteration to find the maximum likelihood estimates of a and b.

The iterative procedure does not always converge. In some cases, the search for the maximum likelihood leads to a/b stabilizing at a finite value but b diverging to infinity. RADS stops the numerical search whenever b appears to be larger than the total exposure time in the data set; such a large value for b should not be used because it would result in plant-specific intervals shorter than the interval based on simply pooling all the data. In this case RADS states that the empirical Bayes distribution is degenerate, concentrated at a single point.
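A sketch of such a search for the gamma-Poisson model follows. It is not the RADS algorithm. It is a crude shrinking-grid search over the mean m = a/b and the parameter b, with b capped at the total exposure time so that reaching the cap is reported as degenerate. The function names, search bounds, and data are assumptions for illustration.

```python
from math import lgamma, log, exp

def log_lik(a, b, data):
    """Gamma-Poisson log-likelihood for data = [(x_i, t_i), ...],
    with gamma shape a and rate b. The x_i! terms are omitted."""
    return sum(
        lgamma(a + x) - lgamma(a) + a * log(b / (b + t)) + x * log(t / (b + t))
        for x, t in data
    )

def fit_gamma_poisson(data, rounds=40, grid=9):
    """Return (a, b, degenerate) from a shrinking-grid search.

    The search over b is capped at the total exposure time.
    Ending at that cap is reported as a degenerate fit.
    """
    x_total = sum(x for x, _ in data)
    t_total = sum(t for _, t in data)
    if x_total == 0:
        raise ValueError("this sketch needs at least one observed event")
    pooled = x_total / t_total
    lm_min, lm_max = log(pooled / 100.0), log(pooled * 100.0)
    lb_min, lb_max = log(t_total * 1e-6), log(t_total)
    lm_lo, lm_hi, lb_lo, lb_hi = lm_min, lm_max, lb_min, lb_max
    for _ in range(rounds):
        best = None
        for i in range(grid):
            lm = lm_lo + (lm_hi - lm_lo) * i / (grid - 1)
            for j in range(grid):
                lb = lb_lo + (lb_hi - lb_lo) * j / (grid - 1)
                b = exp(lb)
                cand = (log_lik(exp(lm) * b, b, data), lm, lb)
                if best is None or cand > best:
                    best = cand
        _, lm, lb = best
        # Shrink the search box around the best point, within the bounds.
        dm, db = (lm_hi - lm_lo) / 3.0, (lb_hi - lb_lo) / 3.0
        lm_lo, lm_hi = max(lm_min, lm - dm), min(lm_max, lm + dm)
        lb_lo, lb_hi = max(lb_min, lb - db), min(lb_max, lb + db)
    b = exp(lb)
    return exp(lm) * b, b, lb >= lb_max - 1e-6

# Hypothetical data: (events, exposure hours) per plant.
varied = [(0, 1000.0), (12, 1000.0), (1, 1000.0), (30, 1000.0), (3, 1000.0)]
uniform = [(2, 1000.0)] * 5
a_v, b_v, degenerate_v = fit_gamma_poisson(varied)
a_u, b_u, degenerate_u = fit_gamma_poisson(uniform)
```

With the widely varying plants the search ends at a finite b well below the total exposure time. With identical plants the likelihood keeps increasing as b grows, so the search ends at the cap and the fit is flagged as degenerate.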

Suppose now that finite values of a and b have been estimated, giving an estimated overall distribution of l. We now naturally inquire about the value of l_i at the ith plant. The model says that l_i was randomly selected from the gamma(a, b) distribution, and the x_i events in time t_i were generated from a Poisson(l_i t_i) distribution. Recall that the true values of a and b are unknown and have been replaced by their maximum likelihood estimates. The simplest approach is to act as if the estimates were the true values. For the ith data source, let the prior distribution of l_i be gamma(a, b), and find the posterior distribution of l_i by updating the prior with the data; the posterior is gamma(a + x_i, b + t_i). This is the usual Bayes method for estimating l_i, except that the prior parameters are estimated from all the data. This leads to the name “empirical Bayes estimation.”
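The simple plug-in update for a rate is the standard conjugate gamma-Poisson calculation, sketched below. It assumes a is the gamma shape parameter and b is the rate parameter in hours. The function name and numerical values are hypothetical, and the sketch does not include the Kass-Steffey adjustment.

```python
def posterior_gamma(a, b, x_i, t_i):
    """Update a gamma(shape a, rate b) prior with x_i events in time t_i.

    Returns the posterior parameters and the posterior mean.
    """
    a_post = a + x_i
    b_post = b + t_i
    return a_post, b_post, a_post / b_post

# Hypothetical plant: prior gamma(2, 1000 hours), 3 events in 500 hours.
a_post, b_post, post_mean = posterior_gamma(2.0, 1000.0, 3, 500.0)
```

Here the posterior is gamma(5, 1500 hours). Its mean, 5/1500 events per hour, lies between the prior mean 2/1000 and the plant's observed rate 3/500.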

More sophisticated methods account for the variability in the estimators a and b. In particular, Kass and Steffey (1989) give a simple first-order correction, which RADS applies to the gamma-Poisson model. This approach leaves each plant-specific mean unchanged but lengthens the interval somewhat.

RADS gives the overall distribution, gamma(a, b), with its mean and a 90% interval corresponding to the 5th and 95th percentiles. RADS also gives the plant-specific means and 90% intervals. The plant-specific distributions are approximated by gamma distributions, but after the Kass-Steffey adjustment the plant-specific distribution parameters do not correspond to simple updating of a single prior distribution with the plant-specific data.