Confidence Statements for Efficiency Estimates from Stochastic Frontier Models

This paper is an empirical study of the uncertainty associated with technical efficiency estimates from stochastic frontier models. We show how to construct confidence intervals for estimates of technical efficiency levels under different sets of assumptions ranging from the very strong to the relatively weak. We demonstrate empirically how the degree of uncertainty associated with these estimates relates to the strength of the assumptions made and to various features of the data.


Introduction
This paper is a comprehensive study of methods of inference associated with technical efficiency estimates in the stochastic frontier model. We seek to characterize the nature and the empirical magnitude of the uncertainty associated with the usual estimates of efficiency levels.
From our perspective, deterministic approaches (e.g., data envelopment analysis (DEA)) produce efficiency measures, while statistical approaches (stochastic frontier models) produce efficiency estimates. The relative strengths and weaknesses of these approaches have been vigorously debated, and will continue to be. However, the strongest argument in favor of a statistical approach has always been that it provides a straightforward basis for inference, not just for point estimates. Thus, for example, one can construct standard errors and confidence intervals for estimates of technical efficiency. A statistical approach recognizes that uncertainty exists and is capable of quantifying it. In our view, uncertainty also exists within the deterministic approach, but methods of characterizing and quantifying it are still not well developed. Consistency of the DEA estimates has been established by Banker (1993) and by Tsybakov (1992, 1995). Korostelev, Simar, and Tsybakov also establish the rate of convergence of the estimates, and Banker (1995) considers certain types of hypothesis tests. These results are important but they do not lead to confidence intervals. Confidence intervals can be constructed by bootstrapping the DEA estimates; for example, Simar and Wilson (1995) give some theoretical results and an empirical example. However, in our view bootstrapping procedures are an imperfect substitute for an adequately developed distributional theory.
Ironically, the ability to conduct inference on efficiency estimates in stochastic frontier models has previously been noted approvingly, but has never been systematically exploited in an empirical setting. This paper seeks to fill this void and, in doing so, advances our understanding of the various sources of uncertainty inherent in econometric models for efficiency estimation.
Of course, the strength of the econometric approach comes at a cost: Strong and often arbitrary distributional assumptions are necessary to extract technical efficiency estimates and ultimately to construct confidence intervals. Therefore, a major aim of this paper will be to show how to perform inference on efficiency estimates under different sets of assumptions that range from the very strong to the relatively weak, and to see how the degree of uncertainty associated with these estimates relates to the strength of the assumptions made. Some of the methods we discuss require panel data. Most make specific distributional assumptions for statistical noise and technical inefficiency. However, we also make use of the methodology of multiple comparisons with the best (MCB), developed by Edwards and Hsu (1983) and applied to stochastic frontiers by Horrace and Schmidt (1994), which uses panel data to construct confidence intervals without the need for strong distributional assumptions.
In this paper, technical efficiency estimates and their confidence intervals are generated for three different panel data sets with different dimensional characteristics, using several formulations of the stochastic frontier model. We analyze these panel data as complete data sets and also in some cases broken down into their component cross sections to construct confidence intervals for technical efficiency estimates using different interval construction techniques. The results highlight the relevant strengths and weaknesses of the various techniques and data configurations, and also identify a few modeling assumptions that may be problematic. The paper addresses practical aspects of interval construction that may present problems for the data analyst.
The plan of the paper is as follows. Section 2 briefly reviews the stochastic frontier model as it relates to this paper. Section 3 reviews three interval construction techniques: the Jondrow et al. (JLMS) (1982) method, the Battese-Coelli (BC) (1988) method, and the MCB method. Section 4 is an empirical analysis of three panel data sets for which we construct confidence intervals for technical efficiency estimates. Section 5 summarizes and concludes.

Stochastic Frontier Models
Stochastic frontier models were originally due to Aigner, Lovell, and Schmidt (1977) and Meeusen and van den Broeck (1977). These models were based on cross-sectional data and strong distributional assumptions. Similar models have also been developed for panel data. Pitt and Lee (1981) and Schmidt and Sickles (1984) were the first to exploit the advantages of panel data over cross-sectional data. Since this is not intended to be a comprehensive survey, the reader is referred to Cornwell and Schmidt (1995), Greene (1995), Lovell (1993), Lovell and Schmidt (1988), and Schmidt (1985) for further details. In this paper we make use of several formulations of the stochastic frontier model, which are given below.
The basic model that we will consider is as follows:

y_it = α + x_it'β + v_it − u_i,  u_i ≥ 0.  (1)

Here i indexes firms (or other productive units) and t indexes time periods. Typically y_it is the logarithm of output and x_it is a vector of inputs or functions of inputs. v_it is statistical noise and u_i represents technical inefficiency, assumed to be time invariant. More specifically, if y_it is the logarithm of output, technical efficiency of the ith firm is TE_i = exp(−u_i) and technical inefficiency is 1 − TE_i. We will refer to the composite error as ε_it = v_it − u_i. We will always assume the following:

(A.1) the v_it are iid N(0, σ_v²);
(A.2) the v_it are independent of the u_i and of the x_it.

We will sometimes but not always make the additional assumptions:

(A.3) the u_i are iid;
(A.4) u_i is distributed as the absolute value of a N(0, σ_u²) variable.

Assumption (A.4) implies that the u_i are half-normal, but this assumption could be replaced by other specific distributional assumptions, as in Stevenson (1980) or Greene (1990). Now define α_i = α − u_i. Then we can rewrite (1) as the usual panel data model:

y_it = α_i + x_it'β + v_it.  (2)
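Under assumptions (A.1)-(A.4) the model is straightforward to simulate. The sketch below (a single made-up regressor and arbitrary parameter values; `simulate_panel` is our illustrative name, not anything from the paper) generates a balanced panel satisfying equation (1):

```python
import math
import random

random.seed(42)

def simulate_panel(N, T, alpha=1.0, beta=0.5, sigma_v=0.2, sigma_u=0.3):
    """Simulate y_it = alpha + beta*x_it + v_it - u_i with v_it ~ N(0, sigma_v^2)
    (assumption (A.1)) and u_i half-normal, i.e. |N(0, sigma_u^2)| (assumption (A.4)).
    u_i is time invariant. Returns a list of (i, t, x, y, u_i) tuples."""
    data = []
    for i in range(N):
        u_i = abs(random.gauss(0.0, sigma_u))   # time-invariant inefficiency, u_i >= 0
        for t in range(T):
            x = random.uniform(0.0, 2.0)        # single illustrative regressor
            v = random.gauss(0.0, sigma_v)      # statistical noise
            y = alpha + beta * x + v - u_i
            data.append((i, t, x, y, u_i))
    return data

data = simulate_panel(N=100, T=6)
# Technical efficiency of firm i is TE_i = exp(-u_i), which lies in (0, 1].
te = {i: math.exp(-u) for (i, t, x, y, u) in data}
```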
We regard zero as the absolute minimal value of u_i, and hence α as the absolute maximal value of α_i, over any possible sample (essentially, as N → ∞). This can be distinguished from the minimal value of u_i and the maximal value of α_i in a given sample of size N, and this distinction is relevant when N is small and the u_i (hence α_i) are treated as fixed. Let α_(1) ≤ α_(2) ≤ ... ≤ α_(N) be the population rankings of the α_i, so α_(N) = max_j α_j. Similarly, let u_(1) ≤ u_(2) ≤ ... ≤ u_(N) be the population rankings of the u_i, so that u_(1) = min_j u_j. Then α_(N) = α − u_(1). In this case the technical efficiency measures u_i are defined by comparing α_i to the absolute standard α. We can consider the alternative of comparing α_i to the within-sample standard α_(N), so that u_i* = α_(N) − α_i. Then equation (2) can be rewritten as:

y_it = α_(N) + x_it'β + v_it − u_i*,  u_i* ≥ 0.  (3)

The difference between the two definitions of u is substantive and will be considered further in the sequel. Each formulation lends itself to particular estimation techniques that will be exploited in this paper. We now examine several estimation techniques for these models in both the cross-sectional and panel data cases.

Cross-Sectional Data
In the case of a single cross section, T = 1 and the subscript t is irrelevant and can be suppressed. Under assumptions (A.1)-(A.4), the model as given in equation (1) can be estimated by maximum likelihood (MLE). Details of this estimation, including the likelihood function, can be found in Aigner et al. (1977) and will not be addressed here. MLE of equation (1) yields estimates of α, β, σ_v², and σ_u², which are consistent as N → ∞.

Define μ = E(u_i). Under assumption (A.4), μ = σ_u √(2/π). Ordinary least squares (OLS) applied to equation (1) yields consistent estimates of β and of the intercept (α − μ). The corrected ordinary least squares (COLS) method constructs a consistent estimate of α by adding a consistent estimate of μ to the OLS intercept. This requires a consistent estimate of σ_u, which can be derived from the third moment of the OLS residuals. Also a consistent estimate of σ_v² can be derived from the second moment of the OLS residuals. See Olson et al. (1980) for details.
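The moment calculations behind the COLS correction can be sketched as follows. Under the half-normal assumption the third central moment of the composite error is −σ_u³ √(2/π) (4/π − 1), which is negative; a positive sample third moment is the failure case discussed in Section 4. The function name `cols_from_residuals` and the simulation parameters are ours, not the paper's:

```python
import math
import random

def cols_from_residuals(ols_intercept, residuals):
    """COLS step: recover sigma_u, sigma_v^2, and the frontier intercept from the
    second and third central moments of the OLS residuals, under half-normality
    (A.4). Moment formulas as in Olson et al. (1980)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    if m3 >= 0:
        # "Wrong skew": the implied sigma_u is not real (Waldman 1982).
        raise ValueError("positive third moment: COLS estimate of sigma_u fails")
    sigma_u = (-m3 / (math.sqrt(2 / math.pi) * (4 / math.pi - 1))) ** (1 / 3)
    # var(u) = (1 - 2/pi) sigma_u^2 for a half-normal, so:
    sigma_v2 = m2 - (1 - 2 / math.pi) * sigma_u ** 2
    mu = sigma_u * math.sqrt(2 / math.pi)   # E(u_i); OLS intercept estimates alpha - mu
    alpha = ols_intercept + mu              # corrected frontier intercept
    return alpha, sigma_u, sigma_v2

# Illustration on simulated residuals with known sigma_u = 0.4, sigma_v = 0.2.
random.seed(1)
eps = [random.gauss(0.0, 0.2) - abs(random.gauss(0.0, 0.4)) for _ in range(100000)]
alpha_hat, sigma_u_hat, sigma_v2_hat = cols_from_residuals(0.0, eps)
```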
So, in summary, both COLS and MLE yield consistent estimates of α, β, σ_v², and σ_u², but COLS is less efficient than MLE. In either case, point estimates for u_i and TE_i can be obtained, as described in Section 3.1.

Panel Data
We now turn to the case of panel data with T > 1. Under assumptions (A.1)-(A.4), equation (1) can be estimated by MLE. See Pitt and Lee (1981) for the likelihood function and other details. MLE yields estimates of the same parameters as in the cross-section case: α, β, σ_v², and σ_u². These estimates are consistent as N → ∞; therefore MLE is appropriate when N is large. Large T is not a substitute for large N.
Equation (1) can also be estimated by generalized least squares (GLS). This requires assumptions (A.1)-(A.4), except that it does not rely on specific distributional assumptions (normality of v, or half-normality of u). The standard panel data GLS procedure yields estimates of β, of the intercept (α − μ), and of σ_v² and var(u_i) that are consistent as N → ∞. Care must be taken to distinguish σ_u² from var(u_i), because the usual GLS procedure uses var(u_i). Under the half-normal distributional assumption, var(u_i) = (1 − 2/π) σ_u², so that the estimate of var(u_i) is easily converted to an estimate of σ_u². This is required to estimate μ and to convert the intercept, exactly as in the discussion of COLS above. We will refer to GLS with this intercept correction as the CGLS method. Point estimates for u_i and TE_i can be obtained, as described in Section 3.2.
Equation (2) is useful primarily as a basis for estimation under weaker assumptions that treat the α_i as fixed. A fixed-effects treatment may be useful because it relies only on assumptions (A.1) and (A.2), not (A.3) and (A.4), and because it is applicable when N is small and T is large (as well as when N is large). Suppose we estimate (2) by the usual fixed-effects estimation involving the within transformation (or, equivalently, dummy variables for firms), yielding estimates α̂_i of the α_i and β̂ of β. Then α̂ = max_j α̂_j estimates the within-sample standard α_(N), and û_i = α̂ − α̂_i measures inefficiency relative to the standard of the best firm in the sample. Now consider what happens as N → ∞. Under the assumption (A.4) of half-normality, or in fact under any mechanism for the generation of u_i that allows u arbitrarily close to zero with positive probability (density), u_(1) → 0 as N → ∞. Thus α_(N) → α, so that inefficiency is measured relative to its absolute (not just within-sample) standard. This distinction becomes important when we examine different confidence interval construction techniques in the following section.
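The fixed-effects recovery of the α_i and the within-sample inefficiencies can be sketched as follows, for the one-regressor case and with the within slope estimate β̂ taken as given. The helper name `within_efficiencies` and the data layout (a dict mapping firm id to its (x, y) observations) are our illustrative choices:

```python
import math

def within_efficiencies(panel, beta_hat):
    """Recover alpha_i = ybar_i - xbar_i * beta_hat for each firm, then form
    u_i = max_j alpha_j - alpha_i (inefficiency relative to the best firm in
    the sample) and TE_i = exp(-u_i).
    `panel` maps firm id -> list of (x, y) pairs; `beta_hat` is the within slope."""
    alpha = {}
    for i, obs in panel.items():
        ybar = sum(y for x, y in obs) / len(obs)
        xbar = sum(x for x, y in obs) / len(obs)
        alpha[i] = ybar - xbar * beta_hat
    a_max = max(alpha.values())
    u = {i: a_max - a for i, a in alpha.items()}    # within-sample standard
    te = {i: math.exp(-ui) for i, ui in u.items()}  # TE relative to best firm
    return alpha, u, te
```

By construction the best firm in the sample gets û_i = 0 (TE of 1), which is exactly the within-sample, not absolute, standard discussed above.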
The statistical properties of the estimated u_i are complicated because of the max operation involved in the definition of α̂. Consistency as both N and T → ∞ was argued heuristically (as above) by Schmidt and Sickles. Park and Simar (1994) and Kneip and Simar (1995) established the rate of convergence of the estimates. However, the asymptotic distributions of the estimates of α and the u_i are unknown, so that standard methods of construction of asymptotically valid confidence intervals based on these asymptotic distributions are currently not possible. Feasible methods of construction of confidence intervals will be discussed in the next section.

Techniques for Construction of Confidence Intervals
We use two different techniques to construct confidence intervals for technical efficiency estimates in stochastic frontier models. The first technique is based on the (conditional) distribution of technical inefficiency given the composite error. It was developed for the cross-sectional case by Jondrow et al. (JLMS) (1982) and later generalized to the panel data case by Battese and Coelli (BC) (1988). The second technique is based on the MCB procedures developed by Edwards and Hsu (1983) and first applied to stochastic frontiers by Horrace and Schmidt (1994). The MCB method will be based on fixed-effects estimates, while the JLMS and BC methods will be applied to the results of the other estimation techniques; this choice is primarily driven by the difference in distributional assumptions of the models.

Cross-Sectional Data: JLMS Method
For either cross-sectional estimation method, MLE or COLS, we use the JLMS method for interval construction. The JLMS technique follows from the distribution of u_i conditional on ε_i (which is a scalar, since T = 1 for a cross section). JLMS show that, given distributional assumptions (A.1) and (A.4), the distribution of u_i conditional on ε_i is that of a N(μ*_i, σ*²) random variable truncated (from the left) at zero, where

μ*_i = −ε_i σ_u² / (σ_u² + σ_v²),  σ*² = σ_u² σ_v² / (σ_u² + σ_v²).

They evaluate E(u_i | ε_i), which is regarded as a point estimate for u_i. A point estimate for TE_i, due to Battese and Coelli (1988), is given by:

TÊ_i = E[exp(−u_i) | ε_i] = [Φ(μ*_i/σ* − σ*) / Φ(μ*_i/σ*)] exp(−μ*_i + σ*²/2),  (4)

where Φ is the standard normal cdf. Implementing this procedure requires estimates of μ*_i and σ*²; this in turn requires estimates of α, β, σ_u², and σ_v², and the use of the residual ε̂_i. Empirical implementations of the JLMS technique have focused on the point estimate E(u_i | ε_i). However, confidence intervals for u_i are easily constructed from the density of u_i conditional on ε_i. Critical values can be extracted from a standard normal density to place lower and upper bounds L_i and U_i on u_i. Because TE_i is a monotonic transformation of u_i, the lower and upper bounds for u_i translate directly into upper and lower bounds on TE_i:

exp(−U_i) ≤ TE_i ≤ exp(−L_i),  (5)

where, for a 100(1 − γ)% confidence level,

L_i = μ*_i + σ* Φ⁻¹{1 − (1 − γ/2) [1 − Φ(−μ*_i/σ*)]},
U_i = μ*_i + σ* Φ⁻¹{1 − (γ/2) [1 − Φ(−μ*_i/σ*)]}.

As a semantic point, we will refer to the implementation of equation (5) in the cross-sectional context as the JLMS method, since it relies on the JLMS result for the distribution of u_i conditional on ε_i, even though equation (4) is due to BC. The BC method will refer to the corresponding calculations in the panel data case, described in the next section. It should be noted that both the JLMS and the BC methods treat α, β, σ_u², and σ_v² as known, so that the confidence intervals do not reflect uncertainty about these parameters. For large N, this is probably unimportant, since the variability in the parameter estimates is small compared to the variability intrinsic to the distribution of u_i conditional on ε_i (due to the presence of the statistical noise v_i).
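The truncated-normal quantile calculation behind equation (5) can be sketched as follows; `jlms_interval` is a hypothetical helper name, and the parameters σ_u, σ_v are treated as known, as in the text:

```python
import math
from statistics import NormalDist

_phi = NormalDist()  # standard normal

def jlms_interval(eps, sigma_u, sigma_v, level=0.95):
    """Interval for u_i given a cross-sectional residual eps = v_i - u_i.
    u_i | eps is N(mu_star, sigma_star^2) truncated (from the left) at zero;
    the bounds are the gamma/2 and 1 - gamma/2 quantiles of that truncated
    normal. Returns ((L_i, U_i), (TE lower, TE upper))."""
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / s2
    sigma_star = math.sqrt(sigma_u ** 2 * sigma_v ** 2 / s2)
    gamma = 1.0 - level
    p0 = _phi.cdf(-mu_star / sigma_star)           # mass truncated away below zero

    def trunc_quantile(p):                         # quantile of the truncated normal
        return mu_star + sigma_star * _phi.inv_cdf(p0 + p * (1.0 - p0))

    lower_u, upper_u = trunc_quantile(gamma / 2), trunc_quantile(1.0 - gamma / 2)
    # Monotonicity: bounds on u_i map to reversed bounds on TE_i = exp(-u_i).
    return (lower_u, upper_u), (math.exp(-upper_u), math.exp(-lower_u))
```

Because of the truncation at zero, the lower bound on u_i is always nonnegative, so the upper bound on TE_i never exceeds one.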

Panel Data: Battese-Coelli Method
The BC method for construction of confidence intervals is a generalization of the JLMS method and also follows from the distribution of u_i conditional on the composite errors ε_i1, ..., ε_iT. The BC technique can be based on the MLE or CGLS estimates. It extends the JLMS method to accommodate the case of panel data (T > 1):

μ*_i = −T ε̄_i σ_u² / (σ_v² + T σ_u²),  σ*² = σ_u² σ_v² / (σ_v² + T σ_u²),

where ε̄_i is the average over t of the ε_it. The latter expressions are essentially the same as in JLMS, with ε̄_i playing the role of ε_i and σ_v²/T playing the role of σ_v². Then the distribution of u_i conditional on ε_i1, ..., ε_iT is that of a N(μ*_i, σ*²) random variable truncated at zero, a point estimate for TE_i is given by equation (4) above, and confidence intervals are constructed as in equation (5) above.
The Battese-Coelli method can also accommodate the case of an unbalanced panel, in which there are different numbers of time-series observations per firm. Suppose that for firm i there are T_i observations, where the notation reflects the fact that T_i varies over i. We simply have to replace T by T_i in the definitions of μ*_i and σ*² above, so that σ*_i² = σ_u² σ_v² / (σ_v² + T_i σ_u²); note that σ*_i² now varies over i. Then equations (4) and (5) hold exactly as before, except that σ* is replaced by σ*_i. Thus an unbalanced panel causes no real problems for the BC method.
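The BC point estimate of equation (4), with T replaced by T_i for the unbalanced case, can be sketched as follows (the helper name `bc_te_estimate` is ours; parameters are treated as known):

```python
import math
from statistics import NormalDist

_phi = NormalDist()  # standard normal

def bc_te_estimate(eps_bar, T_i, sigma_u, sigma_v):
    """Battese-Coelli point estimate TE_i = E[exp(-u_i) | residuals] for a firm
    with T_i observations and average residual eps_bar, i.e. equation (4) with
    mu_star_i and sigma_star_i computed from T_i."""
    denom = sigma_v ** 2 + T_i * sigma_u ** 2
    mu_star = -T_i * eps_bar * sigma_u ** 2 / denom
    sigma_star = math.sqrt(sigma_u ** 2 * sigma_v ** 2 / denom)  # varies over i via T_i
    # E[exp(-u)] for a N(mu, sigma^2) variable truncated at zero:
    ratio = _phi.cdf(mu_star / sigma_star - sigma_star) / _phi.cdf(mu_star / sigma_star)
    return ratio * math.exp(-mu_star + sigma_star ** 2 / 2)
```

Since u_i is positive with probability one under the truncation, the estimate always lies strictly between 0 and 1, and a more negative average residual (suggesting more inefficiency) lowers it.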

Panel Data: Multiple Comparisons with the Best
We now consider an adaptive multiple comparisons with the best (MCB) technique. Simultaneous confidence intervals are constructed for the u_i* = α_(N) − α_i. These can be monotonically transformed to confidence intervals for TE_i = exp(−u_i*). A concise summary of the application of MCB techniques to stochastic frontiers is presented by Horrace and Schmidt (1994), so the procedure will not be fully detailed here. These intervals are unique in three respects. First, they do not presume that we know which firm in the sample is the most efficient firm, as is implicitly the case for the usual estimates based on within estimation of the stochastic frontier model. Second, they are simultaneous and, as such, provide joint statements about which firms in the sample might be most efficient and which firms can be eliminated from contention for most efficient at a prespecified confidence level. Third, MCB intervals are naturally based on the within estimates and use only assumptions (A.1) and (A.2) above; they do not require a distributional assumption for the u_i.
As above, let α_(1) ≤ ... ≤ α_(N) be the order statistics for the α_i. Define u_i* = α_(N) − α_i; these are measures of inefficiency relative to the most efficient firm in the sample. The point of MCB is to construct a set of simultaneous confidence intervals for u_1*, ..., u_N*, based on estimates α̂_i. The estimates come from the within regression, either as coefficients of dummy variables for firms, or (equivalently) as α̂_i = ȳ_i − x̄_i'β̂, where β̂ is the within estimate of β. Then a set of simultaneous confidence intervals for the u_i* is given by:

u_i* ∈ [ max(0, max_{j≠i} α̂_j − α̂_i − d), max(0, max_{j≠i} α̂_j − α̂_i + d) ],  i = 1, ..., N.

Here d is the allowance: the critical value described below times the estimated standard error of a difference α̂_j − α̂_i (common across pairs under the equicorrelated structure), where s² is the usual pooled variance estimator entering that standard error. Edwards and Hsu (1983) refer to these as adaptive intervals. For small values of N, tables for the critical values can be found in Hochberg and Tamhane (1987), Dunnett (1964), Hahn and Hendrickson (1971), and Dunn and Massey (1965). If appropriate critical values are not contained in the above tabulations (e.g., if N is very large), they are easily simulated. Notice that, as presented, the intervals are for a balanced design, where T_i = T for all i. Application to the unbalanced case is discussed in Horrace and Schmidt (1994). The critical value is the two-sided upper equicoordinate point of the (N − 1)-variate equicorrelated t-distribution with common correlation ρ and degrees of freedom ν. The equicorrelated structure emerges when the α̂_i are independent or correlated with a special covariance structure in which the relevant covariance terms are equal or nearly equal for all i and j. In general the α̂_i are asymptotically independent as N or T gets large. In this study, when we cannot appeal to asymptotics, the condition for the special covariance structure is met or nearly met, so MCB is at least approximately applicable.
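The constrained interval just described can be sketched as follows. The helper name `mcb_intervals` is ours, and the allowance d (critical value times the common standard error of a difference) is taken as an input rather than computed, since the critical value comes from tables or simulation:

```python
import math

def mcb_intervals(alpha_hat, allowance):
    """Sketch of constrained MCB intervals for u_i* = max_j alpha_j - alpha_i,
    in the spirit of Edwards and Hsu (1983). `alpha_hat` maps firm id -> within
    estimate; `allowance` plays the role of d. Negative bounds are truncated at
    zero. Returns, per firm, bounds on u_i* and the implied bounds on TE_i."""
    out = {}
    for i, ai in alpha_hat.items():
        best_other = max(a for j, a in alpha_hat.items() if j != i)
        diff = best_other - ai
        lo_u = max(0.0, diff - allowance)
        up_u = max(0.0, diff + allowance)
        # Monotone transformation to bounds on TE_i = exp(-u_i*):
        out[i] = {"u": (lo_u, up_u), "te": (math.exp(-up_u), math.exp(-lo_u))}
    return out
```

A firm whose interval for u_i* has lower bound zero remains in contention for most efficient (its TE upper bound is one); a firm with a strictly positive lower bound is revealed not best.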
As previously stated, the bounds of the MCB intervals can reveal information about the population ranking of the production units. If, for a single firm, the upper and lower bounds on u_i* are 0 (or equivalently the lower and upper bounds on TE_i equal 1), then that firm is most efficient (best) at the prespecified confidence level. However, it is possible that several or many firms have intervals with lower bound equal to zero, so that a single firm is not identified as best. We may also encounter firms for which the lower bound on u_i* is strictly positive; these firms are revealed not best, and bounds are given for their levels of (in)efficiency.
The width of these intervals hinges on three sources: estimation error, uncertainty over which firm is most efficient, and the multiplicity of the probability statement. We will try to disentangle these three sources in the empirical analyses that follow. To this end we introduce another interval construction technique, multiple comparisons with a control (MCC), due to Dunnett (1955). MCC creates simultaneous confidence intervals for the quantities α_c − α_i, where the control α_c can be any one of the population intercepts, chosen as the standard of comparison, and is not necessarily the largest intercept. (That is, α_c can be any of the population intercepts, but it must be one of them; it cannot be an arbitrarily chosen number in the context of what follows.) However, if firm N is asserted to be most efficient, so that α_N = max_j α_j, the MCC intervals can be thought of as MCB intervals where the most efficient firm is known a priori. In fact, if the MCB intervals reveal a single most efficient firm, then they reduce to these MCC intervals. So the difference in width of the MCC and the MCB intervals is the effect of uncertainty about which firm is most efficient.
If a firm N is known, a priori, to be most efficient, then a set of 100(1 − γ)% simultaneous confidence intervals for α_N − α_i, i = 1, ..., N − 1, is given by:

α_N − α_i ∈ [ max(0, α̂_N − α̂_i − d'), max(0, α̂_N − α̂_i + d') ],

where d' is the appropriate MCC critical value times the estimated standard error of the difference α̂_N − α̂_i. The MCC intervals are constrained nonnegative to account for the Nth firm being most efficient and hence having the largest α_i. As with the MCB intervals, the required equicorrelated structure emerges when the α̂_i are uncorrelated or possess the special covariance structure. Notice that when one selects the firm with the largest α̂_i as the MCC control, the upper bound of the MCC intervals is exactly the same as that for the MCB intervals. The primary difference between MCC and MCB is in the lower bound, which does not depend on max_{j≠i} α̂_j for MCC, while it does for MCB. For further details on this point see Horrace and Schmidt (1994).
We note in passing that several recent models in the frontiers literature have featured time-varying technical inefficiency. For example, see Cornwell, Schmidt, and Sickles (1990), Kumbhakar (1990), Battese and Coelli (1992), and Lee and Schmidt (1993). These models imply intercepts α_it that vary over i and t. For a given value of t, it is natural to proceed as before to consider comparisons relative to the maximum (over i) of these intercepts, so that we essentially have a separate MCB problem for each t. However, there is no apparent reason to expect the equicorrelatedness condition to hold for the estimated intercepts from any of these models, and if it does not hold the methods surveyed above would not apply. There is a limited literature on MCB procedures without the equicorrelatedness condition; some references are given in Section 4.2 below.

Comparison of Different Techniques
A discussion of the differences between the interval construction techniques is in order. First, it should be noted that MCB provides joint confidence intervals for the u_i* of equation (3), whereas JLMS and BC provide marginal intervals for the u_i of equation (2). The difference between u_i and u_i* may be nontrivial when N is small. Conversely, the difference between joint and marginal intervals may be substantial when N is large. For example, one of our data sets has N = 171. Although independence would be a poor assumption, it is instructive to note that a set of 171 independent intervals, each holding with a marginal probability of 0.95, would hold jointly with a probability of only (0.95)^171 ≈ 0.00016. Conversely, joint confidence intervals that hold with a probability of 0.95 would correspond to marginal intervals with a confidence level far in excess of 0.95. Other things equal, we would certainly expect joint confidence intervals to be wider than corresponding marginal intervals, for a given level of confidence like 0.95.
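The joint-versus-marginal arithmetic in the preceding paragraph can be checked directly (the independence assumption is, as noted, only illustrative):

```python
# Joint coverage of 171 independent intervals, each with marginal level 0.95:
joint = 0.95 ** 171

# Conversely, the marginal level each of 171 independent intervals would need
# so that they hold jointly with probability 0.95:
marginal_needed = 0.95 ** (1 / 171)
```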
The MCB and JLMS/BC methods also differ substantially in the way they handle estimation error. One sense in which this is true is that, assuming that the equicorrelated structure emerges for the α̂_i, the MCB intervals reflect the variability of β̂, which the JLMS and the BC intervals ignore. This is probably not an important difference, since uncertainty about β is not the only source, or in most cases the major source, of uncertainty about α_i. To be more specific, consider the following expression for the within estimate of α_i:

α̂_i = ȳ_i − x̄_i'β̂ = α_i + v̄_i + x̄_i'(β − β̂).
The term x̄_i'(β − β̂) reflects estimation error in β̂. As noted above, BC ignores this source of uncertainty while MCB does not. This term disappears as either N → ∞ or T → ∞, and is probably not important empirically for most data sets. More fundamentally, α̂_i contains the error v̄_i; the within procedure separates α_i from the noise by averaging away the v_it. The significance of v̄_i depends on T and on the relative sizes of σ_v² and σ_u²; it is most troublesome when T is small and/or σ_v² is large relative to σ_u². It is important to realize that the within estimate of u_i*, namely û_i* = max_j α̂_j − α̂_i, is generally biased upward (inefficiency is overstated), because the larger α̂_j will on average contain positive estimation error v̄_j, while the smaller α̂_j will on average contain negative estimation error. (That is, the α̂_j will obviously be more variable than the α_j.) MCB recognizes this variability by including the sample equivalent of the variability of v̄_i in the formula for the allowance, d, above. Also, the MCB intervals can be thought of as removing the bias just described; they are not centered on the value û_i*. The BC method uses distributional assumptions to remove estimation error more effectively. The first step in the BC procedure is to calculate the average residual ε̄_i for each firm (ignoring estimation error in α̂ and β̂), so that the v_it are averaged away, as in the within procedure. The second step is to construct μ*_i, which equals −ε̄_i times the shrinkage factor T σ_u² / (σ_v² + T σ_u²). This corresponds to the best linear predictor in the random-effects panel data literature; see Schmidt and Sickles (1984). It reflects the relative variability of v̄_i and u_i. Finally, the distributional assumptions are used to imply the further shrinkage induced by the truncation of the distribution of u_i at zero.

Empirical Analyses

Indonesian Rice Farms - Erwidodo (1990)
We analyze data previously analyzed by Erwidodo (1990), Lee (1991), and Lee and Schmidt (1993). For a complete discussion of the data see Erwidodo (1990). One hundred seventy-one rice farms in Indonesia were observed for six growing seasons. The data were collected by the Agro Economic Survey, as part of the Rural Dynamic Study in the rice production area of the Cimanuk River Basin, West Java, and obtained from the Center for Agro Economic Research, Ministry of Agriculture, Indonesia. The 171 farms were located in six different villages and the six growing seasons consisted of three wet and three dry seasons. Thus the data configuration features large N and small T.
Inputs to the production of rice included in the data set are seed (kg), urea (kg), trisodium phosphate (TSP) (kg), labor (labor-hours), and land (hectares). Output is measured in kilograms of rice. The data also include dummy variables. DP equals 1 if pesticides were used and 0 otherwise. DV1 equals 1 if high-yield varieties of rice were planted and DV2 equals 1 if mixed varieties were planted; the omitted category indicates that traditional varieties were planted. DSS equals 1 if it was a wet season. There are also five region dummy variables, DR1, DR2, DR3, DR4, and DR5, for the six different villages in the survey.
COLS and MLE were performed on each of the six different periods (cross sections) in the panel. DSS, the dummy for wet season, had to be excluded from the cross-section models because it was constant across farms for a single period. Results are in Table 1. Unfortunately, periods 2, 3, 4, and 5 produced a positive third-order moment of the residuals, causing the MLE estimate to coincide with the OLS estimate, as discussed in Waldman (1982). Additionally, this problem precludes COLS estimation, since the implied estimate of σ_u² is negative. Therefore only periods 1 and 6 are analyzed as cross sections for this data set. Since the results for the two periods were similar, only the period 1 results are reported in what follows.
Technical efficiencies and confidence intervals were produced using the JLMS technique, i.e., equations (4) and (5) above. Confidence levels are 95%, 90%, and 75%. These results are contained in Tables 2a and 2b. Due to the large number of firms in the sample (171), only nine firms are reported here and in the sequel: the three firms with the highest estimated efficiencies, the three firms with the lowest, and the three firms with the median. The choice of the estimation procedure (COLS versus MLE) made very little difference, so we will discuss only the MLE results in Table 2b. Efficiency levels are not estimated as precisely as one might hope. The firm with the highest estimated efficiency level had estimated efficiency of 0.9452, but a 95% confidence interval ranged from 0.8322 to 0.9982. The median firm had estimated efficiency of 0.9053, with a 95% confidence interval of (0.7576, 0.9957); and the worst firm in the sample had estimated efficiency of 0.8040, with a 95% confidence interval of (0.6415, 0.9694). These are fairly wide confidence intervals. In fact the uncertainty about the inefficiency level of a given firm is definitely not small relative to the within-sample variability of the efficiency measures, and we would have little reason to have much faith in our efficiency rankings. The reason for this lack of precision is straightforward: most of the variation in the composite error reflects statistical noise rather than inefficiency. For the MLE estimates with t = 1, the variance of v is over nine times as large as the variance of u. This makes it very difficult to estimate u_i precisely.
Next, CGLS and MLE were performed on the entire panel. The variable DSS could now be included. Results are in Table 3. Technical efficiencies and confidence intervals were produced using the BC technique. These results are contained in Tables 4a and 4b. Efficiency levels based on the CGLS and MLE estimates are again similar. Not surprisingly, the panel data confidence intervals are tighter than their cross-sectional counterparts, because σ*² is smaller with six observations than with one. Nevertheless, the confidence intervals do not shrink as much as one might hope: compare a 95% confidence interval for the median firm of (0.7638, 0.9945) in Table 4b to (0.7576, 0.9957) in Table 2b. This is partly due to having only six observations per firm, and partly to obtaining a larger estimate of the noise variance for the panel than for the t = 1 cross section, which diminishes the value of the panel. The within estimates were calculated for the panel, with time-invariant regressors excluded to preclude multicollinearity. These results are also in Table 3. The covariance matrix for the α̂_i very nearly exhibited the equicorrelated structure necessary to justify the MCB procedure. MCB intervals of 95%, 90%, and 75% were constructed for technical inefficiencies using critical values of 3.42, 3.18, and 2.71, respectively, and are given in Table 5a. The intervals are too wide to be of much use. For example, the firm with the highest α̂_i (and hence with estimated efficiency of 100% by the usual calculation) has a confidence interval ranging from 0.5613 to 1. Every firm in the sample has a confidence interval with upper limit equal to one; that is, at the 95% confidence level, no firm is revealed to be inefficient. In fact, this is still true at the 75% confidence level. The MCB intervals are much wider than their BC counterparts based on CGLS and MLE.
We next attempt to determine the relative importance of three sources of width: estimation error, uncertainty about the identity of the most efficient firm, and the multiplicity of the probability statement. The easiest of these factors to investigate is uncertainty about the identity of the most efficient firm. To do so we simply assume that firm 164, which is the firm with the largest α̂_i, is most efficient in the sense of having the largest α_i (equivalently, the smallest u_i). Under this assumption we construct the MCC intervals with firm 164 as the control.
Confidence intervals of 95%, 90%, and 75% required critical values of 3.42, 3.18, and 2.71, respectively. Results are in Table 5b. The MCC intervals are necessarily tighter than the MCB intervals, but not tight enough to be useful. In other words, the width of the MCB intervals is not significantly decreased by knowing which firm is best. We can conclude that the width is primarily due to either estimation error or multiplicity or both.
To disentangle the effect of multiplicity on the interval width, we would like to be able to construct marginal intervals for each firm. In the case where MCB reveals a single firm as efficient, this can be accomplished with a simple application of the Bonferroni inequality. This will be demonstrated later. In the present case, where there is no single firm revealed as most efficient, the construction of marginal intervals is less clear, because it is necessary to make a simultaneous statement about the firms to determine a subset of firms that may be efficient, and then to reduce this joint statement to a marginal statement about a single firm. However, we can get some idea of the effect of the multiplicity of the intervals just by reducing the number of confidence intervals created, which we can do by considering a subset of the firms. We therefore redid MCB for only the nine firms for which we have reported results in Table 5a. (However, the parameter estimates are still from the whole sample of 171 firms.) Confidence intervals of 95%, 90%, and 75% required critical values of 2.56, 2.38, and 1.96, respectively. Results are in Table 6a. As was the case in the MCC experiment, controlling for multiplicity did not result in a significant tightening of the intervals. For example, for the median firm, compare the new interval (0.3354, 0.9837) with N = 9 to the old interval (0.2899, 1.0000) with N = 171. We conclude that the multiplicity component of the interval width is small, leaving only estimation error to account for the large width of the intervals.
Further evidence on this point is obtained by considering the smallest possible subset of firms (N = 2) and assuming that it is known that one of them is the most efficient. Thus, as in our MCC calculations, we assert that firm 164 is most efficient and we simply construct a confidence interval for each firm i's efficiency relative to firm 164. This is a standard calculation based on the point estimate and its standard error, using critical values from the standard normal distribution. (Note that we have not imposed the equicorrelatedness assumption in this calculation, so our results will be slightly different from the results for MCC with N = 2, which would impose this assumption.) These are called per comparison intervals; they are given in Table 6b.
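The per comparison calculation described above can be sketched in a few lines. The function below is a hypothetical illustration, not the authors' code: it takes the estimated intercept gap between the asserted best firm and firm i, together with its standard error, and maps the usual normal-theory interval for the gap into an interval for relative efficiency (which is truncated at 1, since efficiency cannot exceed 100%).

```python
import math

# Two-sided standard normal critical values for the confidence levels
# used in the paper (95%, 90%, 75%).
Z_CRIT = {0.95: 1.959964, 0.90: 1.644854, 0.75: 1.150349}

def per_comparison_interval(d_hat, se, level=0.95):
    """Per comparison interval for efficiency relative to the asserted best firm.

    d_hat : estimated intercept difference (best firm minus firm i)
    se    : standard error of that difference
    """
    z = Z_CRIT[level]
    # The interval for the gap d is d_hat +/- z*se; relative efficiency
    # is exp(-d), so the bounds swap and pass through exp(-.).
    lo = math.exp(-(d_hat + z * se))
    hi = min(1.0, math.exp(-(d_hat - z * se)))  # truncate at full efficiency
    return lo, hi
```

With a hypothetical gap of 0.6 and standard error of 0.2, the 95% interval is roughly (0.37, 0.81), comparable in width to the intervals reported in Table 6b.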
The per comparison intervals are indeed narrower than the MCB and the MCC intervals, but they are still fairly wide. For example, for the median firm we still have a 95% confidence interval of (0.3788, 0.8102). This confirms our conclusion that, for this data set, the width of the confidence intervals is due primarily to estimation error. As noted above, estimation error is important for this data set because T is small and σ_v² is large relative to σ_u². There is simply too much noise to get a clear picture of the value of u_i. The BC method does significantly better because it makes strong distributional assumptions that allow a much better separation of v from u. For this data set there does not seem to be a substitute for these strong assumptions.

Texas Utilities--Kumbhakar (1994)
In this study we reanalyze data originally analyzed by Kumbhakar (1994). Kumbhakar estimated a cost function, whereas we will estimate the production function. The data set consists of observations on 10 major privately owned Texas electric utilities observed annually over 18 years, from 1966 to 1985, and includes information on annual labor, capital, and fuel (inputs) for electrical power generation (output). The relatively small number of firms precludes a cross-sectional analysis. However, with 18 periods of observation per firm we have T larger than N, the opposite of the case with the Erwidodo rice farm data.
The model was estimated by CGLS and MLE with results given in Table 7. Notice that now σ_v² is small relative to σ_u², so our estimates of technical efficiency should be more reliable than for the previous data set. It is instructive to point out that numerical accuracy became a problem in calculating TE_i using equation (4). The small value of σ produced extremely large arguments, which, when evaluated in the standard normal cdf Φ(·), produced technical efficiencies greater than 100%. This was due to rounding error in the software package we originally selected. Fortunately, another package was found that evaluated the normal cdf more accurately. Tables 8a and 8b give our results for all 10 firms.
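The numerical problem is worth illustrating. A standard remedy is to evaluate the normal cdf through the complementary error function and to form the ratio of cdfs in log space, so that the result can never round above 100%. The sketch below assumes a Battese-Coelli-style conditional-mean estimate with hypothetical parameter names mu_star and sigma_star; it is an illustration of the numerical technique, not the authors' code.

```python
import math

def log_norm_cdf(x):
    """log of the standard normal cdf, accurate for large |x|.

    For large positive x, Phi(x) rounds to 1.0 in double precision;
    computing log1p(-upper tail) via erfc avoids that rounding.
    """
    if x > 0:
        return math.log1p(-0.5 * math.erfc(x / math.sqrt(2.0)))
    return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))

def te_conditional_mean(mu_star, sigma_star):
    """E[exp(-u)|e] in the Battese-Coelli form, computed in log space.

    A naive ratio Phi(a)/Phi(b) can exceed 1 through rounding when both
    arguments are large (small sigma_star); the log-space ratio cannot.
    """
    a = mu_star / sigma_star - sigma_star
    b = mu_star / sigma_star
    log_te = (log_norm_cdf(a) - log_norm_cdf(b)
              - mu_star + 0.5 * sigma_star ** 2)
    return math.exp(log_te)
```

When sigma_star is tiny the two cdf arguments are both huge, the log-cdf terms are both essentially zero, and the estimate collapses cleanly to exp(-mu_star + sigma_star²/2), bounded by 1 as it must be.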
As expected, the efficiency estimates are much more precise than for the previous data set. For example, for firm 8 (one of the two median firms) and using MLE, efficiency is estimated as 0.8472 with a 95% confidence interval of (0.8264, 0.8683). These are useful results in the sense that the uncertainty about a given firm's efficiency level is small relative to the between-firm variation in efficiencies; we can have some faith in our rankings. The within estimator was also calculated (Table 7), and MCB intervals were constructed (Table 9a). The covariance matrix of the within intercept estimates again exhibited an almost-equicorrelated structure, so that MCB was applicable.
The MCB intervals successfully determined at the 95% confidence level that firm 5 was the most efficient firm in the sample and that all others were inefficient. Consequently, the MCC intervals coincided with the MCB intervals and are not reported separately. In fact, firm 5 was identified as most efficient at the 99.9% confidence level, so essentially we were certain that it was the best. The confidence intervals for the other firms are wider than the corresponding BC intervals, but still not nearly as wide as those for the Erwidodo rice farm data. For example, for firm 8 compare the MCB interval of (0.7809, 0.8603) to the BC interval of (0.8264, 0.8683).
It is interesting to note that there is very little overlap between the BC and the MCB intervals, with the MCB intervals being generally lower. Two opposing effects contribute to this difference. The distinction between absolute and relative efficiency when N is small should make the BC intervals lower, since BC constructs a confidence interval for a firm's absolute efficiency level while MCB constructs confidence intervals for efficiency relative to the best firm in the sample. However, this is apparently more than offset by the BC technique's more successful reduction of the effects of estimation error. As noted above, the BC technique can be viewed as a set of shrinkages of the within efficiency measures, leading to generally higher efficiency measures.
Marginal intervals were easily constructed for each firm using the Bonferroni inequality. Since we knew the probability with which firm 5 could be identified as most efficient, we simply constructed a joint probability statement combining this with a per comparison interval and selected the marginal confidence levels so that the Bonferroni inequality produced the desired joint confidence level. Here, since we were essentially certain that firm 5 was most efficient, the joint probability statement essentially reduced to a single per comparison probability statement. In the rice farm data the per comparison intervals were conditional on firm 164 being most efficient; we just assumed that this was the case. For the current data, however, we knew with almost certainty (99.9% certainty) that firm 5 was most efficient, so our marginal statement essentially coincides with the per comparison statement, just as the MCB intervals coincide with the MCC intervals.
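The Bonferroni arithmetic involved is simple enough to sketch. If firm 5 is identified as most efficient with probability at least 1 - gamma, and the per comparison interval holds with probability at least 1 - delta, then both statements hold jointly with probability at least 1 - (gamma + delta). The function below is a hypothetical illustration of how one would back out the achievable joint level:

```python
def bonferroni_joint_level(p_best, per_comparison_level):
    """Lower bound on the joint confidence level via the Bonferroni inequality.

    p_best               : probability that the asserted firm is most efficient
    per_comparison_level : confidence level of the per comparison interval
    """
    gamma = 1.0 - p_best
    delta = 1.0 - per_comparison_level
    # P(A and B) >= 1 - P(not A) - P(not B)
    return 1.0 - (gamma + delta)
```

For example, with p_best = 0.999 (as for firm 5) a per comparison level of about 95.1% already guarantees a joint level of 95%, which is why the marginal intervals here essentially coincide with the per comparison intervals.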
The marginal/per comparison intervals are contained in Table 9b. Again the actual standard errors were used, since we did not have to appeal to equicorrelatedness to get our critical values. As a general statement, the marginal (per comparison) intervals are comparable to the MCB intervals. Surprisingly, in many cases the marginal intervals are actually wider than the MCB intervals. This must reflect failure of the equicorrelatedness assumption underlying our MCB intervals, but it is also a reflection of the relative sizes of N and T in the data. To be more specific, consider the following expression for the variance of the difference between two within intercept estimates:

Var(α̂_i - α̂_j) = 2σ_v²/T + x̄_i'V(β̂)x̄_i - 2x̄_i'V(β̂)x̄_j + x̄_j'V(β̂)x̄_j.

When T is small and N is large (e.g., T = 6 as in the rice farm data), the term 2σ_v²/T is large relative to the other three terms, so any differences between x̄_i and x̄_j are unimportant. For MCB we assume equicorrelatedness, so these insignificant differences are ignored. When T is large, however, the term 2σ_v²/T is small and the aforementioned differences may become significant. If we nonetheless ignore them in MCB, then the standard error implicit in MCB may be smaller than the standard error for some of the marginal (per comparison) intervals. This is less of a problem when both N and T are large, because large N tends to shrink V(β̂), so any differences in the quadratic-form terms become less pronounced. In cases where equicorrelatedness of the α̂_i cannot be assumed, there are some conservative MCB approximations available. Matejcik (1992) suggests techniques for adaptive MCB intervals that are robust to a generalization of the correlation matrix and compares their performance using computer simulation.
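The role of T in this variance can be sketched numerically. Assuming the standard within-estimator decomposition Var(α̂_i - α̂_j) = 2σ_v²/T + (x̄_i - x̄_j)'V(β̂)(x̄_i - x̄_j), the snippet below (hypothetical inputs throughout) shows how the firm-specific quadratic-form term is swamped when T is small but not when T is large:

```python
def var_alpha_diff(sigma_v2, T, xbar_i, xbar_j, V_beta):
    """Variance of the difference between two within intercept estimates.

    sigma_v2 : variance of the noise term v
    T        : number of time periods per firm (balanced panel)
    xbar_i/j : lists of firm means of the regressors
    V_beta   : covariance matrix of the slope estimates, as nested lists
    """
    k = len(xbar_i)
    d = [xbar_i[m] - xbar_j[m] for m in range(k)]
    # quadratic form (xbar_i - xbar_j)' V(beta) (xbar_i - xbar_j)
    quad = sum(d[m] * V_beta[m][n] * d[n] for m in range(k) for n in range(k))
    return 2.0 * sigma_v2 / T + quad

# Hypothetical numbers: with T = 6 the 2*sigma_v^2/T term dominates,
# so pair-specific differences in the quadratic form barely matter;
# with T = 18 they are a much larger share of the total.
V = [[0.01, 0.0], [0.0, 0.01]]
v_small_T = var_alpha_diff(0.04, 6, [1.0, 0.5], [0.9, 0.4], V)
v_large_T = var_alpha_diff(0.04, 18, [1.0, 0.5], [0.9, 0.4], V)
```

This is why ignoring pair-specific differences (as equicorrelated MCB does) is harmless for the rice farm data but can understate the standard error for the utility data.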
These techniques are based on several MCC methods that are themselves robust, including: an MCC method based on Banerjee's inequality due to Tamhane (1977); a procedure using a moment-based approximation to the Behrens-Fisher problem due to Tamhane (1977); a method using all-pairwise procedures due to Dunnett (1980); and Matejcik's own technique based on a heteroscedastic selection procedure. An obvious line for further research is to examine the applicability of these techniques to stochastic frontier models.

Egyptian Tileries--Seale (1990)
We analyze data previously analyzed by Seale (1990); for a complete discussion of the data see that paper. Seale observed 25 Egyptian small-scale floor tile manufacturers over three-week periods for 66 weeks, for a total of 22 separate observation periods. The data set contains some missing data points, so the number of separate observation periods varies across firms, making this an unbalanced panel. The data were collected by the Non-Farm Employment Project in 1982-1983. The firms were located in Fayoum and Kalyubiya, Egypt. Inputs to the production of cement floor tiles are labor (labor-hours) and machines (machine-hours). Output is in square meters of tile.
The model was estimated by OLS, within, and CGLS. The third moment of the OLS residuals was positive, so MLE was not attempted. Estimation results are given in Table 10. It may be noted that σ_v² and σ_u² are of similar magnitude. For this reason, and because the number of firms is similar to the number of periods per firm (for most firms), this data set has characteristics that put it in the middle ground between the Erwidodo rice farm data set (N much larger than T, σ_v² larger than σ_u²) and the Kumbhakar utilities data set (T larger than N, σ_u² larger than σ_v²). We should expect confidence intervals wider than for the utilities but narrower than for the rice farms. Table 11 gives the BC confidence intervals based on the CGLS estimates, for all firms. As a general statement, these confidence intervals are considerably wider than for the utility data. They are perhaps a little narrower than the confidence intervals for the rice farm data, but this is not entirely clear because the general level of efficiency is lower than it was for the rice farm data.
We next consider the MCB intervals. Because the panel is unbalanced, different within intercept estimates are based on different numbers of observations, and we cannot expect the equicorrelated structure to hold. However, we can still proceed with MCB if the covariance matrix has a certain product structure; see Horrace and Schmidt (1994). This structure held approximately, and so we calculated the MCB intervals, which are given in Table 12. As was the case for the BC results, the confidence intervals are generally narrower than those for the rice farm data but wider than those for the utility data. MCC and per comparison intervals for the within estimates are contained in Tables 13 and 14, respectively. Once again, they are not very different from the MCB intervals.

Conclusions
In this paper we have shown how to construct confidence intervals for efficiency estimates from stochastic frontier models. We have done so under a variety of assumptions that correspond to those made to calculate the efficiency measures themselves. For example, given distributional assumptions for statistical noise and inefficiency, the Jondrow et al. or Battese-Coelli estimates are typically used, and confidence intervals for these estimates are straightforward. With panel data but without distributional assumptions, efficiency estimates are commonly based on the fixed-effects (within) intercepts, and confidence intervals follow from the statistical literature on multiple comparisons with the best.
In our analysis of three panel data sets, we found confidence intervals that were wider than we would have anticipated before this study began. The efficiency estimates are more precise (and the confidence intervals are narrower) when T is large and when σ_u² is large relative to σ_v², and they are less precise when T is small and when σ_u² is small relative to σ_v². However, frankly, in all cases that we considered the efficiency estimates were rather imprecise. We suspect that, in many empirical analyses using stochastic frontier models, differences across firms in efficiency levels are statistically insignificant, and much of what has been carefully explained by empirical analysts may be nothing more than sampling error. This is a fairly pessimistic conclusion, though it may turn out to be overly pessimistic when more empirical analysis is done. It is therefore important to stress that deterministic methods like DEA are not immune from this pessimism. Efficiency measures from DEA or other similar techniques are subject to the same sorts of uncertainty as are our estimates. The only difference is that we can clearly assess the uncertainty associated with our estimates while, at present, it is less clear how to assess the uncertainty associated with the DEA measures. In our opinion this should continue to be a high-priority item on the DEA research agenda.