Expected efficiency ranks from parametric stochastic frontier models

In the stochastic frontier model, we extend the multivariate probability statements of Horrace (J Econom, 126:335–354, 2005) to calculate the conditional probability that a firm is any particular efficiency rank in the sample. From this, we construct the conditional expected efficiency rank for each firm. Compared to the traditional ranked efficiency point estimates, firm-level conditional expected ranks are more informative about the degree of uncertainty of the ranking. The conditional expected ranks may be useful for empiricists. A Monte Carlo study and an empirical example are provided.

composed error, consisting of a two-sided error, representing noise, and a one-sided error, representing inefficiency. See, for example, Aigner et al. (1977); Coelli (1988, 1992), and Greene (2005). It is very often assumed that the two-sided error is normally distributed and the one-sided error is truncated normally or exponentially. If so, the distribution of inefficiency conditional on the composed error is truncated normally. Given these conditional inefficiency distributions (one for each firm), a common empirical question is how does one assess relative inefficiency in the sample? There are essentially two approaches. The first approach is to calculate the mean of each conditional inefficiency distribution, using the value of the regression residual for each firm in the conditioning argument. See Jondrow et al. (1982) for the cross-sectional case and Battese and Coelli (1988) for the panel data case. These conditional means (evaluated at the residual values) can be ordered across firms, and a sample-wide view of inefficiency is inferred from the order statistic. In particular, the firm with the smallest conditional mean may be deemed efficient relative to the rest in the sample. A second approach is to use the conditional inefficiency distributions to calculate the probability that each firm is best (has the lowest inefficiency), conditional on the (joint) composed errors. See Horrace (2005). These conditional efficiency probabilities can be evaluated at the values of the (joint) regression residuals to provide an alternative view of (in)efficiency in the sample, and, in particular, the firm with the largest efficiency probability may be deemed the most efficient. The first approach is a marginal approach in that each conditional mean is derived from a single conditional inefficiency distribution. The second approach is simultaneous in that each conditional efficiency probability is derived from all the conditional distributions, jointly. 
In this sense, the conditional probabilities contain information from the efficiency rank statistic that the conditional means do not provide. In the parlance of the multiple comparisons and ranking and selection literatures (e.g., Bechhofer 1954; Dunnett 1955; Gupta 1956; Gupta 1955), the conditional efficiency probabilities account for the "multiplicity" in the rank statistic (e.g., firm 1 is better than firm 2 and firm 3 and…).
This paper extends the conditional probability statements of Horrace (2005) to calculate not only the conditional probability that each firm is best (lowest inefficiency), but also the conditional probabilities that each firm is any efficiency rank (best, 2nd best, …, 2nd worst, worst) in the sample. The suite of conditional probabilities provides a complete picture of efficiency in the sample and is informative. To see this, let the sample consist of n firms and let the unconditional distribution of efficiency be the same for each firm (a common assumption). Then, the unconditional probability that any firm is a particular efficiency rank is simply 1/n, an uninteresting result. That is, the unconditional probability of any particular efficiency rank can be characterized by a discrete uniform distribution across firms. Once we condition on the sample data (on the regression residuals), the shape of this distribution across firms becomes less uniform (more informative). It is in this sense that the proposed conditional efficiency probabilities are empirically useful. In fact, our simulations show that when the variance of the one-sided error is small relative to that of the two-sided error (a noisy experiment), the conditional probabilities are close to the unconditional result, 1/n. As noise decreases, the probability weights of being a particular efficiency rank shift across firms, so the distribution becomes more informative.
Given the suite of conditional efficiency rank probabilities (a partition of the event space that firm $i$ is efficiency rank $r$), it is a simple matter to calculate the expected rank for each firm, conditional on the composed errors, evaluated at the residual values. These conditional expected ranks are also useful. Like the unconditional efficiency rank probabilities, the unconditional expected rank for each firm is constant across firms. For example, if $n = 5$ and if the unconditional distribution of inefficiency is (again) identical across firms, then the unconditional expected rank for each firm is $(1 + \cdots + 5)/5 = 3$, an uninteresting result. The conditional expected rank, however, varies across firms, and this variability informs our understanding of the efficiency rankings. Continuing the example, if the firm with the highest efficiency score has a conditional expected rank of 1.2 (1 being the best and 5 being the worst), we are much more confident that it is the best firm in the sample than if it has a conditional expected rank of 2.2, and the conditional expected rank of 1.2 is certainly more informative than its unconditional expected rank of 3. Not surprisingly, the informativeness of the conditional expected rank is increasing in the signal-to-noise ratio in our simulations. Continuing the example, the conditional expected rank of 1.2 for the firm with the highest efficiency score might be from a less noisy experiment than the 2.2 result. In a very noisy experiment, the same conditional expected rank might be close to 3, the unconditional result. Our simulations also reveal interesting relationships between the skew of the one-sided error and the distribution of the conditional expected ranks across firms.
This paper is organized as follows. The next section presents the parametric frontier model, the conditional efficiency rank probabilities, and the conditional expected rank measure. The model allows for unbalanced panels, a case which has not been treated extensively in previous work on efficiency probabilities. In Sect. 3, a Monte Carlo study demonstrates how the empirical distribution of conditional efficiency rank probabilities and the conditional expected ranks vary with (a) the amount of noise in the experiment and (b) the skew of the unconditional inefficiency distributions. Section 4 presents an empirical application to vessel efficiency in the US North Atlantic Herring fleet, and Sect. 5 concludes.

Conditional inefficiency rank probabilities for parametric frontiers
We consider the parametric stochastic frontier model for an unbalanced panel of firms:
$$y_{it} = \alpha + x_{it}\beta + v_{it} - u_i, \quad i = 1, \ldots, n, \quad t = 1, \ldots, T_i. \tag{1}$$
Here, the $y_{it}$ are the observed logarithms of output of the $i$th firm in the $t$th period, the $x_{it}$ are observed production inputs, the $u_i \ge 0$ are iid unobserved errors representing unobserved inefficiency, and the $v_{it}$ are iid unobserved errors that cause the efficiency frontier to be stochastic. We assume that the distribution of $v_{it}$ is $N(0, \sigma_v^2)$ and the distribution of $u_i$ is the truncation below zero of a $N(\mu, \sigma_u^2)$ random variate. Other distributions for $u_i$ have been considered (e.g., Greene 1990), but are beyond the scope of what follows. We also require that $x_{it}$, $u_i$, and $v_{it}$ be independent. Since the $y_{it}$ are in log points, firm-level technical efficiency is defined as $TE_i = \exp(-u_i)$. Maximum likelihood estimation of the model's parameters ($\alpha$, $\beta$, $\mu$, $\sigma_u^2$, and $\sigma_v^2$) is consistent (as $n \to \infty$ or as $T_i \to \infty$).
The model in (1) is fairly flexible. It can represent both Cobb-Douglas and trans-log specifications, and it can be recast as a cost, revenue, or profit function. Generalizations for time-varying $u_i$ are plentiful. For example, see Kumbhakar (1990), Cornwell et al. (1990), Battese and Coelli (1992), Lee and Schmidt (1993), Cuesta (2000), Han and Orea (2005), Lee (2006) and Ahn and Lee (2007). There are also models that incorporate an additive firm heterogeneity term that is separate from inefficiency. See, for example, Greene (2005), Chen et al. (2011), Kumbhakar et al. (2012), and Columbi et al. (forthcoming). Our empirical example in Sect. 4 involves a more flexible form than in (1), where the marginal products are allowed to vary across groups of firms; the model is estimated using the El-Gamal and Grether (1995, 2000) estimation classification algorithm. Regardless of the simplicity or complexity of the underlying production function specification, the proposed expected rank results that follow are applicable as long as the ex post conditional distribution of technical inefficiency is truncated normal.
Based on our assumptions in (1), Battese and Coelli (1988) show that the distribution of $u_i$ conditional on the composed error $\varepsilon_i = [\varepsilon_{i1} \ldots \varepsilon_{iT_i}]$, with $\varepsilon_{it} = v_{it} - u_i$, is the truncation below zero of a $N(\mu_{*i}, \sigma^2_{*i})$ random variate, where
$$\mu_{*i} = \frac{\mu\sigma_v^2 - T_i\bar{\varepsilon}_i\sigma_u^2}{\sigma_v^2 + T_i\sigma_u^2}, \qquad \sigma^2_{*i} = \frac{\sigma_u^2\sigma_v^2}{\sigma_v^2 + T_i\sigma_u^2}, \qquad \bar{\varepsilon}_i = \frac{1}{T_i}\sum_{t=1}^{T_i}\varepsilon_{it}.$$
That is, the conditional density function of $u_i$ is
$$f(u|\varepsilon_i) = \frac{(2\pi)^{-1/2}\sigma_{*i}^{-1}\exp\{-(u - \mu_{*i})^2/(2\sigma^2_{*i})\}}{1 - \Phi(-\mu_{*i}/\sigma_{*i})}, \quad u \ge 0.$$
Then the conditional distribution function is
$$F(u|\varepsilon_i) = \frac{\Phi((u - \mu_{*i})/\sigma_{*i}) - \Phi(-\mu_{*i}/\sigma_{*i})}{1 - \Phi(-\mu_{*i}/\sigma_{*i})}, \quad u \ge 0,$$
where $\Phi$ is the cumulative distribution function of a standard normal random variate. Then the conditional mean of $u_i$ is
$$E(u|\varepsilon_i) = \mu_{*i} + \sigma_{*i}\frac{\phi(-\mu_{*i}/\sigma_{*i})}{1 - \Phi(-\mu_{*i}/\sigma_{*i})},$$
with $\phi$ the density of a standard normal random variate.
In principle, the population efficiency ranking is in terms of $u_i$. That is, $u_{[1]} \le u_{[2]} \le \cdots \le u_{[n]}$, so that firm $[1]$ is most efficient in the population, and firm $[n]$ is least efficient. However, $u_i$ is unobserved and cannot be directly estimated, so what is often done is to calculate the vector of residuals $e_i = [e_{i1} \ldots e_{iT_i}]$ with $e_{it} = y_{it} - \hat{\alpha} - x_{it}\hat{\beta}$, and estimate inefficiency as $\hat{u}_i = E(u|\varepsilon_i = e_i)$, the conditional mean evaluated at $\varepsilon_i = e_i$ (with $\mu = \hat{\mu}$, $\sigma_u^2 = \hat{\sigma}_u^2$, and $\sigma_v^2 = \hat{\sigma}_v^2$) for each firm. Empirical exercises often include a rank ordering of the $\hat{u}_i$, which serves as a predictor of the ordered $u_i$ conditional on $\varepsilon_i$. If $T_i = T$, so that $\sigma^2_{*i} = \sigma^2_*$, then the firm rankings based on $\hat{u}_i$ will be identical to the rankings based on the firm-level mean residuals, $\bar{e}_i$. However, in the case of unbalanced panels, it is possible that these two rankings will not be identical, because $\sigma^2_{*i}$ causes $\hat{u}_i$ to no longer be monotonic in $\bar{e}_i$.
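The calculation just described can be sketched in a few lines of Python. This is only an illustration: it assumes the standard Battese and Coelli (1988) expressions for the pre-truncation mean and variance (written out in the docstring), and the function names, firm labels, and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal, provides pdf() and cdf()

def conditional_params(resid, mu, sigma2_u, sigma2_v):
    """Pre-truncation mean and variance of u_i given firm i's residuals.

    Assumed formulas (Battese and Coelli 1988), with T_i = len(resid):
      mu_star     = (mu*sigma2_v - T_i*ebar_i*sigma2_u) / (sigma2_v + T_i*sigma2_u)
      sigma2_star = sigma2_u*sigma2_v / (sigma2_v + T_i*sigma2_u)
    """
    t_i = len(resid)
    ebar = sum(resid) / t_i
    denom = sigma2_v + t_i * sigma2_u
    mu_star = (mu * sigma2_v - t_i * ebar * sigma2_u) / denom
    sigma2_star = sigma2_u * sigma2_v / denom
    return mu_star, sigma2_star

def conditional_mean(mu_star, sigma2_star):
    """Mean of N(mu_star, sigma2_star) truncated below at zero."""
    s = math.sqrt(sigma2_star)
    z = mu_star / s
    return mu_star + s * STD.pdf(z) / STD.cdf(z)

# Hypothetical unbalanced panel: identical mean residual, different T_i.
firms = {"A": [-0.30], "B": [-0.30, -0.30, -0.30, -0.30]}
for name, resid in firms.items():
    m, s2 = conditional_params(resid, mu=0.0, sigma2_u=0.25, sigma2_v=0.25)
    print(name, round(m, 4), round(s2, 4), round(conditional_mean(m, s2), 4))
```

The two hypothetical firms share the same mean residual but have different panel lengths, so their conditional distributions (and hence their predicted inefficiencies) differ, which is the source of the non-monotonicity in unbalanced panels noted above.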
Based on the distribution of $u_i$ conditional on $\varepsilon_i$, Horrace (2005) calculates the conditional probabilities that firm $i$ is the most efficient in the sample, $\Pr(i = [1]|\varepsilon_1, \ldots, \varepsilon_n)$, and least efficient in the sample, $\Pr(i = [n]|\varepsilon_1, \ldots, \varepsilon_n)$. The conditional efficiency probabilities are predicted by evaluating them at $\varepsilon_i = e_i$, $i = 1, \ldots, n$. We generalize those results to calculate the conditional probability that firm $i$ is any efficiency rank, $r$, in the sample, $\Pr(i = [r]|\varepsilon_1, \ldots, \varepsilon_n)$. This conditional rank probability is the sum of the probabilities of all possible events where $u_i$ is rank $r$ (and there may be many); however, it is not necessary to calculate all possible event permutations to determine it. Instead we can start from events in which a particular set of firms are more efficient than $i$ and the remaining firms are less efficient, without regard for the rankings of the firms within each set. Consider one such event with firm $i$ at rank $r$, and define the subsets $N_i^{-}(r)$, containing the $r - 1$ firms that are more efficient than $i$, and $N_i^{+}(r)$, containing the $n - r$ firms that are less efficient than $i$. For any rank, except $r = 1$ or $r = n$, there are multiple combinations of ranked firms above and below $i$, which yield the same rank. In fact, there are ${}_{n-1}C_{r-1} = \frac{(n-1)!}{(n-r)!(r-1)!}$ combinations that have firm $i$ at rank $r$ out of $n$ firms. Accordingly, we index the sets of firms above and below $i$ for each different combination that produces the same rank as $N_i^{l-}(r)$ and $N_i^{l+}(r)$, $l = 1, \ldots, {}_{n-1}C_{r-1}$. Then the conditional efficiency rank probability for rank $r$ out of $n$ is
$$\Pr(i = [r]|\varepsilon_1, \ldots, \varepsilon_n) = \sum_{l=1}^{{}_{n-1}C_{r-1}} \int_0^\infty f(u|\varepsilon_i) \prod_{j \in N_i^{l-}(r)} F(u|\varepsilon_j) \prod_{k \in N_i^{l+}(r)} [1 - F(u|\varepsilon_k)] \, du, \tag{2}$$
$i = 1, \ldots, n$, $r = 1, \ldots, n$. When $r = 1$ or $r = n$, these reduce to the conditional efficiency probabilities of Horrace (2005). The $n^2$ probabilities in (2) can be predicted by evaluating them at $\varepsilon_i = e_i$, $i = 1, \ldots, n$ (with $\mu = \hat{\mu}$, $\sigma_u^2 = \hat{\sigma}_u^2$, and $\sigma_v^2 = \hat{\sigma}_v^2$). It is not difficult to generate computer algorithms for efficient calculation of these probabilities. When $n$ is large, numerical calculation of the probabilities may be difficult, but simulating the probabilities by resampling from the conditional inefficiency distributions is straightforward. We employ both numerical and simulated probabilities in our empirical example.
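To make the two routes concrete, the following Python sketch computes the rank probabilities both by numerical integration over the combinations of firms above and below firm $i$ and by resampling from the conditional distributions. It is a minimal illustration under stated assumptions, not the authors' algorithm: each conditional distribution is taken to be a normal with pre-truncation mean `mu_star[i]` and standard deviation `sigma_star[i]`, truncated below at zero and independent across firms; the midpoint integration rule, the grid size, the inverse-CDF sampler, and all names and numbers are hypothetical choices.

```python
import itertools
import random
from statistics import NormalDist

STD = NormalDist()

def trunc_pdf(u, m, s):
    """Density at u >= 0 of N(m, s^2) truncated below at zero."""
    return STD.pdf((u - m) / s) / (s * STD.cdf(m / s))

def trunc_cdf(u, m, s):
    """Distribution function at u >= 0 of N(m, s^2) truncated below at zero."""
    lo = STD.cdf(-m / s)
    return (STD.cdf((u - m) / s) - lo) / (1.0 - lo)

def rank_probabilities(mu_star, sigma_star, grid=2000):
    """probs[i][r-1] = Pr(firm i has rank r), by midpoint-rule integration."""
    n = len(mu_star)
    upper = max(max(m, 0.0) + 8.0 * s for m, s in zip(mu_star, sigma_star))
    h = upper / grid
    probs = [[0.0] * n for _ in range(n)]
    for g in range(grid):
        u = (g + 0.5) * h
        F = [trunc_cdf(u, m, s) for m, s in zip(mu_star, sigma_star)]
        for i in range(n):
            weight = trunc_pdf(u, mu_star[i], sigma_star[i]) * h
            others = [j for j in range(n) if j != i]
            for r in range(1, n + 1):
                # Sum over every set of r-1 firms that are more efficient than i.
                for better in itertools.combinations(others, r - 1):
                    term = weight
                    for j in others:
                        term *= F[j] if j in better else 1.0 - F[j]
                    probs[i][r - 1] += term
    return probs

def draw_truncnorm(rng, m, s):
    """Inverse-CDF draw from N(m, s^2) truncated below at zero."""
    lo = STD.cdf(-m / s)
    p = min(max(lo + rng.random() * (1.0 - lo), 1e-12), 1.0 - 1e-12)
    return max(m + s * STD.inv_cdf(p), 0.0)

def simulated_rank_probabilities(mu_star, sigma_star, draws=20000, seed=0):
    """Same probabilities, approximated by resampling and ranking the draws."""
    rng = random.Random(seed)
    n = len(mu_star)
    counts = [[0] * n for _ in range(n)]
    for _ in range(draws):
        u = [draw_truncnorm(rng, m, s) for m, s in zip(mu_star, sigma_star)]
        for rank, i in enumerate(sorted(range(n), key=u.__getitem__)):
            counts[i][rank] += 1
    return [[c / draws for c in row] for row in counts]

# Three hypothetical firms with different conditional distributions.
mu_star, sigma_star = [0.1, 0.4, 0.8], [0.2, 0.2, 0.2]
for row in rank_probabilities(mu_star, sigma_star):
    print([round(p, 3) for p in row])
```

The direct sum over combinations grows quickly with $n$, which is why the simulated version is the practical choice for larger samples.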
Substituting the unconditional density function, $f(u)$, and distribution function, $F(u)$, for the conditional density function and distributions (respectively) in (2), it is clear that
$$\Pr(i = [r]) = \frac{1}{n}, \quad i = 1, \ldots, n, \quad r = 1, \ldots, n.$$
This argument hinges on the unconditional draws of $u$ (over $i$) being identically distributed. Obviously, if the unconditional distribution of the $u_i$ varies over $i$ (e.g., Battese and Coelli 1995), then the unconditional $\Pr(i = [r])$ would not equal $1/n$ in general, and would be a function of the parameters of the underlying unconditional distributions. Also, if $\sigma_v^2$ is large relative to $\sigma_u^2$, then realizations of $\varepsilon_i$ contain relatively little information about $u_i$, so that the conditional distribution of $u_i$ is close to its unconditional distribution. Therefore, when $\sigma_v^2$ is large relative to $\sigma_u^2$, the probabilities in (2) will be close to $1/n$.
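For completeness, the $1/n$ result can be verified in a few steps. The derivation below is a sketch that assumes only what is stated above: the $u_i$ are iid with continuous density $f$ and distribution function $F$, so every one of the ${}_{n-1}C_{r-1}$ combinations contributes the same integral, and the substitution $t = F(u)$ turns that integral into a beta function.

```latex
\begin{align*}
\Pr(i = [r])
  &= \binom{n-1}{r-1} \int_0^\infty f(u)\, F(u)^{r-1}\, [1 - F(u)]^{n-r}\, du \\
  &= \binom{n-1}{r-1} \int_0^1 t^{r-1} (1 - t)^{n-r}\, dt
     \qquad (t = F(u),\; dt = f(u)\, du) \\
  &= \frac{(n-1)!}{(r-1)!\,(n-r)!} \cdot \frac{(r-1)!\,(n-r)!}{n!}
   = \frac{1}{n}.
\end{align*}
```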
Of course, reporting the $n^2$ probabilities in (2) in an empirical exercise may be impractical. However, much of the pertinent information contained in the $r = 1, \ldots, n$ conditional rank probabilities for a firm can be summarized with its conditional expected rank statistic,
$$\rho_i = \sum_{r=1}^{n} r \Pr(i = [r]|\varepsilon_1, \ldots, \varepsilon_n), \tag{3}$$
$i = 1, \ldots, n$. This measure is an alternative way to characterize efficiency ranks that accounts for multiplicity in the rank statistic through the probabilities in (2). Again, it can be predicted by evaluating $\varepsilon_i$ at the values of $e_i$ for every firm. It also responds to the relative magnitudes of the signal ($\sigma_u^2$) and noise ($\sigma_v^2$) in the same way as the probabilities in (2). In a particularly noisy setting, the conditional rank probabilities in (2) are approximately equal to $1/n$, and the conditional expected rank will be approximately equal across firms. In this sense, (2) and (3) provide information on one source of uncertainty in the efficiency ranks that the conditional means, $E(u|\varepsilon_i)$, do not. Of course, the conditional mean, the conditional rank probabilities, and the conditional expected rank are all different measures, so comparisons of their abilities to serve as substitutes should not be overstated.
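Once a matrix of rank probabilities is in hand, the conditional expected rank is a probability-weighted sum of the ranks. The short Python sketch below illustrates this; the function name and the probability rows are hypothetical and are not taken from the paper's tables.

```python
def expected_ranks(probs):
    """rho_i = sum over r of r * Pr(i = [r]); probs[i][r-1] holds Pr(i = [r])."""
    return [sum(r * p for r, p in enumerate(row, start=1)) for row in probs]

# Uninformative (very noisy) case: every rank is equally likely for every firm.
n = 5
uniform = [[1.0 / n] * n for _ in range(n)]
print(expected_ranks(uniform))

# Hypothetical rank probabilities for a firm that is probably the best.
likely_best = [0.85, 0.10, 0.05, 0.0, 0.0]
print(expected_ranks([likely_best]))
```

With uniform probabilities each firm's expected rank is $(n + 1)/2$, the unconditional value, while a distribution concentrated on rank 1 gives an expected rank close to 1.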
All of the different characterizations of inefficiency (and their relative rankings) are evaluated at $\varepsilon_i = e_i$. Therefore, they all ignore estimation error, which (of course) is asymptotically negligible. Nonetheless, it may be important in finite samples. For the conditional means, there are existing procedures to address the issue. Simar and Wilson (2009) and Wheat et al. (forthcoming) recommend resampling techniques to incorporate estimation error into confidence intervals on technical inefficiency. Resampling techniques could certainly be employed to assess the effects of estimation error on the conditional expected ranks. The procedure to do so would be straightforward, and it would yield confidence sets for the vectors of conditional expected ranks. However, this is not the focus of the evaluation presented below.

Monte Carlo study
We use a series of simulations to demonstrate properties of the conditional expected rank statistic. For simplicity, we always set $T_i = 1$. As equation 2 shows, the conditional rank probabilities for each firm depend on the conditional distributions, $f(u|\varepsilon_i)$, which themselves depend on three parameters: $\mu$, $\sigma_u^2$, and $\sigma_v^2$. First, we follow standard simulation practice for stochastic frontier models (e.g., Olson et al. 1980), and explore how statistical noise, $\sigma_v^2$, affects the empirical distribution of the conditional rank probabilities in (2) and the conditional expected ranks in (3). To this end, we vary the noise over four levels, $\sigma_v^2 = \{0.01, 0.1, 1, 10\}$. The point is that increasing noise should degrade the efficiency rank probabilities' ability to accurately detect the true rank of any firm, so that the conditional expected ranks are increasingly uninformative. Second, Feng and Horrace (2012) show that the skew of the inefficiency distribution can also confound our detection of firm ranks at different ends of the order statistic in different ways. If the inefficiency distribution is "mostly stars" having many firms in the left tail ($u_i \cong 0$ with high probability), then it is difficult to differentiate the individual ranks of these highly efficient firms (low $[r]$ firms). Conversely, if the inefficiency distribution is "mostly dogs" having fewer firms in the left tail ($u_i \cong 0$ with low probability), then it is easier to differentiate the individual ranks of these highly efficient firms. The amount of relative mass in one tail of a distribution affects the skew of the distribution. Therefore, our second interest is in seeing the effects of distributional skew (for a fixed variance) on the conditional rank probabilities and the conditional expected ranks. To do this, we select values of $\mu$ and $\sigma_u^2$ that hold the variance constant at $V(u) = 0.36$ (the variance of a standard normal random variable truncated at zero) but produce skews of 0.5, 1.0, and 1.5, respectively. These values are listed in Table 1.
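The variance and skew targets in this design can be checked from the moments of a truncated normal. The Python sketch below is an illustration under the assumption that $u$ is a $N(\mu, \sigma_u^2)$ variate truncated below at zero; the function name is hypothetical, and the moment formulas are the standard ones for a standard normal conditioned on exceeding a cutoff. It is evaluated at the half-normal case and at the low-skew pair $\mu = 0.89$, $\sigma_u^2 = 0.52$ quoted in the discussion of Fig. 1.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def truncnorm_var_skew(mu, sigma2):
    """Variance and skewness of N(mu, sigma2) truncated below at zero."""
    sigma = math.sqrt(sigma2)
    alpha = -mu / sigma                            # standardized truncation point
    lam = STD.pdf(alpha) / (1.0 - STD.cdf(alpha))  # inverse Mills ratio
    # Raw moments of a standard normal Z conditional on Z > alpha.
    m1 = lam
    m2 = 1.0 + alpha * lam
    m3 = (alpha ** 2 + 2.0) * lam
    var_z = m2 - m1 ** 2
    third_central = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    # Skewness is invariant to the location-scale change from Z to u.
    return sigma2 * var_z, third_central / var_z ** 1.5

print(truncnorm_var_skew(0.0, 1.0))    # standard normal truncated at zero
print(truncnorm_var_skew(0.89, 0.52))  # low-skew design point
```

Under these assumptions the half-normal variance equals $1 - 2/\pi \approx 0.36$, and the low-skew pair gives a variance near 0.36 with skewness near 0.5, in line with the design described above.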
The different combinations of parameters ($\mu$, $\sigma_u^2$, $\sigma_v^2$) yield a total of 12 separate exercises (four exercises for each of three skew levels). In each exercise, we use a total of 5,000 replications. We use a modest number of firms, $n = 5$, to reduce the computational burden in (2) and simplify exposition. We ignore the frontier specification and simulate the model $\varepsilon_i = v_i - u_i$, so we are implicitly assuming that the production function is known. Our interest is not to understand how well the stochastic frontier model in (1) can be estimated, for this is widely known (e.g., Olson et al. 1980). It is simply to demonstrate the empirical utility of the proposed conditional rank probabilities and the conditional expected rank statistic, and to examine their responses to changes in noise and skew.
Results are shown in Figs. 1, 2, and 3 for skew equal to 0.5 (low), 1.0 (medium), and 1.5 (high), respectively. We couch our discussion on the effects of changes in $\sigma_v^2$ in terms of Fig. 1 (low-skew, Skew$(u) = 0.5$), but it could equally apply to Figs. 2 and 3. To achieve Skew$(u) = 0.5$ while holding $V(u) = 0.36$, Table 1 shows that we select $\mu = 0.89$, $\sigma_u^2 = 0.52$ for the results in Fig. 1. … is small), as the skew of a truncated normal is always positive. However, the low skew (0.50) of the Fig. 1 simulations means that the distribution is relatively symmetric, so we shall see that differences in the ability of the conditional rank probabilities to accurately detect high- and low-ranked firms will become even more stark as we increase the skew (and increase uncertainty over which firms have lower true $[r]$). See Figs. 4, 5, and 6 for a typical empirical inefficiency distribution for each of our three levels of skew: 0.5, 1.0, and 1.5, respectively. Each figure is a kernel density plot using a Gaussian kernel, bandwidth chosen by Silverman's rule of thumb (Silverman 1986), and no boundary-bias correction.
Continuing with the low-skew results of Fig. 1, as we increase $\sigma_v^2$ over $\{0.01, 0.1, 1, 10\}$, the empirical distribution of the efficiency probabilities becomes more uniform (and less informative). However, we also see in the four panels of Fig. 1 that it is always the case that $\Pr(1 = [1]|\varepsilon_1, \ldots, \varepsilon_n) < \Pr(5 = [5]|\varepsilon_1, \ldots, \varepsilon_n)$, even in the noisiest ($\sigma_v^2 = 10$) panel. Both of these empirical phenomena remain as we increase the skew (asymmetry) of the distribution of $u$ to 1.0 and to 1.5 in Figs. 2 and 3, respectively (while holding $V(u)$ constant). Looking across the figures, we see the effect. Consider the lowest noise panel (upper-left panel) in Figs. 1, 2, and 3. As the skew increases across Figs. 1, 2, and 3, $\Pr(1 = [1]|\varepsilon_1, \ldots, \varepsilon_n)$ is decreasing (0.820, 0.754, and 0.699, respectively), while $\Pr(5 = [5]|\varepsilon_1, \ldots, \varepsilon_n)$ is slightly increasing (0.867, 0.873, and 0.879, respectively). In the words of Almanidis et al. (2014), when the conditional distribution of $u$ has "fewer stars" (low skew of Fig. 4), it is easier to detect stars, $\Pr(1 = [1]|\varepsilon_1, \ldots, \varepsilon_n) = 0.820$, than when there are "mostly stars" (high skew of Fig. 6) … that the conditional means, $E(u|\varepsilon_i)$, would not uncover. These are also manifest in the conditional expected ranks, which we now consider.
Once the conditional rank probabilities are calculated for each firm at each rank, calculation of the conditional expected ranks of equation 3 is straightforward. The distributions of conditional expected rank (for each simulation run in Figs. 1, 2, and 3) are contained in Tables 2, 3, and 4 (respectively). The utility of the conditional expected ranks is immediately obvious. First, the extent to which noise affects $\Pr(i = [r]|\varepsilon_1, \ldots, \varepsilon_n)$ is clear. Consider the first panel ($\sigma_v^2 = 0.01$) of Table 2.
The difference between the true rank of firm 1 (first column) and the average conditional expected rank (second column) is relatively small ($1 - 1.21 = -0.21$), but this difference is increasing in magnitude as we read down the panels and the level of noise increases: $1 - 1.72 = -0.72$, $1 - 2.66 = -1.66$, and $1 - 2.96 = -1.96$. These qualitative results are true for all firms (true rank) and for all levels of skew (Tables 2, 3 and 4). Obviously, as noise increases, the conditional expected ranks are moving toward the unconditional expected rank, 3 (bottom panel in Table 2), which reflects the nearly uniform distribution of the conditional efficiency probabilities (bottom panel of Fig. 1). The response of the quantiles of the expected ranks (columns with the heading "Quantiles of $\rho$") to increased noise is clear in Tables 2, 3, and 4: noise tends to push extreme quantiles (and their surrounding probability mass) to the center of the empirical distribution of the conditional expected ranks. Also, the effect of skew is clear across the tables. Consider firms 1 and 5 in the first (low noise) panels of Tables 2, 3, and 4. For firm 1, the difference between its true rank $[1]$ and the average conditional expected rank is increasing in magnitude ($-0.21$, $-0.30$, $-0.39$) as skew increases (0.5, 1.0, and 1.5) across Tables 2, 3, and 4, respectively, while the same differences for firm 5 are non-increasing in magnitude across the tables (0.16, 0.15, 0.15). Again, this reflects the fact that as skew increases (and there are relatively more stars in the inefficiency distribution) it is harder to detect "stars" in the left tail of the inefficiency distribution than to detect "dogs" in the right tail. The quantiles in Tables 2, 3, and … Finally, the conditional expected ranks are a convenient normalization of relative efficiency. Notice that the normalization is pegged to both ends of the true order statistic (1 and 5), such that $\rho_i \in [1, n]$.
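Compare this to the traditional predictor of technical efficiency,
$$TE_i = E[\exp(-u)|\varepsilon_i] = \frac{\Phi(\mu_{*i}/\sigma_{*i} - \sigma_{*i})}{\Phi(\mu_{*i}/\sigma_{*i})}\exp\left(-\mu_{*i} + \tfrac{1}{2}\sigma^2_{*i}\right), \tag{4}$$
evaluated at $\varepsilon_i = e_i$, where $\mu_{*i}$ and $\sigma^2_{*i}$ are the mean and variance of the conditional distribution of $u_i$ before truncation. (See Jondrow et al. 1982.) This absolute predictor normalizes efficiency predictions to the unit interval, $TE_i \in (0, 1)$. Therefore, linear renormalizations of expected rank, like $1 - (1 - \rho_i)/n$, can be thought of as alternatives to the $TE_i$ normalization. However, the former is measured on a relative (within sample) scale, while the latter is measured on an absolute (out of sample) scale.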
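The Monte Carlo design of this section can be sketched compactly in Python. The code below is a scaled-down illustration rather than a replication: it assumes $T_i = 1$, known parameters, the model $\varepsilon_i = v_i - u_i$, and the standard expressions for the pre-truncation conditional mean and variance; it uses simulated (not numerically integrated) rank probabilities and far fewer replications than the 5,000 used in the text. All function names, the seed, and the replication counts are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()

def draw_truncnorm(rng, m, s):
    """Inverse-CDF draw from N(m, s^2) truncated below at zero."""
    lo = STD.cdf(-m / s)
    p = min(max(lo + rng.random() * (1.0 - lo), 1e-12), 1.0 - 1e-12)
    return max(m + s * STD.inv_cdf(p), 0.0)

def simulated_expected_ranks(rng, mu_star, s_star, draws):
    """Average rank of each firm over draws from the conditional distributions."""
    n = len(mu_star)
    totals = [0.0] * n
    for _ in range(draws):
        u = [draw_truncnorm(rng, m, s_star) for m in mu_star]
        for rank, i in enumerate(sorted(range(n), key=u.__getitem__), start=1):
            totals[i] += rank
    return [t / draws for t in totals]

def monte_carlo(mu, sigma2_u, sigma2_v, n=5, reps=100, draws=200, seed=1):
    """Mean conditional expected rank, indexed by true rank (0 = true best)."""
    rng = random.Random(seed)
    s_u, s_v = math.sqrt(sigma2_u), math.sqrt(sigma2_v)
    denom = sigma2_v + sigma2_u                   # T_i = 1
    s_star = math.sqrt(sigma2_u * sigma2_v / denom)
    avg = [0.0] * n
    for _ in range(reps):
        # Sort true inefficiencies so that position k has true rank k + 1.
        u = sorted(draw_truncnorm(rng, mu, s_u) for _ in range(n))
        eps = [rng.gauss(0.0, s_v) - ui for ui in u]
        mu_star = [(mu * sigma2_v - e * sigma2_u) / denom for e in eps]
        rho = simulated_expected_ranks(rng, mu_star, s_star, draws)
        for k in range(n):
            avg[k] += rho[k] / reps
    return avg

# Low-skew design point (mu = 0.89, sigma2_u = 0.52) at the four noise levels.
for sigma2_v in (0.01, 0.1, 1.0, 10.0):
    print(sigma2_v, [round(r, 2) for r in monte_carlo(0.89, 0.52, sigma2_v)])
```

As noise increases, the averages for all true ranks should drift toward the unconditional value of 3, which is the qualitative pattern reported in the tables; the exact numbers will differ from the paper's because of the smaller simulation size and the simulated probabilities.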

Empirical example
To illustrate our results on expected ranks, we revisit the empirical exercise in Flores-Lagunes et al. (2007), who estimate a stochastic production frontier for an unbalanced panel of $n = 39$ vessels from the US North Atlantic Herring fleet (2000–2003). They specify a heterogeneous production function and use the El-Gamal and Grether estimation classification algorithm (El-Gamal and Grether 1995, 2000) to classify the fleet into three production tiers. See Flores-Lagunes et al. (2007) for a complete discussion of the data, the production function and the estimation algorithm. Suffice it to say that vessel output is total catch (tons) and inputs are things like vessel size (tons), hours at sea, and crew size. The estimation yields $\mu_{*i}$ and $\sigma^2_{*i}$ for each vessel. That is, each vessel's conditional inefficiency distribution is a $N(\mu_{*i}, \sigma^2_{*i})$ truncated at zero. The North Atlantic Herring fleet consists of two technologies: trawlers and "purse seiners". While in motion, trawling vessels drag large nets to take catch. A purse seine is a large net that is dropped toward the ocean floor, while the vessel is at rest. The gear encircles catch as it is hauled back up to the boat. Vessels use only one of these technologies (there are costs to refitting vessels with the different gear types). The El-Gamal and Grether estimation classification algorithm stratifies the fleet into three production tiers, where each tier has separate marginal product estimates (estimates of $\alpha$ and $\beta$ in Eq. 1). The first and second tiers consist exclusively of trawlers and the third tier consists of a mix of trawlers and purse seiners. Efficiency is characterized within (and not across) each production tier. Flores-Lagunes et al. (2007) also calculate 95 % subsets of the best and worst vessels in each tier. That is, they calculate subsets of vessel indices which contain the index of the most and least efficient vessels in the population with probability at least 95 %.
We use their subsets of the best in each tier in what follows.
The estimates of $\mu_{*i}$ and $\sigma^2_{*i}$ for all vessels in each production tier (tier 1, tier 2, and tier 3) are reproduced in the second and third columns of Tables 5, 6, and 7, respectively. The first column contains the unique vessel numbers from the Flores-Lagunes et al. (2007) analysis. The fourth column contains traditional technical efficiency predictors $TE_i = E[\exp(-u)|\varepsilon_i]$ from (4) evaluated at $\varepsilon_i = e_i$, and the results in Tables 5, 6, and 7 are ranked on this value. The fifth column contains the conditional expected ranks, $\rho_i$, in (3) for the five vessels with the largest $TE$ in each tier. The choice of five vessels is purely for convenience, as numerical calculation of the rank probabilities becomes difficult with more vessels. The sixth column also contains conditional expected ranks, but for only those vessels that were in the subset of the best vessels with 95 % probability. For example, in tier 1 (Table 5), the 95 % subset of the best is {1, 2, 5, 10, 11, 14, 27, 39, 37}, and we calculate conditional expected ranks for these vessels only. For the expected ranks in column 6, we used simulated conditional rank probabilities, because the subsets were often larger than 5, making numerical calculation difficult. Finally, the last column in Tables 5, 6 and 7 contains the conditional expected ranks for all vessels in each tier, also based on simulated rank probabilities. The results are compelling. Starting with Table 5, we see that for the top five vessels (column 5) the conditional expected ranks only range in value from 2.272 (vessel 2) to 3.479 (vessel 14), indicating a fairly noisy analysis. Had this been a particularly precise empirical exercise, the lower bound of the range would be closer to 1 and the upper bound closer to 5.
This is echoed in the subset of the nine best vessels (column 6) where the conditional expected ranks only range in value from 3.181 (vessel 2) to 6.950 (vessel 1), and in the full complement of thirteen vessels in the tier (column 7) where it ranges from 3.360 to 12.989. As we increase the number of vessels included in the conditional expected ranks across columns 5 through 7, we see that the ability of the conditional expected ranks to cover the population range of vessel ranks is diminished. For example, the expected rank for the top-ranked vessel, number 2, is increasing from 2.272 to 3.181 to 3.360. This is presumably because of the multiplicity in the conditional efficiency rank probability calculation in Eq. 2. Also, the vessel rankings based on $TE_i$ always match those based on $\rho_i$ for the top five vessels in column 5 of Table 5. This is almost the case in columns 6 and 7, except for vessels 14 and 27, whose expected ranks are reversed compared to the $TE_i$ ranks. However, these are small (perhaps negligible) differences which could be due to sampling variability in the simulated rank probabilities. Table 6 tells a slightly different story. The range of the conditional expected ranks is tighter than in Table 5 and only ranges from 2.535 (vessel 21) to 3.533 (vessel 7) for the top 5 vessels (column 5), so the analysis is noisier. However, the ranks based on $TE_i$ differ from those based on $\rho_i$ for the top five. In particular, the ranks of vessels 13 and 12 are reversed, and it is clear why this is the case: the truncated normal distributions (upon which they are based) are vastly different in shape even though the means are approximately the same. That is, $E[\exp(-u)|\varepsilon_{12} = e_{12}] = 0.863 \cong E[\exp(-u)|\varepsilon_{13} = e_{13}] = 0.864$; however, the means and variances of the distributions before truncation are extremely different. Compare $\mu_{*13} = 0.135$ to $\mu_{*12} = -0.518$ and $\sigma^2_{*13} = 0.009$ to $\sigma^2_{*12} = 0.125$.
Vessel 12 has more mass near zero in the distribution $f(u|\varepsilon_i)$. This difference in the ranks between $TE_i$ and $\rho_i$ for vessels 12 and 13 persists as we move to the subset of the eight best vessels (column 6) and all vessels (column 7). Table 7 tells an even more nuanced story. The range of the expected ranks for the top 5 vessels (column 5) is wider: from 1.395 (vessel 3) to 4.151 (vessel 34), so this is the most precise rank statistic across the same columns in Tables 5, 6 and 7, yet there is still some switching in the ranks based on $\rho_i$ for these five vessels. In particular, the ranks of vessels 33 and 16 and of 30 and 34 are switched in column 5. Notice that the differences in $TE_i$ for these vessel pairs are not that large, so it is really not surprising that the additional information provided by the conditional rank probabilities might switch the ranking. They are also switched in column 6 for vessels 33 and 16. Notice that in column 6, vessel 30 did not make it into the subset of the best vessels, while vessel 34 did. This is because the variance of inefficiency (before truncation) is much smaller for vessel 30 ($\sigma^2_{*30} = 0.001$) than it is for vessel 34 ($\sigma^2_{*34} = 0.023$). That is, we cannot reject the hypothesis that 34 is best, while we can for vessel 30. (A similar phenomenon occurs for vessels 4 and 1 in column 6 of Table 5.) All of this underscores the importance of taking into account multiplicity and noise in any ranking exercise.
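The comparison of vessels 12 and 13 can be retraced from the reported pre-truncation means and variances. The Python sketch below is illustrative only: it assumes the standard closed form for $E[\exp(-u)]$ under a normal truncated below at zero, the function names are hypothetical, and the cutoff of 0.05 used to measure "mass near zero" is an arbitrary choice made for this example.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def technical_efficiency(mu_star, sigma2_star):
    """E[exp(-u)] for u ~ N(mu_star, sigma2_star) truncated below at zero."""
    s = math.sqrt(sigma2_star)
    z = mu_star / s
    return math.exp(-mu_star + 0.5 * sigma2_star) * STD.cdf(z - s) / STD.cdf(z)

def mass_below(c, mu_star, sigma2_star):
    """Pr(u <= c) under the same truncated normal distribution."""
    s = math.sqrt(sigma2_star)
    lo = STD.cdf(-mu_star / s)
    return (STD.cdf((c - mu_star) / s) - lo) / (1.0 - lo)

# (mu_star, sigma2_star) as reported in the text for vessels 12 and 13.
vessels = {12: (-0.518, 0.125), 13: (0.135, 0.009)}
for v, (m, s2) in vessels.items():
    te = technical_efficiency(m, s2)
    near_zero = mass_below(0.05, m, s2)  # 0.05 is an arbitrary cutoff
    print(v, round(te, 3), round(near_zero, 3))
```

The two efficiency predictors come out nearly equal, while vessel 12 places noticeably more probability on inefficiency values close to zero, which is the kind of distributional difference that the rank probabilities pick up and the point predictors do not.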

Conclusions
We extend the current literature on ranked efficiency scores by defining and proposing the use of conditional efficiency rank probabilities and conditional expected efficiency ranks as a means to provide improved insight into efficiency score rankings. Although our model was fairly restrictive, our results can be more broadly applied than this might indicate. Indeed, there is a broad class of parametric models that yield conditional efficiency distributions that are truncated normal and to which our results directly apply. Additionally, even if the resulting conditional distributions are not truncated normals, our results (and the results of Horrace 2005) can be adapted to these cases.
We demonstrated nuances of the proposed measures with a Monte Carlo Study. The conditional expected ranks responded in predictable ways to the inherent noisiness of the statistical exercise and to the skew of the underlying efficiency distribution. While it is generally ignored in empirical applications of the stochastic frontier model, skew is a very important moment to consider in drawing conclusions on ranked efficiency predictors. Our empirical example based on fishing vessels underscores the importance of taking into account multiplicity and noise in any ranking exercise, and the empirical relevance of the conditional rank probabilities and the conditional expected ranks is made clear.
One potential area of future research is that the OLS residuals are necessarily correlated, so while the conditional inefficiency distributions based on the true regression errors are independent, these distributions based on the residuals are technically not so. It would be interesting to see if analytic solutions were forthcoming and the correlation of the residuals could be estimated or approximated. It may also be worthwhile to consider resampling techniques to estimate confidence intervals for the conditional expected ranks, so that the usual assumption $\hat{\beta} = \beta$ can be relaxed.
It may be fruitful to explore higher moments of the conditional rank distribution for each firm. We have discussed the conditional expectation of the distribution, but it may be worthwhile to consider the conditional variance of the rank of each firm. Calculating the variance, and any higher moments, would be a straight-forward exercise based on the conditional rank probabilities that we have presented. One might speculate that firms with high conditional probabilities of being best and worst would have higher conditional variance of their rank distribution than those with high probability of being in the center of the efficiency rank statistic. The best and worst firms will have more weight in one tail of their conditional rank distributions than firms with higher probability at the median efficiency ranks. However, this remains to be seen.
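As a sketch of how such higher moments would follow from the conditional rank probabilities, the Python fragment below computes the mean and variance of a firm's rank from one row of probabilities. The function name and the probability rows are hypothetical; they are made-up inputs that show the mechanics only and do not bear on the conjecture above.

```python
def rank_moments(row):
    """Mean and variance of a firm's rank; row[r-1] = Pr(firm has rank r)."""
    mean = sum(r * p for r, p in enumerate(row, start=1))
    var = sum((r - mean) ** 2 * p for r, p in enumerate(row, start=1))
    return mean, var

# Hypothetical rank distributions for n = 5 firms (illustrative values only).
rows = {
    "uniform": [0.2, 0.2, 0.2, 0.2, 0.2],
    "spread over top ranks": [0.4, 0.3, 0.2, 0.1, 0.0],
    "certain rank 3": [0.0, 0.0, 1.0, 0.0, 0.0],
}
for label, row in rows.items():
    mean, var = rank_moments(row)
    print(label, round(mean, 3), round(var, 3))
```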