Document Type

Article

Date

1997

Keywords

algorithm, quantile estimates, disk-resident data

Language

English

Disciplines

Computer Sciences

Description/Abstract

The '-quantile of an ordered sequence of data values is the element with rank ' \Theta n, where n is the total number of values. Accurate estimates of quantiles are required for the solution of many practical applications. In this paper, we present a new algorithm for estimating the quantile values for disk-resident data. Our algorithm has the following characteristics: (1) It requires only one pass over the data; (2) It is deterministic; (3) It produces good lower and upper bounds of the true values of the quantiles; (4) It requires no a priori knowledge of the distribution of the data set; (5) It has a scalable parallel formulation; (6) Extra time and memory for computing additional quantiles (beyond the first one) is constant per quantile. We present experimental results on the IBM SP-2. The experimental results show that the algorithm is indeed robust and does not depend on the distribution of the data sets.

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.