HPHFS: A file system prototype for a cluster-based server environment using network-attached storage devices

Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)


Electrical Engineering and Computer Science


Alok Choudhary


Data access semantics, Performance bottlenecks, Cluster-based server, Storage devices

Subject Categories

Computer Sciences


Demand for high-performance I/O is higher than ever, and applications cannot afford unnecessary overhead. Over the last few years, a performance bottleneck at the file server has exposed the limits of traditional file system implementations in delivering aggregate I/O bandwidth on demand. This bottleneck must be avoided if clients are to receive scalable, high I/O bandwidth. As the main I/O service provider, a file system should also offer a demand-specific storage service so that it can serve a wide variety of applications. Furthermore, network striping of data files is essential to increase aggregate per-application I/O bandwidth and to support parallel processing. In this work we adopt and combine two off-the-shelf technologies, NASD and Fibre Channel, to provide a scalable I/O subsystem architecture for a cluster-based server environment. On top of this new I/O architecture, we propose a prototype of a High-Performance Hybrid File System (HPHFS) that provides new I/O access semantics.

HPHFS provides data access semantics based on a third-party transfer mechanism: data moves directly between storage devices and clients, so the data path bypasses the performance bottleneck at the traditional file server. HPHFS also employs a two-level disk space allocation mechanism that provides optimal access semantics for both small and large data files. File sizes span a wide range, from a few tens of bytes to multiple gigabytes, and an implementation that is purely network-distributed or purely network-striped favors one group of files while sacrificing I/O performance for the other. Current user demand for I/O performance cannot afford such a trade-off in I/O bandwidth; a file system should be adaptive in order to provide a consistent Quality of Service (QoS) for users. HPHFS is adaptive in the sense that it can change its policy, rather than adhering to a single fixed policy, in order to provide a storage mechanism that better suits the given condition. HPHFS is unbiased in the sense that its storage mechanism provides optimal QoS for both groups of data files. Finally, HPHFS employs a hybrid of network-striped and non-striped data distribution, giving users flexibility of choice. Network striping in HPHFS not only increases per-application aggregate I/O bandwidth but also supports parallel computing within a cluster of general-purpose workstations. When network striping is not beneficial, HPHFS can store a data file on a single disk instead of striping it across multiple disks. Among other advantages, users are free to specify whether or not their data files are to be distributed. In a simulation-based performance evaluation, the HPHFS prototype outperformed both network-striped and network-distributed file system prototypes for all workload types simulated.
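The hybrid placement idea described above can be illustrated with a minimal Python sketch. This is not the HPHFS implementation: the abstract gives no thresholds, stripe sizes, or interfaces, so `STRIPE_THRESHOLD`, `STRIPE_UNIT`, `choose_placement`, and `locate` are all hypothetical names and values chosen for illustration. The sketch assumes a simple round-robin stripe layout and shows only how a file system might keep small files on one device, stripe large files across all devices, and let an explicit user preference override the size-based default.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical values for illustration; the abstract specifies no numbers.
STRIPE_THRESHOLD = 1 << 20  # files smaller than 1 MiB stay on one device
STRIPE_UNIT = 64 * 1024     # bytes per stripe unit


@dataclass
class Placement:
    striped: bool
    devices: List[int]  # indices of the devices holding the file's data


def choose_placement(size: int, num_devices: int,
                     user_wants_striping: Optional[bool] = None) -> Placement:
    """Pick striped or single-device placement for a file.

    An explicit user preference wins; otherwise the file size decides.
    """
    if num_devices < 1:
        raise ValueError("need at least one storage device")
    if user_wants_striping is None:
        stripe = size >= STRIPE_THRESHOLD
    else:
        stripe = user_wants_striping
    if stripe and num_devices > 1:
        return Placement(True, list(range(num_devices)))
    # Non-striped: device 0 is used here only to keep the sketch deterministic.
    return Placement(False, [0])


def locate(offset: int, placement: Placement) -> Tuple[int, int]:
    """Map a byte offset in the file to (device, offset on that device)."""
    if not placement.striped:
        return placement.devices[0], offset
    n = len(placement.devices)
    unit = offset // STRIPE_UNIT              # which stripe unit holds the byte
    device = placement.devices[unit % n]      # round-robin across devices
    local = (unit // n) * STRIPE_UNIT + offset % STRIPE_UNIT
    return device, local
```

Under these assumptions, a client that knows a file's `Placement` can compute which device holds any byte and request it from that device directly, which is the kind of access pattern a third-party transfer mechanism is meant to enable.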

