mpiJava: An Object-Oriented Java interface to MPI

A basic prerequisite for parallel programming is a good communication API. The recent interest in using Java for scientific and engineering applications has led to several international efforts to produce a message passing interface to support parallel computation. In this paper we describe and then discuss the syntax, functionality and performance of one such interface, mpiJava, an object-oriented Java interface to MPI. We first discuss the design of the mpiJava API and the issues associated with its development. We then briefly outline the steps necessary to 'port' mpiJava onto a range of operating systems, including Windows NT, Linux and Solaris. In the second part of the paper we present and then discuss some performance measurements made of communications bandwidth and latency to compare mpiJava on these systems. Finally, we summarise our experiences and then briefly mention work that we plan to undertake.


Introduction
It is generally recognised that the vast majority of scientific and engineering applications are written in either C, C++ or Fortran. The recent popularity of Java has led to it being seriously considered as a good language to develop scientific and engineering applications, and in particular for parallel computing [1, 2, 3, 4]. Sun's claims, on behalf of Java, that it is simple, efficient and platform-neutral (a natural language for network programming) make it attractive to scientific programmers who wish to harness the collective computational power of parallel platforms as well as networks of workstations or PCs, with interconnections ranging from LANs to the Internet. The attractiveness of Java for scientific computing is being encouraged by bodies like Java Grande [5]. The Java Grande forum has been set up to co-ordinate the community's efforts to standardise primitive Java types.
This technique allows MPIJ to achieve communications speeds comparable with native MPI implementations. jmpi [13] is an MPI environment built upon JPVM [14], a Java-based implementation of PVM. jmpi offers a full Java API to MPI 1.1 as well as features such as thread safety and multiple communication end-points per task. jmpi is a pure Java MPI environment, but is complicated by the need to call JPVM methods. Another recently announced Java MPI interface, called JavaWMPI [15], is built upon the Windows MPI environment WMPI [20]. JavaWMPI has a very similar structure to mpiJava, but the syntax of the interface is less object-oriented and a procedural method is used to perform polymorphism between Java datatypes. MPI Software Technology, Inc. has also announced their intention to deliver a commercial Java interface to MPI called JMPI [16]. Java implementations of the related PVM message-passing environment have been reported by Yalamanchilli et al. [17] and The MPI Forum [6]. Many of the above-mentioned groups, as part of the Java Grande forum activities, have recently published a position paper [18] in an attempt to standardise on a single API.

Overview of this article
First we outline the mpiJava API and describe various special issues that arise in Java. Implications of object serialization are also explored briefly, as are the difficulties due to the lack of true multidimensional arrays in Java.
This discussion is followed by a description of an implementation of the proposed Java binding through a set of wrappers that use the JNI to call existing MPI implementations. The virtues and problems of this implementation strategy are discussed, and results of tests and benchmarks on Solaris, Windows NT and Linux are presented.

Introduction to the mpiJava API
The MPI standard is explicitly object-based. The C and Fortran bindings rely on 'opaque objects' that can be manipulated only by acquiring object handles from constructor functions, and passing the handles to suitable functions in the library. The C++ binding specified in the MPI 2 standard collects these objects into suitable class hierarchies and defines most of the library functions as class member functions. The mpiJava API follows this model, lifting the structure of its class hierarchy directly from the C++ binding. The major classes of mpiJava are illustrated in Figure 1.
The class MPI only has static members. It acts as a module containing global services, such as initialization of MPI, and many global constants including the default communicator COMM_WORLD.
The most important class in the package is the communicator class Comm. All communication functions in mpiJava are members of Comm or its subclasses. As usual in MPI, a communicator stands for a 'collective object' logically shared by a group of processors. The processes communicate, typically by addressing messages to their peers through the common communicator.
Fig. 1. Principal classes of mpiJava
Another class that is important for the discussion below is the Datatype class. This describes the type of the elements in the message buffers passed to send, receive, and all other communication functions. Various basic datatypes are predefined in the package. These mainly correspond to the primitive types of Java, shown in Figure 2.
In both cases the actual argument corresponding to buf must be a Java array. In the current implementation they must be arrays with elements of primitive type. By implication they must be one-dimensional arrays, because Java 'multidimensional arrays' are really arrays of arrays. In these and all other mpiJava calls, the buffer array argument is followed by an offset that specifies the element in the array where the message actually starts.
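The buffer-plus-offset convention can be illustrated without an MPI library. The following is only a sketch and not mpiJava code: BufferSlice and its slice method are hypothetical names, and the method simply returns the elements that an (offset, count) pair would select from a one-dimensional primitive array.

```java
import java.util.Arrays;

// Hypothetical helper, not part of mpiJava: shows which elements of a
// one-dimensional primitive array an (offset, count) pair selects.
class BufferSlice {
    // Returns the count elements of buf beginning at index offset.
    static double[] slice(double[] buf, int offset, int count) {
        if (offset < 0 || count < 0 || offset + count > buf.length) {
            throw new IndexOutOfBoundsException("slice exceeds buffer");
        }
        return Arrays.copyOfRange(buf, offset, offset + count);
    }

    public static void main(String[] args) {
        double[] buf = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0};
        // The message starts at element 2 and holds three elements.
        System.out.println(Arrays.toString(slice(buf, 2, 3))); // prints [2.0, 3.0, 4.0]
    }
}
```

Because the offset is an element index rather than an address, the same array can serve as the buffer for several messages that start at different positions.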

Special features of the Java binding
The mpiJava API is modelled as closely as practical on the C++ binding defined in the MPI 2.0 standard (currently we only support the MPI 1.1 subset). A number of changes to argument lists are forced by the restriction that arguments cannot be passed by reference in Java. In general, outputs of mpiJava methods come through the result value of the function. In many cases MPI functions return more than one value. This is dealt with in mpiJava in various ways. Sometimes an MPI function initializes some elements in an array and also returns a count of the number of elements modified. In Java we typically return an array result, omitting the count. The count can be obtained subsequently from the length member of the array. Sometimes an MPI function initializes an object conditionally and returns a separate flag to say if the operation succeeded. In Java we return an object handle which is null if the operation fails. Occasionally an extra field is added to an existing MPI class to hold extra results; for example the Status class has an extra field, index, initialized by functions like Waitany. Rarely none of these methods work and we resort to defining auxiliary classes to hold multiple results from a particular function. In another change from C++, we often omit array size arguments, because they can be picked up within the wrapper by reading the length member of the array argument.
As a result of these changes mpiJava argument lists are often more concise than the corresponding C or C++ argument lists.
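Two of these conventions are easy to show in plain Java. The sketch below is illustrative only: ReturnConventions, completedIndices and probe are hypothetical names that do not belong to mpiJava. The first method returns an array whose length replaces a separate count, and the second returns null where a C binding would return a flag.

```java
import java.util.Arrays;

// Hypothetical illustration of the return conventions described above;
// none of these names belong to mpiJava.
class ReturnConventions {
    // A C binding would fill an output array and return a count. Here the
    // result array is returned and its length carries the count.
    static int[] completedIndices(boolean[] done) {
        int[] tmp = new int[done.length];
        int n = 0;
        for (int i = 0; i < done.length; i++) {
            if (done[i]) {
                tmp[n++] = i;
            }
        }
        return Arrays.copyOf(tmp, n);
    }

    // A C binding would set an output object and return a flag. Here a
    // null result signals that the operation did not succeed.
    static String probe(String[] queue, int tag) {
        return (tag >= 0 && tag < queue.length) ? queue[tag] : null;
    }

    public static void main(String[] args) {
        int[] idx = completedIndices(new boolean[] {true, false, true});
        System.out.println(idx.length + " completed: " + Arrays.toString(idx));
        System.out.println(probe(new String[] {"msg"}, 3) == null);
    }
}
```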
Normally in mpiJava, MPI destructors are called by the Java finalize method for the class. This is invoked automatically by the Java garbage collector. For most classes, therefore, no binding of the MPI_class_FREE function appears in the Java API. Exceptions are Comm and Request, which do have explicit Free members. In those cases the MPI operation could have observable side-effects beyond simply freeing resources, so their execution is left under direct control of the programmer.

Derived datatype vs Object serialization
In MPI new derived types of class Datatype can be created using suitable library functions. The derived types allow one to treat contiguous, strided, or indirectly indexed segments of program arrays as individual message elements. The corresponding array subsections can then be communicated in a single function call, potentially exploiting any special hardware or software the platform provides for exchanging scattered data between user space and the communication system.
Currently mpiJava provides all the derived datatype constructors of standard MPI, with one limitation: it places significant restrictions on its binding of MPI_TYPE_STRUCT. In C or Fortran this function can be used to describe an entity combining fields of different primitive or derived type. Because of the assumption that buffers are one-dimensional arrays with elements of primitive type, mpiJava imposes a restriction that all the types combined by its Datatype.Struct member must have the same base type, which must agree with the element type of the buffer array. Also mpiJava does not provide an analogue of the MPI_BOTTOM buffer address, or the MPI_ADDRESS function for finding offsets relative to this absolute base. In C or Fortran these functions allow buffers to include fields from separately declared variables or arrays, but the mechanism does not sit well with the pointer-free Java language model. Approaches based on the MPI derived datatype model do not seem to be the best way to alleviate this restriction. A better option is probably to exploit the run-time type information already provided in Java objects. We are developing a version of mpiJava that adds one new predefined datatype: MPI.Object. A message buffer can then be an array of any serializable Java objects. The objects are serialized automatically in the wrapper of send operations, and unserialized at their destination.
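The serialization step can be sketched with the standard java.io classes alone. This is an assumption about how such a wrapper might work, not the mpiJava implementation: SerializedBuffer, Particle, pack and unpack are hypothetical names, and the byte array stands in for the data that would travel between processes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: flattens an array of serializable objects to bytes
// and rebuilds it, as a send/receive wrapper pair might do.
class SerializedBuffer {
    // Example element type; any Serializable class would do.
    static class Particle implements Serializable {
        private static final long serialVersionUID = 1L;
        final double x;
        final int id;
        Particle(double x, int id) { this.x = x; this.id = id; }
    }

    // Sender side: write count objects starting at offset into a byte array.
    static byte[] pack(Object[] buf, int offset, int count) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(count);
            for (int i = offset; i < offset + count; i++) {
                out.writeObject(buf[i]);
            }
        }
        return bytes.toByteArray();
    }

    // Receiver side: read the count, then rebuild each object.
    static Object[] unpack(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Object[] result = new Object[in.readInt()];
            for (int i = 0; i < result.length; i++) {
                result[i] = in.readObject();
            }
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        Object[] buf = { new Particle(0.5, 1), new Particle(1.5, 2) };
        Object[] copy = unpack(pack(buf, 0, buf.length));
        Particle p = (Particle) copy[1];
        System.out.println(p.id + " " + p.x);
    }
}
```

Unlike the MPI_TYPE_STRUCT route, the element class may mix fields of different types, since the layout is recovered from run-time type information rather than from a datatype description.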
The absence of true multi-dimensional arrays in Java limits another use of derived data types. In MPI the MPI_TYPE_VECTOR function creates a derived datatype representing a strided section of an array. In C or Fortran this strided section can be identified with a section of a multi-dimensional array. It could describe, say, an edge of the local patch of a two-dimensional distributed array. In Java there is no equivalence between a multi-dimensional array and a contiguous patch of memory, or a one-dimensional array. The programmer may choose to linearize all multi-dimensional arrays in the algorithm, representing them as one-dimensional arrays with suitable index expressions. In this case derived datatypes can be used to send and receive sections of the array. Alternatively the programmer may use Java arrays of arrays to represent multi-dimensional arrays. This simplifies the index arithmetic in the program. Sections of the array are then explicitly copied to one-dimensional buffers for communication. The latter option seems to be more popular with programmers.
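The two representations can be compared in a few lines of plain Java. The sketch below uses hypothetical names (ArraySections, columnFromLinear, columnFromNested) and assumes row-major linearization. In the linearized case the loop only makes visible which indices a strided (vector) datatype would select; with a derived datatype the copy would not be written by hand.

```java
import java.util.Arrays;

// Hypothetical sketch comparing a linearized two-dimensional array with a
// Java array of arrays when one column has to be communicated.
class ArraySections {
    // Linearized storage: element (i, j) of a rows x cols array lives at
    // index i * cols + j. A column is then a strided section with
    // count = rows, block length = 1 and stride = cols.
    static double[] columnFromLinear(double[] a, int rows, int cols, int j) {
        double[] col = new double[rows];
        for (int i = 0; i < rows; i++) {
            col[i] = a[i * cols + j];
        }
        return col;
    }

    // Array-of-arrays storage: the column is copied element by element
    // into a one-dimensional buffer before it can be sent.
    static double[] columnFromNested(double[][] a, int j) {
        double[] col = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            col[i] = a[i][j];
        }
        return col;
    }

    public static void main(String[] args) {
        double[] linear = {1, 2, 3, 4, 5, 6};            // 2 x 3, row-major
        double[][] nested = {{1, 2, 3}, {4, 5, 6}};
        System.out.println(Arrays.toString(columnFromLinear(linear, 2, 3, 2)));
        System.out.println(Arrays.toString(columnFromNested(nested, 2)));
    }
}
```

Both calls select the same two elements, the last column of the 2 x 3 array; the difference lies in whether the index arithmetic sits in the program or in the datatype description.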
Although, for reasons of conformance with MPI standards, we expect to continue supporting derived datatypes in mpiJava, their value in the Java domain is less clear-cut than in C or Fortran. Allowing serializable objects as buffer elements is probably a more powerful facility.
3 mpiJava implementations on PC Platforms

Introduction
As Java is a platform-neutral language there is much interest in 'porting' mpiJava to PC-based systems, in particular Windows NT and Linux. To 'port' mpiJava onto a PC environment it is necessary to have a native MPI library, a version of the Java Development Toolkit (JDK) and a C compiler. mpiJava consists of two main parts: the MPI Java classes and the C stubs that bind the MPI Java classes to the underlying native MPI implementation. We create these C stubs using JNI, the means by which Java can call and pass parameters to and from a native API. Figure 4 provides a simple schematic view of the software layers involved.
To 'port' mpiJava onto a new platform, generally two steps need to be undertaken: (1) create a native library out of the compiled JNI C stubs; (2) compile the MPI Java classes into class libraries, ensuring that the correctly named stub library is loaded by the Java System.loadLibrary("stublib") call in the main source file MPI.java.
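The library-loading step can be sketched as follows. StubLoader and tryLoad are hypothetical names, and "stublib" is the placeholder library name used in the text; the only real API used is System.loadLibrary, which throws UnsatisfiedLinkError when the JVM cannot find the named library on java.library.path.

```java
// Hypothetical sketch of loading a JNI stub library by name.
class StubLoader {
    // Tries to load the named native library; returns false when the JVM
    // cannot find it on java.library.path.
    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("native stub library not found: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // "stublib" is the placeholder name from the text.
        System.out.println(tryLoad("stublib"));
    }
}
```

The name passed to loadLibrary is platform-independent; the JVM maps it to a platform-specific file name such as a .dll on Win32 or a lib*.so on Solaris and Linux, which is why the name has to match the library built in the first step.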

Fig. 4. Software Layers
The development and testing of mpiJava was undertaken on various Sun and SGI UNIX platforms using MPICH. As interfacing Java to MPI is not always trivial, in earlier implementations we often saw low-level conflicts between the Java runtime and the interrupt mechanisms used in the MPI implementations. The situation is improving as the JDK matures; in particular, version 1.2 allows the use of green or native threads, which has eliminated the interrupt problem that we encountered with earlier releases of the JDK. mpiJava is now stable on NT platforms using WMPI and JDK 1.1 or later, as well as on UNIX platforms using MPICH and JDK 1.2.

Windows NT
To test mpiJava under Windows NT we had a number of MPI implementations to choose from [19]. We chose WMPI from the Instituto Superior de Engenharia de Coimbra, Portugal. WMPI is a full implementation of MPI for Microsoft Win32 platforms. WMPI is based on MPICH and includes a p4 [21] device as standard. P4 provides the communication internals and a startup mechanism. The WMPI package is a set of libraries for Borland C++, Microsoft Visual C++ and Microsoft Visual FORTRAN. The release of WMPI provides libraries, header files, examples and daemons for remote start-up. WMPI can co-exist and interact with MPICH/ch_p4 in a cluster of mixed UNIX and Win32 platforms. WMPI is still under development and is freely available.
mpiJava under WMPI: To create a release of mpiJava for WMPI the following steps were undertaken:
Step 1 - Compile the mpiJava JNI C interface into a Win32 Dynamic Link Library (mpiJava.dll).
Step 2 - Modify the name of the library loaded by the mpiJava interface (MPI.java) to that of the newly compiled library.
Step 3 - Compile the Java MPI interface into class libraries.
Step 4 - Create a JNI interface to WMPI. This was necessary as under WMPI a master process is first spawned. Its purpose is to read in a job configuration file and use the information within it to set up and run the actual MPI processes. An idiosyncrasy of WMPI is that all MPI processes must have a file name with the extension .EXE. This led to the need to produce a JNI interface to WMPI so that the JVM was loaded and the 'main' method of the mpiJava Java class started.

Linux
At the time of writing this paper, our attempts to 'port' mpiJava to Linux are in progress. We are currently experiencing problems similar to those encountered during our early attempts to create the interface on Solaris, mentioned in section 3.1. Sun releases the JDK for Solaris and NT platforms first. On other platforms, such as Linux, it is necessary for developers to 'port' the JDK. The most recent release of the JDK for Linux is 1.1.7 and this version is known to be the cause of our problems. It is anticipated that JDK 1.2 will be available for Linux shortly and that we will be able to report our experiences with Linux at the IPPS/SPDP 99 workshop in April 1999.

Functionality Tests
An integral part of the development of this project was to produce or translate a number of basic MPI test codes to mpiJava. An obvious starting point was the C test suite originally developed by IBM. This suite had been modified to comply fully with the MPI standard and to be compatible with MPICH. The suite consists of fifty-seven C programs that test the following MPI calls and data types: collective operations, communicators, data types, environmental inquiries, groups, point to point and virtual topologies. These codes were all translated to mpiJava.
Under WMPI and Solaris-MPICH these codes were run either as multiple processes on a single machine (Shared Memory mode - SM) or as multiple processes running on separate machines (Distributed Memory mode - DM). Under WMPI and Solaris-MPICH all the codes ran in both modes without alterations. Our experiences using Linux-MPICH will be reported when JDK 1.2 is available for Linux.

4 Simple Communications Performance Measurement

4.1 Introduction
At this early stage of our project we have decided to restrict performance measurements to those that will give some indication of the basic inter-processor communications performance. The actual computational performance of each process is felt to be dependent on the local JVM and associated technologies used by specific vendors to increase the performance of Java.

PingPong Communications Performance Tests
In this program messages of increasing size are sent back and forth between processes; this is commonly called PingPong. This benchmark is based on standard blocking MPI_Send/MPI_Recv. PingPong provides information about the latency of MPI_Send/MPI_Recv and uni-directional bandwidth. To ensure that anomalies in message timings are minimised the PingPong is repeated many times for each message size. The codes used for these tests were those developed by Baker and Grassl [23]. The three existing codes (MPI-C, MPI-Fortran and Winsock-C) were used for comparison and we implemented an mpiJava version for our purposes.
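The arithmetic behind a PingPong measurement can be sketched in plain Java. This is not the Baker and Grassl code: PingPongTiming and its methods are hypothetical names, the two array copies stand in for the Send/Recv pair on each process, System.nanoTime is used here as a fine-grained timer, and a MByte is assumed to be 10^6 bytes.

```java
// Hypothetical sketch of PingPong timing arithmetic; array copies stand in
// for the message exchange so that the example runs without MPI.
class PingPongTiming {
    // One-way time in microseconds from the total time for reps round trips.
    static double oneWayMicros(long elapsedNanos, int reps) {
        return elapsedNanos / (2.0 * reps) / 1000.0;
    }

    // Bandwidth in MBytes/s, assuming 1 MByte = 10^6 bytes.
    static double bandwidthMBps(int messageBytes, double oneWayMicros) {
        return messageBytes / oneWayMicros;   // bytes per microsecond equals MBytes/s
    }

    public static void main(String[] args) {
        int reps = 1000;
        for (int size = 1; size <= 1 << 20; size *= 4) {
            byte[] ping = new byte[size];
            byte[] pong = new byte[size];
            long start = System.nanoTime();
            for (int r = 0; r < reps; r++) {
                // Stand-in for Send then Recv on one process and
                // Recv then Send on its peer.
                System.arraycopy(ping, 0, pong, 0, size);
                System.arraycopy(pong, 0, ping, 0, size);
            }
            double t = oneWayMicros(System.nanoTime() - start, reps);
            System.out.printf("%8d bytes  %10.3f us  %10.1f MBytes/s%n",
                              size, t, bandwidthMBps(size, t));
        }
    }
}
```

Repeating the exchange and dividing by twice the repetition count is what smooths out timing anomalies; the one-way time for the smallest message is the latency figure, and the large-message end of the sweep gives the bandwidth.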
The main problem encountered running the PingPong code was that under WMPI on Win32 MPI_Wtime had been implemented with a millisecond resolution. It was necessary to adapt each of the codes to use an alternative timer with microsecond (µs) resolution. The performance tests shown in the next section were run on two similar systems.
The mpiJava curve mirrors that of C with an almost constant offset up to 8K; thereafter the curves converge, meeting at 256K. Under MPICH, the curves for C and mpiJava mirror each other in a similar fashion to those under WMPI; again there is a constant offset and convergence at around 256K.
Under WMPI the peak bandwidth of C is around 65 MBytes/s and that of mpiJava is 54 MBytes/s. The peaks occur at around 64K. Under MPICH the bandwidth is flattening out, but still increasing for C and mpiJava, at 1M. The actual rate measured at this point is about 50 MBytes/s.
Clearly the WMPI C code performs best of those tested. The performance of mpiJava in SM under WMPI is good: it exhibits a fairly constant overhead of 95 µs up to 2K, and thereafter it converges with the C curve. The performance of the C code under MPICH is slightly surprising as the NT and Solaris platforms used for these tests had similar specifications. It is assumed that the performance reflects the usage of MPICH rather than a native version of MPI for Solaris. Even so, the MPICH results for mpiJava show that it exhibits reasonable performance. In DM the differences between the MPI codes are not as pronounced as seen in SM. Under WMPI the C and mpiJava codes display very similar performance characteristics throughout the range tested. Under MPICH, there is a distinct performance difference between C and mpiJava; however, the difference is much smaller than in SM and the curves converge at 4K. All curves peak at about 1 MByte/s, which is about 90% of the maximum attainable on a 10 Mbps Ethernet link.

Overall Results Discussion
In both SM and DM modes mpiJava adds a fairly constant overhead compared to normal native MPI. In an environment like WMPI, which has been optimised for NT, the actual overheads of using mpiJava are relatively small at around 100 µs. Under MPICH the situation is not quite so good; here the use of mpiJava introduces an extra overhead of between 250 and 300 µs.
It should be noted that these results compare codes running directly under the operating system with those running in the JVM. For example, a single 200 MHz PentiumPro is reported to achieve in excess of 62 Mflop/s on a Fortran version of LinPack. A test of the Java LinPack code gave a peak performance of 22 Mflop/s for the same processor running the JVM. The difference in performance will account for much of the additional overhead that mpiJava imposes on C MPI codes. From this it can be deduced that the quality and performance of the JVM on each platform will have the greatest effect on the usefulness of mpiJava for scientific computation.

Overall Summary
We have discussed the design and development of mpiJava, a pure Java interface to MPI. We have also highlighted the benefits of a fully object-oriented Java API compared to those currently available. Our performance tests have shown that mpiJava should fulfil the needs of MPI programmers not only in terms of functionality but also in terms of good performance when compared to similar C MPI programs. Unfortunately, at the time of submission of this paper we have been unable to test mpiJava under Linux, but we believe that the problems we have encountered will be overcome soon and that we will be able to present our findings during the Workshop in April 1999. Overall, however, we feel that we have implemented a well designed, functional and efficient Java interface to MPI.

Particular Conclusions
mpiJava provides a fully functional and efficient Java interface to MPI. Our performance tests have shown that, in terms of communications speeds, WMPI on NT outperforms MPICH on Solaris. When used for distributed computing the current implementation of mpiJava does not impose a huge overhead on top of the native MPI interface. We have discovered some of the limitations in the usage of JNI, in particular with MPICH, where we had problems with UNIX signals. We are hopeful that these problems will disappear when we start using JDK 1.2 and native threads. Our performance tests indicate that much of the additional latency that mpiJava imposes is due to the relatively poor performance of the JVM rather than the impact of messages traversing additional software layers. The syntax of mpiJava is easy to understand and use, thus making it relatively simple for programmers with either a Java or scientific background to take up. We believe that mpiJava will also provide a popular means for teaching students the fundamentals of parallel programming with MPI.

Fig. 6. PingPong Results in Distributed Memory (DM) mode
Fig. 3. Minimal mpiJava program run in two processes

Two dual processor P6 200 MHz NT 4 workstations, each with 128 MBytes of DRAM.
Two dual processor UltraSparc 200 MHz Solaris workstations with 256 MBytes of DRAM.
Both systems were connected via 10BaseT Ethernet and the tests were carried out when there was little network activity and on quiet machines.

Table 1. Time for 1 Byte Messages

In Table 1 we show the transmission time in microseconds (µs) to send a 1 byte message in each of the environments tested. In SM the mpiJava wrapper adds an extra 94 µs (140%) and 226 µs (152%) compared to WMPI and MPICH C respectively. In DM the mpiJava wrapper adds an extra 66 µs (11%) and 282 µs (42%) compared to WMPI and MPICH C respectively. The Wsock figures are those for a WinSock implementation of the PingPong benchmark using TCP. Linux results will be presented during the conference workshop.