Analyzing Images Containing Multiple Sparse Patterns with Neural Networks

We have addressed the problem of analyzing images containing multiple sparse overlapped patterns. This problem arises naturally when analyzing the composition of organic macromolecules using data gathered from their NMR spectra. Using a neural network approach, we have obtained excellent results in using NMR data to analyze the presence of amino acids in protein molecules. We have achieved high correct classification percentages (about 87%) for images containing as many as five substantially distorted overlapping patterns.


Introduction
The ability to analyze complex images containing multiple patterns is a difficult but important task in several real-world applications. A robotic vision system, for example, may be required to identify juxtaposed objects on a conveyor belt. In handwritten character recognition, an image may contain several, possibly overlapped, characters, each of which must be recognized separately. The conventional approach in such problems is to segment the input image and then perform feature-extraction followed by classification [3]. This method is appropriate when the expected patterns consist of contiguous sets of distinct features. Special techniques are required for efficient analysis when images are large but sparse.
In this paper, we are concerned with the problem of analyzing images containing large multiple overlapped sparse patterns. Such patterns consist of a small number of features dispersed widely in the image. The features are usually small in size: perhaps just a single pixel. This classification problem is encountered when analyzing images obtained by certain types of NMR (Nuclear Magnetic Resonance) spectroscopy.
In the application described in this paper, a pattern that belongs to a single class will typically consist of about ten non-zero pixels in a 256 x 256 image. For each class of patterns, the features occur only in certain characteristic regions of the image. An image presented for classification will generally contain a number of patterns that belong to several different classes. Our task is to determine the classes to which we may attribute patterns contained in an input image.
Relatively little work has been reported in the literature on efficient neural network techniques for the analysis of images containing multiple patterns. One possible approach is to use Strong and Whitehead's physiological model [16], which describes how humans are able to pay selective attention to each pattern contained in a complex image in a sequential order. Fukushima's selective-attention neural network for classifying overlapped patterns [6] (based on the Neocognitron [5] [4]) provides some of the few successful results in this area. If a composite pattern consisting of several hand-written characters is presented, the network attends to those characters one at a time and recognizes individual characters.
The main problem in applying Fukushima's approach to large images is the sheer size of the required network. As many as 41000 cells are needed for classifying patterns in a 19 x 19 image. Since we must process considerably larger (256 x 256) images, the computational requirements using Fukushima's model are too high.
Our approach is to perform efficient analysis by exploiting the sparseness of images. We have developed a modular analyzer for the problem of analyzing images containing multiple sparse patterns. Each module detects the presence of patterns that belong to one class in the input image. Each module has two stages. The first stage is a feature detector based on clustering [8]. For each class of patterns, cluster analysis is used to identify those regions of the input image where the features of patterns belonging to that class are most likely to be found. The second stage of each module is a standard backpropagation-trained feed-forward neural network [15] which performs the tasks of thresholding and classification.
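The modular organization just described can be sketched schematically. The names `Module` and `analyze_image` below are illustrative, not taken from the authors' implementation; each module simply composes a stage-1 filter with a stage-2 classifier.

```python
# Illustrative sketch of the modular analyzer: one independent module per
# pattern class, each made of a clustering filter (stage 1) and a trained
# classifier network (stage 2). All names here are hypothetical.

class Module:
    def __init__(self, detect, classify):
        self.detect = detect      # stage 1: clustering filter -> feature vector
        self.classify = classify  # stage 2: neural network -> yes/no decision

    def analyze(self, peaks):
        # A module sees the whole input image (a list of peaks) and answers
        # whether its own class of patterns is present.
        return self.classify(self.detect(peaks))

def analyze_image(modules, peaks):
    """Run every class module on the same input image."""
    return {name: m.analyze(peaks) for name, m in modules.items()}
```

Because the modules are independent, new classes can be added by training new modules without retraining the existing ones.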
We have used our analyzer to analyze images obtained from NMR spectroscopy of proteins. Proteins are large complex molecules made up of building blocks called amino acids, of which there are 18 commonly occurring types. The presence of a constituent amino acid in a protein can be detected by observing a characteristic pattern in the NMR spectrum of the protein. We have been able to correctly analyze artificially generated NMR spectra of small proteins containing up to five different types of amino acids.
In the next section, we discuss the problem of analyzing multiple sparse patterns, describe some details of the NMR analysis problem, and discuss previous work on this topic. In section 3, we describe details of the analyzer. Details of the experiments and experimental results are presented in section 4. Section 5 contains concluding remarks.

The problem
Our problem is to analyze images containing multiple sparse patterns. Three sparse patterns, all of which belong to the same class, are shown in figure 1. Each pattern belonging to this class has three 'features'. The location of each feature (a feature is just a single non-zero pixel) is shown by a '.' sign. As suggested by figure 1, the locations of the features are not fixed. However, they do remain within characteristic feature-regions. As shown, the feature-regions for two classes are known to the analyzer. Using this knowledge, the analyzer can determine whether a pattern of either class is present in the input image.
If the feature-regions for different classes do not overlap, pattern recognition would be a simple task, since it would suffice to recognize positively any one feature of each class. In the applications that we are interested in, feature-regions for different classes do overlap. Consequently, a feature may lie within feature-regions of several classes.
Although such a feature does not permit us to decide unambiguously whether a particular class of patterns is present in the image, it does however partially constrain the classification. Overall, the problem of classifying images containing sparse patterns may be characterized as having multiple simultaneous constraints. Rumelhart and McClelland [15] note that such problems are ideal candidates for neural-network solutions.

Problem definition
In this section, we formally define the problem of analyzing images with multiple patterns. We first define our representation of an input image and then specify the information needed to characterize each class of patterns. We then describe the process by which this information is used to determine whether a particular class of patterns is present in an image. Representing an image: An input image is represented by a set of peaks, P = {P_1, P_2, ..., P_N}, where each P_i = (P_ix, P_iy, P_ih) gives the coordinates and height of the i-th peak, for 1 ≤ i ≤ N.
An image, P, may contain several patterns (disjoint subsets of P), where by 'pattern' we mean a collection of peaks associated with a certain class. We do not know in advance how many patterns are contained in an input image. Therefore, the number of peaks, N, in each image varies.
Characterizing a class of patterns: In some cases, we find that patterns which belong to one class may occur in several different configurations. We therefore define a set of pattern-templates, T_C, for each class C, where each pattern-template, t_(C,j), characterizes one configuration. Each pattern-template is a set of feature-templates, F_(C,j,k), each specifying the expected location of one feature; in our system this information is determined implicitly when a neural network is trained.
Matching a feature-template: This is the first step in pattern recognition. We must determine whether a peak in the input image matches a feature-template. Given a peak P_i which may correspond to the feature-template F_(C,j,k) with expected location r_(C,j,k), we define a measure of the closeness between the two by a matching function, d(P_i, F_(C,j,k)) = g(|r_(C,j,k) - (P_ix, P_iy)|), where g, the 'error function', is chosen to decrease with the distance |r_(C,j,k) - (P_ix, P_iy)|.
We say that a peak P_i 'matches' a feature-template F_(C,j,k) if the value of the matching function exceeds a threshold T. Matching a pattern-template: We say that an input image P 'matches' a pattern-template t_(C,j) if, for each feature-template F_(C,j,k) in t_(C,j), we can identify a unique peak P_i ∈ P such that P_i matches F_(C,j,k). Classification: If an input image P matches a pattern-template t_(C,j), a pattern of class C is defined to be present in the input image. The overall analysis task is to determine all the classes whose patterns are present in the input image; hence the above procedure is repeated for every class.
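A minimal sketch of this matching and classification procedure follows. The Gaussian form of the error function g, the fixed threshold, and the greedy one-to-one peak assignment are all assumptions filling in details the text leaves open.

```python
import math

def matching_value(peak, template_location, sigma=1.0):
    """Matching function for a peak against a feature-template.
    The Gaussian fall-off is an assumption; the text only requires
    the error function g to decrease with distance."""
    dist = math.hypot(peak[0] - template_location[0],
                      peak[1] - template_location[1])
    return math.exp(-dist**2 / (2 * sigma**2))

def matches_pattern_template(peaks, template_locations, threshold=0.5):
    """An image matches a pattern-template when every feature-template is
    matched by a distinct peak (greedy assignment, for simplicity)."""
    used = set()
    for loc in template_locations:
        best_v, best_i = 0.0, None
        for i, p in enumerate(peaks):
            if i in used:
                continue
            v = matching_value(p, loc)
            if v > best_v:
                best_v, best_i = v, i
        if best_i is None or best_v < threshold:
            return False
        used.add(best_i)
    return True

def class_present(peaks, pattern_templates, threshold=0.5):
    """A class is present if the image matches any of its pattern-templates."""
    return any(matches_pattern_template(peaks, t, threshold)
               for t in pattern_templates)
```

Note that a higher threshold corresponds to a smaller acceptance region around each expected feature location.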

Classification of NMR spectra
NMR (Nuclear Magnetic Resonance) spectroscopy is a powerful method for the determination of the three-dimensional structure of complex organic macromolecules such as proteins [18]. Proteins are long chains of smaller molecules called amino acids. Approximately 18 different types of amino acids are commonly found in proteins. The first step in analyzing the structure of a protein is to determine its constituent amino acids. One type of NMR spectroscopy used for this purpose is called correlation spectroscopy (usually referred to as COSY).
A COSY spectrum is a two-dimensional integer array which is symmetric about the SW-NE diagonal. The spectra of the amino acids are sparse patterns of the kind described in section 2.1. The number of peaks in the spectrum of an amino acid is generally quite small (usually between 2 and 10). There is a slight variability in the positions of the peaks of an amino acid, which arises from interactions with the neighboring amino acids in the protein.
The spectrum of a protein is the result of the combination of the spectra of its constituent amino acids. The task of determining the constituent amino acids of a protein is therefore analogous to the task of analyzing an image containing multiple sparse patterns. The training set for our analyzer consists of a number of sample spectra for each type of amino acid. These spectra were generated from information about the distributions of peaks for each type of amino acid tabulated in [7].
Although we have only analyzed images obtained from one type of spectroscopy (COSY), there is actually a wide variety of NMR techniques available to the chemist.
Most of the automatic peak-assignment programs described in the chemical literature use several different types of spectra as inputs. We expect the performance of our system to improve with the use of multiple sources of inputs.
Kleywegt et al. [13] have described a system called CLAIRE which takes as input the 'HOHAHA' (Homonuclear Hartmann-Hahn Spectroscopy) spectrum and the 'NOESY' (Nuclear Overhauser Effect Spectroscopy) spectrum of the protein being investigated. When used on a protein with 46 amino acids, it was able to identify the peaks for 34 of them correctly.
Another system devised by Eads and Kuntz [2] requires for input the COSY, NOESY and HOHAHA spectra. For a protein with 51 amino acids, their program was able to identify the peaks for 21 of them. The remaining peaks in the spectra were assigned manually. Although their program took only 15 minutes to run, the manual assignment required two weeks of effort.
The use of neural networks for this problem does not seem to have been investigated previously, except in a rudimentary manner for classifying one-dimensional NMR spectra [17].

System description
In this section, we describe a modular analyzer for the analysis of images containing multiple sparse patterns. An important aspect of our system is the use of a clustering algorithm to train feature detectors for each class. In machine vision systems, clustering is often used for image-segmentation [14]. Note that we distinguish between the use of the term 'feature detection' in the context of statistical pattern recognition and in our system. We use feature detection to identify spatial features in the patterns; in statistical pattern recognition, feature detection refers to the process of choosing the best set of variables to characterize a set of items [1].
We first describe how clustering is relevant to our problem and then present an overview of the analyzer. We then describe the internal details of the modules.

Clustering
Clustering is used to find the expected locations of features. We illustrate this with an example. Let us suppose that the training set for some class say, C, consists of the three images in figure 1. Each image contains one pattern, which belongs to class C. These images have been superimposed in figure 3. Clearly, the features occur in three clusters. The center of each cluster is the expected location of a feature.
The procedure may be summarized thus: for each class C, create a set Rc containing the locations of all features in an image created by superimposing all the training set images for class C. By applying a clustering algorithm to Rc, we determine the expected location of each feature.
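The procedure above can be sketched in code. This is a simplified, deterministic K-means over feature locations, assuming each training image is given as a list of (x, y) feature coordinates; the function names are hypothetical.

```python
def kmeans(points, k, iters=20):
    """Minimal K-means over 2-D feature locations. For reproducibility,
    centers are initialized from the first k points (a simplification;
    standard K-means typically uses random initialization)."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        # Assign each point to its nearest center.
        groups = [[] for _ in range(k)]
        for x, y in points:
            j = min(range(k),
                    key=lambda c: (x - centers[c][0])**2 + (y - centers[c][1])**2)
            groups[j].append((x, y))
        # Move each center to the mean of its assigned points.
        for j, g in enumerate(groups):
            if g:  # keep a center unchanged if nothing was assigned to it
                centers[j] = (sum(p[0] for p in g) / len(g),
                              sum(p[1] for p in g) / len(g))
    return centers

def expected_feature_locations(training_images, k):
    """Superimpose all training images of one class and cluster the union
    of their feature locations (the set R_C in the text)."""
    r_c = [loc for image in training_images for loc in image]
    return kmeans(r_c, k)
```

The returned centers are the expected feature locations for the class.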
We have investigated two clustering algorithms: the K-means clustering algorithm [8] and the LVQ (Learning Vector Quantizer) [9]. We have found that the LVQ performs better for our problem. A benchmarking study by Kohonen et al. [11] also found that the LVQ produced better results than K-means clustering on several classification tasks.
The LVQ is a nearest-neighbor classifier which is trained by a competitive learning procedure. An LVQ is an array of units, each of which possesses a reference vector m of the same dimensionality as the pattern space. In a trained LVQ, the reference vectors are the positions of the cluster centers. When a pattern is presented to the LVQ, only the unit whose reference vector is closest to the input pattern responds. Thus, the LVQ divides the pattern space into a number of regions, with one reference vector in each region. These regions are a Voronoi tessellation of the input space.
An additional property of the LVQ, which we do not use in our system, is that the training procedure creates an ordered map of the pattern space. In other words, if two units are close together in the array, their decision regions will be close to each other in the pattern space. Although LVQs with two-dimensional arrays of units have been used for applications such as speech processing [12], we have obtained good results with a one-dimensional LVQ for our problem.
The learning procedure is now summarized briefly. Initially, the reference vectors are randomly distributed in the pattern space. Each training step requires the presentation of a pattern chosen randomly from the training set. A unit whose reference vector is the closest to the presented pattern is called the winner. The reference vectors of the winning unit and its neighbors in the array are adjusted so as to move them closer to the presented pattern. The amount by which the reference vectors are adjusted is gradually decreased during the learning procedure.
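A minimal sketch of this competitive learning procedure follows, with two simplifying assumptions noted in the comments: reference vectors are initialized from the data rather than randomly, and only the winner is updated (the neighborhood update is omitted).

```python
def train_lvq(points, n_units, epochs=100, lr0=0.5):
    """Competitive-learning sketch of the procedure described above.
    Simplifications: reference vectors start at the first n_units data
    points (the text describes random initialization), and neighbors of
    the winner are not adjusted."""
    refs = [list(p) for p in points[:n_units]]
    steps = epochs * len(points)
    t = 0
    for _ in range(epochs):
        for x, y in points:
            # The adjustment step size gradually decreases during learning.
            lr = lr0 * (1.0 - t / steps)
            t += 1
            # The winner is the unit whose reference vector is closest.
            w = min(range(n_units),
                    key=lambda i: (refs[i][0] - x)**2 + (refs[i][1] - y)**2)
            # Move the winner's reference vector toward the presented point.
            refs[w][0] += lr * (x - refs[w][0])
            refs[w][1] += lr * (y - refs[w][1])
    return refs
```

After training, each reference vector sits near the center of one cluster of feature locations.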
For each class C, we use the LVQ learning procedure on the set R_C to obtain a set of reference vectors. Before we describe how the reference vectors are used in our analyzer, we make two observations regarding the choice of the number of units used in the LVQ for each class.

The clustering filter
The role of each clustering filter (shown in figure 5) is to extract relevant information from the input image for one class of patterns. A clustering filter consists of a number of feature detectors. A feature detector is a processing unit which is activated by the presence of a feature (peak) in an associated region of the input image, called its receptive field. The output of a feature detector is a real value which depends on the proximity of peaks to the center of its receptive field. As we noted previously, there are cases where two peaks sometimes occur very close together. We do not need to make any special provision for this situation: we use only one set of feature detectors to cover the combined feature-region. The output from these feature detectors will be higher, but this is easily handled by the neural network.
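The behavior of a clustering filter might be sketched as follows, assuming a Gaussian fall-off of detector response with distance and a summation over peaks (the exact response function is an unspecified detail; the names below are hypothetical).

```python
import math

def detector_output(peaks, center, width=1.0):
    """Output of one feature detector: a response that falls off with the
    distance of each peak from the receptive-field center, summed over
    peaks. Both choices are assumptions about unspecified details."""
    out = 0.0
    for x, y in peaks:
        d2 = (x - center[0])**2 + (y - center[1])**2
        out += math.exp(-d2 / (2 * width**2))
    return out

def clustering_filter(peaks, centers, width=1.0):
    """The filter's feature vector: one detector output per cluster
    center (reference vector) found for this class."""
    return [detector_output(peaks, c, width) for c in centers]
```

With this form, two peaks falling in one combined feature-region simply raise that detector's output, as described above.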

The neural network
Although a clustering filter is trained to respond most strongly to patterns of a particular class, it is possible (due to overlap of feature-regions) that some of the detectors of one class may be activated when patterns of another class are presented.
We use a neural network to determine implicitly the appropriate thresholds for each pattern detector. A neural network is trained after the clustering filters of the first stage have been trained and set up.
For each class C, the neural network (of the corresponding module) must be taught to discriminate between feature vectors obtained from images containing a pattern of class C and feature vectors produced from images which do not contain patterns of class C.
We use backpropagation [15] to train the network. Backpropagation is a supervised learning algorithm in which a set of patterns to be learnt are repeatedly presented to the network together with their target output patterns. At the outset of a training session, the weights and biases are randomly initialized. For each pattern presented, the error backpropagation rule defines a correction to the weights and thresholds to minimize the sum of the squared differences between the target and the actual outputs. The learning process is repeated until the average difference between the target and the actual output falls below an operator-specified level.
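A minimal backpropagation trainer of this kind, with momentum and epoch-wise weight updates (the parameter values used in our experiments are given in the next section), might look like the sketch below. The hidden-layer size and all identifiers are illustrative assumptions, not the authors' implementation.

```python
import math, random

def train_mlp(data, n_hidden=4, eta=0.1, alpha=0.05,
              target_mse=0.01, max_epochs=5000, seed=0):
    """Train a 1-hidden-layer, 1-output sigmoid network by gradient
    descent with momentum. Weights are updated once per epoch, and
    training stops when the mean squared error falls below target_mse."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Random initialization of weights and biases (last column is the bias).
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw2 = [0.0] * (n_hidden + 1)
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    for _ in range(max_epochs):
        g1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g2 = [0.0] * (n_hidden + 1)
        mse = 0.0
        for x, t in data:
            xb = x + [1.0]                                   # input + bias
            h = [sig(sum(w * xi for w, xi in zip(row, xb))) for row in w1]
            hb = h + [1.0]                                   # hidden + bias
            y = sig(sum(w * hi for w, hi in zip(w2, hb)))
            e = t - y
            mse += e * e
            dy = e * y * (1 - y)                             # output delta
            for j in range(n_hidden + 1):
                g2[j] += dy * hb[j]
            for j in range(n_hidden):
                dh = dy * w2[j] * h[j] * (1 - h[j])          # hidden delta
                for i in range(n_in + 1):
                    g1[j][i] += dh * xb[i]
        if mse / len(data) < target_mse:
            break
        # Epoch-wise update with momentum: dw(t) = eta*grad + alpha*dw(t-1)
        for j in range(n_hidden + 1):
            dw2[j] = eta * g2[j] + alpha * dw2[j]
            w2[j] += dw2[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                dw1[j][i] = eta * g1[j][i] + alpha * dw1[j][i]
                w1[j][i] += dw1[j][i]
    def predict(x):
        h = [sig(sum(w * xi for w, xi in zip(row, x + [1.0]))) for row in w1]
        return sig(sum(w * hi for w, hi in zip(w2, h + [1.0])))
    return predict
```

In our setting, the training pairs would be (feature vector, 0.1 or 0.9), one network per module.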

Results
In this section, we describe our experiments in training and testing the sparse image recognition system, and report the results obtained.

System parameters
To substantiate our approach, seven modules were trained for the NMR protein analysis problem. Each module can detect patterns corresponding to one amino acid. The final output from each module is a yes/no answer about whether the respective class (amino acid) is judged to be present. From among the 18 possible amino acids, we trained modules for seven amino acids whose spectra appeared to be the most complex, with more peaks than the others. But in training as well as testing the modules, we used data which included peaks from the other 11 amino acids as well, and obtained good results in analyzing the presence of the seven amino acids for which modules were trained. This shows that an incremental approach is possible: upon building modules for the other 11 amino acids, we expect that our results will continue to hold.

    Glutamic acid (e)    14    10
    Phenylalanine (f)    26    10
    Isoleucine (i)       19    20
    Valine (v)           15    20

The training set consists of a total of 90 single-class images, with 5 for each of the 18 amino acids. The equation indicating how weights are changed using the error back-propagation procedure [15] is:

    Δw_ji(t) = η δ_j o_i + α Δw_ji(t-1)    (1)

In each module, we used a value of η = 0.1 for the learning rate parameter, and a value of α = 0.05 for the momentum coefficient. The target mean squared error (to terminate the network training procedure) was set to be 0.01. During training, the target output for the networks was set to be 0.1 when the required answer was 'no' and 0.9 when the required answer was 'yes'. Weights in the network were updated only at the end of each 'epoch' (one sequence of presentations of all training inputs).

Experimental results
The goal of the experiments was to measure the correctness of overall classification when composite images containing several patterns were presented to the system. It is desirable to perform correct classification even in the presence of small errors or corrupted data. Hence, we tested our system with composite images produced by superimposing distorted versions of the training set images. With the small receptive field system (T = 0.5), the combined effect of distortion and multiple patterns causes classification accuracy to deteriorate substantially. On the other hand, the classification capabilities of the large receptive field system (T = 0.1) are less affected and degrade more gracefully with noise. This phenomenon contrasts with the observation that the small receptive field system performs marginally better on uncorrupted test data.

Concluding Remarks
In this paper, we have addressed the problem of analyzing images containing multiple sparse overlapped patterns. This problem arises naturally when analyzing the composition of organic macromolecules using data gathered from their NMR spectra.
Using a neural network approach, we have obtained excellent results in using NMR data to analyze the presence of amino acids in protein molecules. We have achieved high correct classification percentages (about 87%) for images containing as many as five substantially distorted overlapping patterns.
The architecture of our system is modular: each module analyzes the input image and delivers a yes/no output regarding the presence of one class of patterns in the image. Each module contains two stages: a clustering filter, and a feedforward neural network. An unconventional aspect of our approach is the use of clustering to detect spatial features of patterns. This may be compared to the use of clustering for image-segmentation in several machine vision systems.
We performed a number of experiments to measure the correctness of overall classification when the system was presented with composite images containing several patterns of different classes. We tried two versions of the system, one with small receptive field detectors and the other with large receptive field detectors. In both cases, we observed that the rate of correct classification decreased as the number of patterns in the image was increased. To determine the ability of the system to cope with variations in the patterns, images with random perturbations to the patterns were presented to the system in another series of experiments. In this case, we observed that the classification abilities of the large receptive field system are less affected and degrade more gracefully.
The classification process described in this paper is only the first step in the analysis of NMR spectra. It is of considerable interest to chemists to determine the precise association of the peaks in the input image with different patterns. We are currently working on an extension to the system described in this paper to perform this task. We plan to refine the clustering algorithm to enable the use of feature-detectors with variable size receptive fields. We expect to improve performance by combining the evidence from multiple input sources, as is done in other NMR analysis methods.