Date of Award
Master of Science (MS)
Electrical Engineering and Computer Science
CNN, Computer Vision, Deep Learning, Image Processing, Image Segmentation, Medical Image
Image segmentation has been an important area of study in computer vision. It is a challenging task, since it involves pixel-wise annotation, i.e., labeling each pixel according to the class to which it belongs. In image classification, the goal is to predict the class to which an entire image belongs. Thus, there is more focus on the abstract features extracted by Convolutional Neural Networks (CNNs), with less emphasis on spatial information. In image segmentation, on the other hand, abstract information and spatial information are needed at the same time. One class of work in image segmentation focuses on "recovering" high-resolution features from low-resolution ones. This type of network has an encoder-decoder structure, and spatial information is recovered by feeding the decoder part of the model with earlier high-resolution features through skip connections. Overall, these strategies involving skip connections try to propagate features to deeper layers. The second class of work, on the other hand, focuses on "maintaining" high-resolution features throughout the process.
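The "recovering" strategy described above can be illustrated with a minimal NumPy sketch. It is not taken from the thesis: the function names (`downsample`, `upsample`, `decode_with_skip`), the 2x2 average pooling, and the nearest-neighbour upsampling are all assumptions chosen for brevity, and no learned convolutions are included. The sketch only shows how a skip connection passes high-resolution encoder features to the decoder by channel-wise concatenation.

```python
import numpy as np

def downsample(x):
    """2x2 average pooling on an (H, W, C) feature map; H and W must be even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def decode_with_skip(low_res, high_res_skip):
    """Upsample the decoder input and concatenate the encoder's
    high-resolution features along the channel axis (the skip connection)."""
    return np.concatenate([upsample(low_res), high_res_skip], axis=-1)

# Hypothetical feature map: 8x8 spatial resolution, 4 channels.
high = np.random.rand(8, 8, 4)
low = downsample(high)                # encoder path: spatial detail is lost
merged = decode_with_skip(low, high)  # decoder path: detail is reinjected
```

In this sketch `merged` has the spatial size of `high` and twice its channel count, so a later layer sees both the coarse, abstract features and the original high-resolution ones.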
In this thesis, we first review the related work on image segmentation and then introduce two new models, namely Unet-Laplacian and Dense Parallel Network (DensePN). The Unet-Laplacian is a series CNN model incorporating a Laplacian filter branch. This new branch applies a Laplacian filter to the input RGB image and feeds the output to the decoder. Experimental results show that the output of the Unet-Laplacian captures more of the ground truth mask and eliminates some of the false positives. We then describe the proposed DensePN, which was designed to find a good balance between extracting features through multiple layers and keeping spatial information. DensePN allows not only keeping high-resolution feature maps but also reusing features at deeper layers to solve the image segmentation problem. We have designed the Dense Parallel Network based on three main observations from our initial trials and preliminary studies. First, maintaining a high-resolution feature map provides good performance. Second, feature reuse is very efficient and allows for deeper networks. Third, a parallel structure can provide better information flow. Experimental results on the CamVid dataset show that the proposed DensePN (with 1.1M parameters) performs better than FCDense56 (with 1.5M parameters) while using fewer parameters.
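As a rough illustration of the Laplacian filter operation mentioned for the Unet-Laplacian branch, the following NumPy sketch applies a discrete Laplacian to each channel of an RGB image. The abstract does not state which kernel, padding, or normalization the thesis uses, so the 4-neighbour kernel, the edge padding, and the function name `laplacian_filter` are assumptions made here for illustration only.

```python
import numpy as np

# Standard 4-neighbour discrete Laplacian kernel. This is an assumption:
# the abstract does not specify the kernel used in the thesis.
LAPLACIAN_KERNEL = np.array([[0, 1, 0],
                             [1, -4, 1],
                             [0, 1, 0]], dtype=np.float32)

def laplacian_filter(image):
    """Apply the kernel to every channel of an (H, W, C) image.

    Borders are handled by replicating edge pixels, so the output has the
    same shape as the input.
    """
    image = np.asarray(image, dtype=np.float32)
    padded = np.pad(image, ((1, 1), (1, 1), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    # Accumulate the nine shifted, weighted copies of the image.
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN_KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return out

# Hypothetical input: a 6x6 RGB image with a vertical step edge.
step = np.zeros((6, 6, 3), dtype=np.float32)
step[:, 3:, :] = 1.0
response = laplacian_filter(step)
```

The response is zero in flat regions and nonzero only next to the step, which suggests why an edge-sensitive signal of this kind could help a decoder localize mask boundaries.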
Wang, Jiyang, "SEMANTIC IMAGE SEGMENTATION VIA A DENSE PARALLEL NETWORK" (2019). Theses - ALL. 324.