Hierarchical Image Segmentation based on Sequential Partitioning and Merging
1. Background

In computer vision, image segmentation is the process of decomposing an image into multiple segments. The goal is to partition the image into regions that are more meaningful and easier to analyze in subsequent steps. For example, in the image shown in Figure 1, a human viewer can easily recognize two people walking on a beach who have apparently just finished snorkeling. To a computer, however, this image is nothing but an array of pixel data, with each pixel (picture element) containing three color values (red, green, and blue). If we can decompose the image into several smaller regions, each containing similar colors or textures, it becomes easier for the computer to recognize the possible objects in the image (such as humans, diving boots, and the beach) and to understand the image content. Over the past few decades, hundreds or even thousands of segmentation algorithms have been proposed, each trying to produce segmentation results close to what human eyes perceive. Among these approaches, graph-based methods and clustering-based methods have proven quite successful and have been widely used. However, for the beach image in Figure 1, graph-based