Abstract
Vision systems employing region segmentation by color are crucial in real-time mobile robot applications. With careful attention to algorithm efficiency, fast color image segmentation can be accomplished using commodity image capture and CPU hardware. This paper describes a system capable of tracking several hundred regions of up to 32 colors at 30 Hz on general-purpose commodity hardware. The software system consists of a novel implementation of a threshold classifier, a merging system to form regions through connected components, a separation and sorting system that gathers various region features, and a top-down merging heuristic to approximate perceptual grouping. A key to the efficiency of our approach is a new method for color space thresholding that enables a pixel to be classified as a member of one or more of up to 32 colors using only two logical AND operations. We describe the algorithms and representations, along with three applications in which the system has been used.
1 Introduction
An important first step in many color vision tasks is to classify each pixel in an image into one of a discrete number of color classes. The leading approaches to this task include linear color thresholding, nearest neighbor classification, color space thresholding and probabilistic methods. Linear color thresholding works by partitioning the color space with linear boundaries (e.g., planes in a 3-dimensional space). A particular pixel is then classified according to the partition in which it lies. This method is convenient for learning systems such as neural networks (NNs) or multivariate decision trees (MDTs) [2].
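As an illustration of linear color thresholding, the following minimal sketch assigns a pixel to a partition according to the side of each boundary plane on which it falls. The function names, the plane coefficients in `EXAMPLE_PLANES`, and the choice to encode the partition as a bit pattern are assumptions made for this example; they are not taken from any particular system.

```python
def on_positive_side(pixel, normal, offset):
    """Return True if the pixel lies on or above the plane normal . pixel + offset = 0."""
    return sum(n * p for n, p in zip(normal, pixel)) + offset >= 0


def classify_linear(pixel, planes):
    """Return a partition index for a 3-component pixel.

    Bit i of the result is set when the pixel lies on the positive side
    of plane i, so each distinct index names one cell of the partition.
    """
    index = 0
    for i, (normal, offset) in enumerate(planes):
        if on_positive_side(pixel, normal, offset):
            index |= 1 << i
    return index


# Hypothetical boundaries in RGB space (illustrative values only):
#   plane 0: R - G = 0       (positive side: at least as much red as green)
#   plane 1: B - 128 = 0     (positive side: blue component of 128 or more)
EXAMPLE_PLANES = [
    ((1, -1, 0), 0),
    ((0, 0, 1), -128),
]

reddish = classify_linear((200, 50, 30), EXAMPLE_PLANES)   # partition 1
bluish = classify_linear((10, 200, 250), EXAMPLE_PLANES)   # partition 2
```

With N planes this costs N dot products per pixel, which is one reason cheaper per-pixel tests are attractive for real-time use.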
4 Conclusion
We have presented a new system for real-time segmentation of color images. It can classify each pixel in a full-resolution captured color image, find and merge regions of up to 32 colors, and report their centroid, bounding box and area at 30 Hz. The primary contribution of this system is that it is a software-only approach implemented on inexpensive general-purpose hardware (in our case 350 MHz or 700 MHz x86-compatible systems with $200 image digitizers). Among systems that process the full frame, this provides a significant advantage over more expensive hardware-only solutions and over slower software approaches. The system operates on the image in several steps:
1. Optionally project the color space.
2. Classify each pixel as one of up to 32 colors.
3. Run length encode each scanline according to color.
4. Group runs of the same color into regions.
5. Pass over the structure gathering region statistics.
6. Sort regions by color and size.
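Steps 2 and 3 of the list above can be sketched as follows. This is a minimal illustration, not the system's actual code: it assumes each color is described by one inclusive threshold range per channel, that one lookup table per channel stores a bit per color, and that a pixel's membership is obtained by combining the three table entries with two AND operations. The function names, the YUV channel order and the threshold values are all assumptions made for the example.

```python
def build_class_tables(thresholds):
    """Build one 256-entry lookup table per channel.

    thresholds: list of ((y_lo, y_hi), (u_lo, u_hi), (v_lo, v_hi)),
    one entry per color, at most 32 colors. Bit k of a table entry is
    set when that channel value falls inside color k's inclusive range.
    """
    if len(thresholds) > 32:
        raise ValueError("at most 32 colors fit in a 32-bit mask")
    tables = ([0] * 256, [0] * 256, [0] * 256)
    for bit, ranges in enumerate(thresholds):
        for table, (lo, hi) in zip(tables, ranges):
            for value in range(lo, hi + 1):
                table[value] |= 1 << bit
    return tables


def classify_pixel(pixel, tables):
    """Return a bit mask of every color the pixel belongs to (two ANDs)."""
    y, u, v = pixel
    y_class, u_class, v_class = tables
    return y_class[y] & u_class[u] & v_class[v]


def run_length_encode(labels):
    """Encode one scanline of labels as (label, start, length) runs."""
    runs = []
    start = 0
    for x in range(1, len(labels) + 1):
        # Close the current run at the end of the line or on a label change.
        if x == len(labels) or labels[x] != labels[start]:
            runs.append((labels[start], start, x - start))
            start = x
    return runs


# Hypothetical thresholds for two colors (bit 0 and bit 1).
EXAMPLE_THRESHOLDS = [
    ((50, 200), (0, 100), (150, 255)),   # color 0
    ((30, 220), (150, 255), (0, 120)),   # color 1
]

tables = build_class_tables(EXAMPLE_THRESHOLDS)
scanline = [(100, 50, 200), (100, 60, 210), (100, 200, 50), (10, 10, 10)]
labels = [classify_pixel(p, tables) for p in scanline]
runs = run_length_encode(labels)
```

Here `labels` is `[1, 1, 2, 0]` and `runs` is `[(1, 0, 2), (2, 2, 1), (0, 3, 1)]`. Because later stages operate on runs rather than individual pixels, the cost of region grouping depends on the number of runs rather than on the image resolution alone.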