Indexed on: 14 Sep '16
Published on: 14 Sep '16
Published in: IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society
Camera-based text processing has attracted considerable attention, and numerous methods have been proposed. However, most of these methods focus on the scene text detection problem, and relatively little work has addressed camera-captured document images. In this paper, we present a text-line detection algorithm for camera-captured document images, which is an essential step towards document understanding. In particular, our method incorporates state estimation (an extension of scale selection) into a connected component (CC)-based framework. To be precise, we extract CCs with the maximally stable extremal region (MSER) algorithm and estimate the scales and orientations of CCs from their projection profiles. Since this state estimation facilitates a merging process (bottom-up clustering) and provides a stopping criterion, our method is able to handle arbitrarily oriented text-lines and works robustly for a range of scales. Finally, a text-line/non-text-line classifier is trained, and non-text candidates (e.g., background clutter) are filtered out with the classifier. Experimental results show that the proposed method outperforms conventional methods on a standard dataset and works well on a new, challenging dataset.
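
To make the idea of reading an orientation off projection profiles concrete, the following is a minimal sketch of a generic projection-profile sweep. It is not the state estimation procedure of the paper, whose details the abstract does not give. It assumes that each CC is reduced to a single 2-D point (for example its centroid), it estimates orientation only (not scale), and it scores each candidate angle by the sum of squared bin counts of the profile. The function names `projection_profile`, `estimate_orientation`, and `make_lines`, and the parameters `bin_size` and `spacing`, are hypothetical and chosen for illustration.

```python
import math


def projection_profile(points, angle_deg, bin_size=1.0):
    """Histogram of the points projected onto the normal of direction `angle_deg`.

    If the points lie on parallel lines running at `angle_deg`, each line
    collapses into a single bin and the profile becomes sharply peaked.
    """
    theta = math.radians(angle_deg)
    # Unit normal of a line running in direction theta.
    nx, ny = -math.sin(theta), math.cos(theta)
    profile = {}
    for x, y in points:
        # Assign the projected coordinate to its nearest bin.
        b = round((x * nx + y * ny) / bin_size)
        profile[b] = profile.get(b, 0) + 1
    return profile


def estimate_orientation(points, candidates=range(-90, 90), bin_size=1.0):
    """Return the candidate angle (in degrees) with the most peaked profile.

    Peakedness is measured as the sum of squared bin counts (an assumed
    scoring rule, not taken from the paper).
    """
    def energy(angle):
        counts = projection_profile(points, angle, bin_size).values()
        return sum(c * c for c in counts)

    return max(candidates, key=energy)


def make_lines(angle_deg, n_lines=3, n_points=60, spacing=10.0):
    """Synthetic stand-in for CC centroids: parallel rows of points."""
    theta = math.radians(angle_deg)
    dx, dy = math.cos(theta), math.sin(theta)   # direction along a line
    nx, ny = -dy, dx                            # direction between lines
    return [
        (t * dx + k * spacing * nx, t * dy + k * spacing * ny)
        for k in range(n_lines)
        for t in range(n_points)
    ]


if __name__ == "__main__":
    pts = make_lines(30)
    print("estimated orientation:", estimate_orientation(pts))
```

An exhaustive sweep over integer angles keeps the sketch short. A practical system would more likely refine the angle around a coarse estimate and operate on the pixels of each CC rather than on one point per component.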