Parameter-independent geometric document layout analysis

Dae Seok Ryu, Sun M. Kang, Seong Whan Lee

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

8 Citations (Scopus)

Abstract

We propose a new parameter-independent method for segmenting document images into maximal homogeneous regions and identifying them as text, images, tables, and lines. A pyramidal quadtree structure is constructed for multiscale, top-down analysis, and a periodicity measure is proposed to capture the periodic attribute of text regions. To obtain robust page segmentation results, a confirmation procedure using texture analysis is applied only to ambiguous regions. Experimental results on the document database from the University of Washington show that the proposed method works better than previous ones.
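
The abstract does not define the periodicity measure, so the following is only a minimal sketch of one common way to quantify the regular line spacing of a text block: the normalized autocorrelation of a horizontal projection profile. The function names (`row_profile`, `autocorrelation`, `periodicity`), the binary-image representation, and the choice of autocorrelation itself are assumptions made for illustration, not the authors' formulation.

```python
from typing import List, Sequence, Tuple


def row_profile(region: Sequence[Sequence[int]]) -> List[int]:
    """Count foreground pixels (value 1) in each row of a binary region."""
    return [sum(row) for row in region]


def autocorrelation(signal: Sequence[float]) -> List[float]:
    """Normalized autocorrelation of the mean-centred signal, lags 0..n-1."""
    n = len(signal)
    if n == 0:
        return []
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    energy = sum(c * c for c in centred)
    if energy == 0:
        # A flat profile (blank or solid region) has no periodic structure.
        return [0.0] * n
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def periodicity(region: Sequence[Sequence[int]]) -> Tuple[int, float]:
    """Return (lag, score) of the strongest autocorrelation value.

    Only lags 1..n//2 are searched (hypothetical choice), and ties go to
    the smallest lag. A score near 1 suggests evenly spaced text lines;
    (0, 0.0) means no periodicity was found.
    """
    ac = autocorrelation(row_profile(region))
    max_lag = len(ac) // 2
    if max_lag < 1:
        return (0, 0.0)
    best = max(range(1, max_lag + 1), key=lambda lag: ac[lag])
    if ac[best] <= 0:
        return (0, 0.0)
    return (best, ac[best])


if __name__ == "__main__":
    # Synthetic "text" block: two inked rows, two blank rows, repeated.
    text_like = [[1] * 8, [1] * 8, [0] * 8, [0] * 8] * 4
    print(periodicity(text_like))
    # Solid block, e.g. a dark image area: no periodic structure.
    print(periodicity([[1] * 8] * 16))
```

For the synthetic block above, the strongest peak falls at a lag of 4 rows, matching the line pitch, while the solid block yields no peak. In a full pipeline such a score would presumably be evaluated per quadtree node, with low or borderline scores passed on to a texture-based check, but that wiring is not specified in the abstract.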

Original language: English
Title of host publication: Proceedings - International Conference on Pattern Recognition
Pages: 397-400
Number of pages: 4
Volume: 15
Edition: 4
Publication status: Published - 2000

Fingerprint

Textures

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Ryu, D. S., Kang, S. M., & Lee, S. W. (2000). Parameter-independent geometric document layout analysis. In Proceedings - International Conference on Pattern Recognition (4 ed., Vol. 15, pp. 397-400).

@inbook{f05709a6c67c43c8ae02d2942cbf2926,
title = "Parameter-independent geometric document layout analysis",
author = "Ryu, {Dae Seok} and Kang, {Sun M.} and Lee, {Seong Whan}",
year = "2000",
language = "English",
volume = "15",
pages = "397--400",
booktitle = "Proceedings - International Conference on Pattern Recognition",
edition = "4",

}

TY - CHAP
T1 - Parameter-independent geometric document layout analysis
AU - Ryu, Dae Seok
AU - Kang, Sun M.
AU - Lee, Seong Whan
PY - 2000
Y1 - 2000
AB - We propose a new method independent of parameters for segmenting the document images into maximal homogeneous regions and identifying them as texts, images, tables and lines. A pyramidal quadtree structure is constructed for multiscale analysis and top-down approach, and a periodicity measure is suggested to find a periodical attribute of text regions. To obtain robust page segmentation results, a confirmation procedure using texture analysis is applied to only ambiguous regions. Experimental results with the document database from the University of Washington show that the proposed method works better than the previous ones.
UR - http://www.scopus.com/inward/record.url?scp=33750930091&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750930091&partnerID=8YFLogxK
M3 - Chapter
AN - SCOPUS:33750930091
VL - 15
SP - 397
EP - 400
BT - Proceedings - International Conference on Pattern Recognition
ER -