Automatic generation of structured hyperdocuments from multi-column document images

Ji Yeon Lee, Song Ha Choi, Seong Whan Lee

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table of contents (ToC) page based on the logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that HTML documents corresponding to the paper documents can be generated in a visual layout, and that their structured table of contents page, with the hierarchically ordered section titles hyperlinked to the contents, can also be produced by the proposed methods.
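
As a rough illustration of the ToC idea described in the abstract, the sketch below builds a nested HTML list in which each section title links to an anchor in the converted document. It is not the authors' method: the input format (a list of `(level, title)` pairs standing in for the output of logical structure analysis), the function name `build_toc`, and the `#sec-N` anchor scheme are all assumptions made for this example.

```python
from html import escape


def build_toc(headings):
    """Build a nested <ul> table of contents from (level, title) pairs.

    Hypothetical input: level 1 is a top-level section, level 2 a subsection,
    and so on. Each entry links to an assumed anchor of the form '#sec-N'.
    """
    parts = []
    depth = 0  # number of currently open <ul> elements
    for index, (level, title) in enumerate(headings, start=1):
        # Clamp so a heading never skips more than one level down.
        level = max(1, min(level, depth + 1))
        if level > depth:
            # Going one level deeper: open a nested list inside the open <li>.
            parts.append("<ul>")
            depth += 1
        else:
            # Same or shallower level: close the previous item first.
            parts.append("</li>")
            while depth > level:
                parts.append("</ul></li>")
                depth -= 1
        parts.append(f'<li><a href="#sec-{index}">{escape(title)}</a>')
    # Close whatever is still open at the end of the document.
    while depth > 0:
        parts.append("</li></ul>")
        depth -= 1
    return "\n".join(parts)


if __name__ == "__main__":
    sample = [(1, "Introduction"), (2, "Background"), (1, "Method")]
    print(build_toc(sample))
```

The converted HTML body would need matching `id="sec-N"` attributes on its section headings for the links to resolve; how those headings are detected in the image is outside the scope of this sketch.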

Original language: English
Title of host publication: Proceedings - International Conference on Pattern Recognition
Pages: 422-425
Number of pages: 4
Volume: 15
Edition: 4
Publication status: Published - 2000

Fingerprint

HTML
Experiments

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture

Cite this

Lee, J. Y., Choi, S. H., & Lee, S. W. (2000). Automatic generation of structured hyperdocuments from multi-column document images. In Proceedings - International Conference on Pattern Recognition (4 ed., Vol. 15, pp. 422-425)

@inbook{acd19d399d454736a2ec39664d1a7968,
title = "Automatic generation of structured hyperdocuments from multi-column document images",
abstract = "In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table of contents (ToC) page based on the logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that HTML documents corresponding to the paper documents can be generated in a visual layout, and that their structured table of contents page, with the hierarchically ordered section titles hyperlinked to the contents, can also be produced by the proposed methods.",
author = "Lee, {Ji Yeon} and Choi, {Song Ha} and Lee, {Seong Whan}",
year = "2000",
language = "English",
volume = "15",
pages = "422--425",
booktitle = "Proceedings - International Conference on Pattern Recognition",
edition = "4",

}

TY - CHAP
T1 - Automatic generation of structured hyperdocuments from multi-column document images
AU - Lee, Ji Yeon
AU - Choi, Song Ha
AU - Lee, Seong Whan
PY - 2000
Y1 - 2000
N2 - In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table of contents (ToC) page based on the logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that HTML documents corresponding to the paper documents can be generated in a visual layout, and that their structured table of contents page, with the hierarchically ordered section titles hyperlinked to the contents, can also be produced by the proposed methods.
AB - In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table of contents (ToC) page based on the logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that HTML documents corresponding to the paper documents can be generated in a visual layout, and that their structured table of contents page, with the hierarchically ordered section titles hyperlinked to the contents, can also be produced by the proposed methods.
UR - http://www.scopus.com/inward/record.url?scp=33750895860&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750895860&partnerID=8YFLogxK
M3 - Chapter
AN - SCOPUS:33750895860
VL - 15
SP - 422
EP - 425
BT - Proceedings - International Conference on Pattern Recognition
ER -