Fast text caption localization on video using visual rhythm

Seong Soo Chun, Hyeokman Kim, Jung Rim Kim, Sangwook Oh, Sanghoon Sull

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

5 Citations (Scopus)

Abstract

This paper proposes a fast DCT-based algorithm for locating text captions embedded in specific areas of a video sequence by means of a visual rhythm, which can be constructed quickly by sampling certain portions of a DC image sequence and accumulating the samples over time. The approach is based on the observation that text captions carrying important information suitable for indexing often appear in specific areas of video frames, from which sampling strategies for the visual rhythm are derived. The method then uses a combination of contrast and temporal coherence information in the visual rhythm to detect text frames such that each detected text frame represents a run of consecutive frames containing identical text strings, which significantly reduces the number of frames that must be examined for text localization in a video sequence. Finally, it uses several important properties of text captions to locate the captions in the detected frames.
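
The abstract gives no implementation details, so the following Python sketch only illustrates the general idea of building a visual rhythm and scanning it for stable, high-contrast runs. Everything concrete here is an assumption made for illustration and is not taken from the paper: the vertical-line sampling strategy, the names `vertical_line`, `visual_rhythm`, `coherent_runs` and `make_frame`, the thresholds `min_len`, `tol` and `min_contrast`, and the toy frames. Each DC image is modelled as a small grid of luminance values.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[Sequence[float]]  # one DC image: rows of luminance values


def vertical_line(col: int) -> Callable[[Frame], List[float]]:
    """Hypothetical sampling strategy: read one column of each DC image."""
    def sample(frame: Frame) -> List[float]:
        return [row[col] for row in frame]
    return sample


def visual_rhythm(frames: Sequence[Frame],
                  sample: Callable[[Frame], List[float]]) -> List[List[float]]:
    """Place the samples of successive frames side by side: rhythm[z][t]."""
    columns = [sample(f) for f in frames]        # one sampled column per frame
    return [list(row) for row in zip(*columns)]  # transpose to [position][time]


def coherent_runs(rhythm: Sequence[Sequence[float]], min_len: int = 3,
                  tol: float = 2.0,
                  min_contrast: float = 40.0) -> List[Tuple[int, int, int]]:
    """Find (z, t_start, t_end) runs where the edge between sampled
    positions z-1 and z is strong (contrast) and the value at z stays
    nearly constant from frame to frame (temporal coherence)."""
    runs = []
    for z in range(1, len(rhythm)):
        row, above = rhythm[z], rhythm[z - 1]
        start = None
        for t in range(len(row) + 1):  # one extra step closes a trailing run
            ok = t < len(row) and abs(row[t] - above[t]) >= min_contrast
            if ok and start is not None and abs(row[t] - row[t - 1]) > tol:
                # Still high contrast, but the value changed: split the run.
                if t - start >= min_len:
                    runs.append((z, start, t - 1))
                start = t
            elif ok and start is None:
                start = t
            elif not ok and start is not None:
                if t - start >= min_len:
                    runs.append((z, start, t - 1))
                start = None
    return runs


def make_frame(caption: bool) -> List[List[float]]:
    """Toy 4x3 DC image; a bright 'caption' pixel sits at row 2, column 1."""
    frame = [[10.0] * 3 for _ in range(4)]
    if caption:
        frame[2][1] = 200.0
    return frame


# Six toy frames; the caption is visible in frames 1 through 4.
frames = [make_frame(1 <= t <= 4) for t in range(6)]
rhythm = visual_rhythm(frames, vertical_line(col=1))
runs = coherent_runs(rhythm)
print(runs)
```

Each reported run marks an edge that persists unchanged across several frames, so a single representative frame per run could be passed on to a localization step instead of every frame. Because contrast is measured against the sample above, both the upper and the lower edge of a caption stroke produce a run.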

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 259-268
Number of pages: 10
Volume: 2314
ISBN (Print): 3540433589
Publication status: Published - 2002
Event: 5th International Conference on Visual Information Systems, VISUAL 2002 - Hsin Chu, Taiwan, Province of China
Duration: 2002 Mar 11 to 2002 Mar 13

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2314
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Visual Information Systems, VISUAL 2002
Country: Taiwan, Province of China
City: Hsin Chu
Period: 02/3/11 to 02/3/13

Fingerprint

Sampling
Text
Vision
Sampling Strategy
Image Sequence
Indexing
Consecutive
Strings

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Chun, S. S., Kim, H., Kim, J. R., Oh, S., & Sull, S. (2002). Fast text caption localization on video using visual rhythm. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2314, pp. 259-268). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2314). Springer Verlag.

@inproceedings{7271fec922144bc38ce222361e25bced,
title = "Fast text caption localization on video using visual rhythm",
author = "Chun, {Seong Soo} and Hyeokman Kim and Kim, {Jung Rim} and Sangwook Oh and Sanghoon Sull",
year = "2002",
language = "English",
isbn = "3540433589",
volume = "2314",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "259--268",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY  - GEN
T1  - Fast text caption localization on video using visual rhythm
AU  - Chun, Seong Soo
AU  - Kim, Hyeokman
AU  - Kim, Jung Rim
AU  - Oh, Sangwook
AU  - Sull, Sanghoon
PY  - 2002
Y1  - 2002
UR  - http://www.scopus.com/inward/record.url?scp=81855176949&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=81855176949&partnerID=8YFLogxK
M3  - Conference contribution
SN  - 3540433589
VL  - 2314
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 259
EP  - 268
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB  - Springer Verlag
ER  -