Reusing of information constructed in HTML documents: A conversion of HTML into OWL

Hoon Hwangbo, Hong Chul Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Efforts to build a knowledge-based web are represented by the Semantic Web. In this context, however, HTML is not well suited as a language for expressing ontologies or structuring information. Because a vast amount of information already exists in HTML, it is reasonable to reuse that data. Previous studies do not support a broad conversion of HTML into OWL: they focus mainly on structured data (table tags) and demonstrate only simple cases. In addition, GRDDL, a W3C Recommendation, requires an additional script for the conversion, and its output format is RDF, which has some restrictions. This paper proposes a conversion in three steps: (1) extracting information, (2) acquiring triples, and (3) constructing an ontology. Information is of two types: text-formed and non-text-formed. Likewise, tags are of two kinds: those that contain only text-formed information, and those that contain both text-formed and non-text-formed information. We classify tags into categories according to these kinds and define rules for each category. Applying those rules yields triples, from which the ontology is finally constructed.
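
The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the actual tag categories or rules, so the sets `TEXT_ONLY_TAGS` and `MIXED_TAGS`, the predicate names such as `a_href`, the `example.org` base IRI, and the helpers `html_to_triples` and `to_turtle` are all assumptions made for illustration. Only the Python standard library is used.

```python
import json
from html.parser import HTMLParser

# Illustrative tag categories only (assumption); the paper's real categories
# and rules are not stated in the abstract.
TEXT_ONLY_TAGS = {"title", "h1", "h2", "p", "li"}  # text-formed information only
MIXED_TAGS = {"a": "href", "img": "src"}  # tag -> attribute holding non-text information
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}  # tags without an end tag


class TripleExtractor(HTMLParser):
    """Steps 1 and 2: extract information and emit (subject, predicate, object) triples."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject
        self.stack = []  # currently open tags, innermost last
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)
        attr = MIXED_TAGS.get(tag)
        value = dict(attrs).get(attr) if attr else None
        if value:  # non-text-formed information carried by an attribute
            self.triples.append((self.subject, f"{tag}_{attr}", value))

    def handle_endtag(self, tag):
        if tag in self.stack:  # tolerate unclosed inner tags
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        tag = self.stack[-1] if self.stack else None
        if text and (tag in TEXT_ONLY_TAGS or tag in MIXED_TAGS):
            self.triples.append((self.subject, f"{tag}_text", text))


def html_to_triples(html, subject):
    parser = TripleExtractor(subject)
    parser.feed(html)
    parser.close()
    return parser.triples


def to_turtle(triples, base="http://example.org/onto#"):
    """Step 3: wrap the triples in a minimal OWL ontology, serialised as Turtle."""
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        f"@prefix : <{base}> .",
        f"<{base.rstrip('#')}> a owl:Ontology .",
    ]
    for subject in sorted({s for s, _, _ in triples}):
        lines.append(f":{subject} a owl:NamedIndividual .")
    for prop in sorted({p for _, p, _ in triples}):
        lines.append(f":{prop} a owl:DatatypeProperty .")
    for s, p, o in triples:
        # json.dumps is a simple stand-in for Turtle string escaping.
        lines.append(f":{s} :{p} {json.dumps(o)} .")
    return "\n".join(lines)


if __name__ == "__main__":
    page = '<title>Demo</title><p>Hello <a href="http://example.org/x">link</a></p><img src="pic.png">'
    print(to_turtle(html_to_triples(page, "page1")))
```

The sketch treats every value as a string literal and declares every predicate as a datatype property. A fuller conversion would need rules that distinguish links to other resources (object properties) from plain text.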

Original language: English
Title of host publication: 2008 International Conference on Control, Automation and Systems, ICCAS 2008
Pages: 871-875
Number of pages: 5
DOI: https://doi.org/10.1109/ICCAS.2008.4694654
Publication status: Published - 2008 Dec 1
Event: 2008 International Conference on Control, Automation and Systems, ICCAS 2008 - Seoul, Korea, Republic of
Duration: 2008 Oct 14 to 2008 Oct 17

Other

Other: 2008 International Conference on Control, Automation and Systems, ICCAS 2008
Country: Korea, Republic of
City: Seoul
Period: 08/10/14 to 08/10/17

Fingerprint

HTML
Ontology
Semantics

Keywords

  • Analyzing system of English grammar
  • Conversion
  • Data extraction
  • HTML
  • OWL
  • Reusing information

ASJC Scopus subject areas

  • Control and Systems Engineering

Cite this

Hwangbo, H., & Lee, H. C. (2008). Reusing of information constructed in HTML documents: A conversion of HTML into OWL. In 2008 International Conference on Control, Automation and Systems, ICCAS 2008 (pp. 871-875). [4694654] https://doi.org/10.1109/ICCAS.2008.4694654

@inproceedings{ef7524e3952345fdbb283abdbab4e08d,
title = "Reusing of information constructed in HTML documents: A conversion of HTML into OWL",
abstract = "Efforts to build a knowledge-based web are represented by the Semantic Web. In this context, however, HTML is not well suited as a language for expressing ontologies or structuring information. Because a vast amount of information already exists in HTML, it is reasonable to reuse that data. Previous studies do not support a broad conversion of HTML into OWL: they focus mainly on structured data (table tags) and demonstrate only simple cases. In addition, GRDDL, a W3C Recommendation, requires an additional script for the conversion, and its output format is RDF, which has some restrictions. This paper proposes a conversion in three steps: (1) extracting information, (2) acquiring triples, and (3) constructing an ontology. Information is of two types: text-formed and non-text-formed. Likewise, tags are of two kinds: those that contain only text-formed information, and those that contain both text-formed and non-text-formed information. We classify tags into categories according to these kinds and define rules for each category. Applying those rules yields triples, from which the ontology is finally constructed.",
keywords = "Analyzing system of English grammar, Conversion, Data extraction, HTML, OWL, Reusing information",
author = "Hwangbo, Hoon and Lee, {Hong Chul}",
year = "2008",
month = "12",
day = "1",
doi = "10.1109/ICCAS.2008.4694654",
language = "English",
isbn = "9788995003893",
pages = "871--875",
booktitle = "2008 International Conference on Control, Automation and Systems, ICCAS 2008",

}

TY - GEN

T1 - Reusing of information constructed in HTML documents

T2 - A conversion of HTML into OWL

AU - Hwangbo, Hoon

AU - Lee, Hong Chul

PY - 2008/12/1

Y1 - 2008/12/1

N2 - Efforts to build a knowledge-based web are represented by the Semantic Web. In this context, however, HTML is not well suited as a language for expressing ontologies or structuring information. Because a vast amount of information already exists in HTML, it is reasonable to reuse that data. Previous studies do not support a broad conversion of HTML into OWL: they focus mainly on structured data (table tags) and demonstrate only simple cases. In addition, GRDDL, a W3C Recommendation, requires an additional script for the conversion, and its output format is RDF, which has some restrictions. This paper proposes a conversion in three steps: (1) extracting information, (2) acquiring triples, and (3) constructing an ontology. Information is of two types: text-formed and non-text-formed. Likewise, tags are of two kinds: those that contain only text-formed information, and those that contain both text-formed and non-text-formed information. We classify tags into categories according to these kinds and define rules for each category. Applying those rules yields triples, from which the ontology is finally constructed.

KW - Analyzing system of English grammar

KW - Conversion

KW - Data extraction

KW - HTML

KW - OWL

KW - Reusing information

UR - http://www.scopus.com/inward/record.url?scp=58149101999&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=58149101999&partnerID=8YFLogxK

U2 - 10.1109/ICCAS.2008.4694654

DO - 10.1109/ICCAS.2008.4694654

M3 - Conference contribution

AN - SCOPUS:58149101999

SN - 9788995003893

SP - 871

EP - 875

BT - 2008 International Conference on Control, Automation and Systems, ICCAS 2008

ER -