An automatic code classification system by using memory-based learning and information retrieval technique

Heui Seok Lim, Won Kyu Hoon Lee, Hyeoncheol Kim, Soon Young Jeong, Heonchang Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper proposes an automatic code classification system for Korean census data that uses information retrieval and memory-based learning techniques. The purpose of the proposed system is to convert natural language responses on survey questionnaires into the corresponding numeric codes according to the standard code book from the Census Bureau. The system was trained by memory-based learning and tested on 46,762 industry records and 36,286 occupation records. It was evaluated using 10-fold cross-validation. In the experiments, the proposed system achieved production rates of 99.10% and 92.88% for level 2 and level 5 codes, respectively.
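The abstract does not give the system's internals, so the following is only a minimal sketch of the general idea: a memory-based learner that stores every coded response and assigns a new response the code of its most similar stored example, with similarity computed by a common information retrieval weighting (TF-IDF with cosine similarity). The class name `MemoryBasedCoder`, the whitespace tokenizer, the English sample responses, and the codes are all hypothetical and are not taken from the paper or from any official code book.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; real Korean responses would
    # need morphological analysis, which this sketch does not attempt.
    return text.lower().split()


class MemoryBasedCoder:
    """Keeps all training examples in memory and classifies a response
    by the code of the most similar stored example."""

    def __init__(self, examples):
        # examples: list of (response_text, code) pairs
        self.codes = [code for _, code in examples]
        docs = [Counter(tokenize(text)) for text, _ in examples]
        n = len(docs)
        # Document frequency: number of stored examples containing each term
        df = Counter(term for doc in docs for term in doc)
        # Smoothed inverse document frequency
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vectors = [self._weight(doc) for doc in docs]

    def _weight(self, counts):
        # TF-IDF weights, L2-normalised so a dot product is a cosine
        vec = {t: tf * self.idf[t] for t, tf in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def classify(self, text):
        # Returns (code, similarity); (None, 0.0) if nothing matches
        query = self._weight(Counter(tokenize(text)))
        best_code, best_sim = None, 0.0
        for vec, code in zip(self.vectors, self.codes):
            sim = sum(w * vec.get(t, 0.0) for t, w in query.items())
            if sim > best_sim:
                best_code, best_sim = code, sim
        return best_code, best_sim


# Hypothetical toy data; the codes are placeholders, not real census codes.
coder = MemoryBasedCoder([
    ("rice farming", "01"),
    ("software development company", "62"),
    ("elementary school teacher", "85"),
])
code, similarity = coder.classify("software company")
```

A 10-fold cross-validation like the one reported in the abstract would repeat this with each tenth of the records held out in turn for classification while the remaining nine tenths are stored as examples.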

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 577-582
Number of pages: 6
Volume: 3689 LNCS
DOI: https://doi.org/10.1007/11562382_53
Publication status: Published - 2005 Dec 1
Event: 2nd Asia Information Retrieval Symposium, AIRS 2005 - Jeju Island, Korea, Republic of
Duration: 2005 Oct 13 → 2005 Oct 15

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3689 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 2nd Asia Information Retrieval Symposium, AIRS 2005
Country: Korea, Republic of
City: Jeju Island
Period: 05/10/13 → 05/10/15

Fingerprint

Information Storage and Retrieval
Censuses
Information retrieval
Learning
Data storage equipment
Census
Occupations
Industry
Language
Numerics
Cross-validation
Questionnaire
Natural Language
Convert
Fold
Experimental Results
Surveys and Questionnaires

ASJC Scopus subject areas

  • Computer Science(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Theoretical Computer Science

Cite this

Lim, H. S., Lee, W. K. H., Kim, H., Jeong, S. Y., & Yu, H. (2005). An automatic code classification system by using memory-based learning and information retrieval technique. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3689 LNCS, pp. 577-582). https://doi.org/10.1007/11562382_53

@inproceedings{d4261a739ba9452da39779622f2b5a9a,
title = "An automatic code classification system by using memory-based learning and information retrieval technique",
abstract = "This paper proposes an automatic code classification system for Korean census data that uses information retrieval and memory-based learning techniques. The purpose of the proposed system is to convert natural language responses on survey questionnaires into the corresponding numeric codes according to the standard code book from the Census Bureau. The system was trained by memory-based learning and tested on 46,762 industry records and 36,286 occupation records. It was evaluated using 10-fold cross-validation. In the experiments, the proposed system achieved production rates of 99.10{\%} and 92.88{\%} for level 2 and level 5 codes, respectively.",
author = "Lim, {Heui Seok} and Lee, {Won Kyu Hoon} and Hyeoncheol Kim and Jeong, {Soon Young} and Heonchang Yu",
year = "2005",
month = "12",
day = "1",
doi = "10.1007/11562382_53",
language = "English",
isbn = "3540291865",
volume = "3689 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "577--582",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - An automatic code classification system by using memory-based learning and information retrieval technique

AU - Lim, Heui Seok

AU - Lee, Won Kyu Hoon

AU - Kim, Hyeoncheol

AU - Jeong, Soon Young

AU - Yu, Heonchang

PY - 2005/12/1

Y1 - 2005/12/1

AB - This paper proposes an automatic code classification system for Korean census data that uses information retrieval and memory-based learning techniques. The purpose of the proposed system is to convert natural language responses on survey questionnaires into the corresponding numeric codes according to the standard code book from the Census Bureau. The system was trained by memory-based learning and tested on 46,762 industry records and 36,286 occupation records. It was evaluated using 10-fold cross-validation. In the experiments, the proposed system achieved production rates of 99.10% and 92.88% for level 2 and level 5 codes, respectively.

UR - http://www.scopus.com/inward/record.url?scp=33646123065&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33646123065&partnerID=8YFLogxK

U2 - 10.1007/11562382_53

DO - 10.1007/11562382_53

M3 - Conference contribution

AN - SCOPUS:33646123065

SN - 3540291865

SN - 9783540291862

VL - 3689 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 577

EP - 582

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -