An application of information retrieval technique to automated code classification

Heui Seok Lim, Seong Hoon Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper describes an application of information retrieval techniques to automated industry and occupation code classification for Korean Census records. The purpose of the proposed system is to convert natural language responses on survey questionnaires into the corresponding numeric codes according to the standard code book from the Census Bureau. The system was evaluated on 46,762 industry records and 36,286 occupation records using 10-fold cross-validation. In the experiments, the system achieved production rates of 87.08% and 66.08% when classifying industry records into level 2 and level 5 codes, respectively. In semi-automated mode, it achieved production rates of 99.10% and 92.88% for level 2 and level 5 codes, respectively.
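The abstract does not state which retrieval model the system uses, so the following is only an illustrative sketch of the general idea: treat each code book entry as a document, treat a survey response as a query, and rank codes by similarity. It assumes TF-IDF weighting with cosine similarity, whitespace tokenization, and a tiny made-up code book; the entries in `CODE_BOOK` and the names `build_index` and `rank_codes` are hypothetical and are not taken from the paper or from any Census code book. Returning the top-ranked code corresponds loosely to automated coding, while returning the top `k` candidates for a human coder to choose from is one plausible reading of the semi-automated mode.

```python
import math
from collections import Counter

# Hypothetical code book entries (illustrative only, not real Census codes).
CODE_BOOK = {
    "10": "food products manufacturing bakery bread noodles",
    "47": "retail trade store shop sales",
    "85": "education school teaching academy",
}


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption; real Korean text
    would need morphological analysis)."""
    return text.lower().split()


def build_index(code_book):
    """Build a TF-IDF vector for every code description."""
    term_counts = {code: Counter(tokenize(desc)) for code, desc in code_book.items()}
    n_docs = len(term_counts)
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        code: {t: c * idf[t] for t, c in counts.items()}
        for code, counts in term_counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_codes(response, vectors, idf, k=3):
    """Return the k best (code, score) pairs for a free-text response."""
    counts = Counter(tokenize(response))
    # Terms that never occur in the code book carry no weight.
    query = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scored = [(cosine(query, vec), code) for code, vec in vectors.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(code, score) for score, code in scored[:k]]


if __name__ == "__main__":
    vectors, idf = build_index(CODE_BOOK)
    for code, score in rank_codes("bread bakery shop", vectors, idf, k=2):
        print(code, round(score, 3))
```

In a setup like this, a score threshold on the top candidate could decide whether a record is coded automatically or routed to a human coder, which is the kind of trade-off a production rate would measure; the threshold and its tuning are left out here because the abstract gives no details.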

Original language: English
Title of host publication: Knowledge-Based Intelligent Information and Engineering Systems - 9th International Conference, KES 2005, Proceedings
Publisher: Springer Verlag
Pages: 90-96
Number of pages: 7
ISBN (Print): 3540288945, 9783540288947
Publication status: Published - 2005
Externally published: Yes
Event: 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2005 - Melbourne, Australia
Duration: 2005 Sept 14 to 2005 Sept 16

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3681 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2005
Country/Territory: Australia
City: Melbourne
Period: 05/9/14 to 05/9/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
