The Effects of Feature Optimization on High-Dimensional Essay Data

Bong Jun Yi, Do Gil Lee, Hae-Chang Rim

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Current machine learning (ML) based automated essay scoring (AES) systems employ a large and varied set of features that have proven useful in improving AES performance. However, the high-dimensional feature space is not properly represented, because a large volume of features is extracted from limited training data. This leads to poor performance and increased training time for the system. In this paper, we experiment with and analyze the effects of feature optimization, including normalization, discretization, and feature selection techniques, for different ML algorithms, taking into account the size of the feature space and the performance of the AES. We show that appropriate feature optimization techniques can reduce the dimensionality of the feature space, contributing to more efficient training and improved AES performance.
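
The three optimization steps named in the abstract (normalization, discretization, feature selection) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the four toy feature columns (word count, average sentence length, a constant flag, spelling errors), the scores, the function names, the equal-width binning, and the Pearson-correlation filter. This record does not state which specific techniques or features the paper used.

```python
from statistics import mean, pstdev

def min_max_normalize(column):
    """Rescale one feature column to [0, 1]; a constant column maps to all 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def equal_width_discretize(column, bins=4):
    """Map values already scaled to [0, 1] onto integer bin indices 0..bins-1."""
    return [min(int(v * bins), bins - 1) for v in column]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)

def select_top_k(columns, scores, k):
    """Return the indices of the k columns most correlated with the scores."""
    ranked = sorted(
        range(len(columns)),
        key=lambda j: abs(pearson(columns[j], scores)),
        reverse=True,
    )
    return sorted(ranked[:k])

# Hypothetical toy data: one row per essay, one column per feature.
rows = [
    [120, 10.0, 1, 9],
    [250, 14.0, 1, 6],
    [400, 18.0, 1, 4],
    [520, 15.0, 1, 2],
]
scores = [1, 2, 3, 4]  # hypothetical human-assigned essay scores

columns = [list(col) for col in zip(*rows)]
normalized = [min_max_normalize(col) for col in columns]
discretized = [equal_width_discretize(col) for col in normalized]
kept = select_top_k(discretized, scores, k=2)

print("discretized columns:", discretized)
print("indices of kept features:", kept)
```

In this toy setting the constant column carries no information about the score and is dropped by the filter, which is the kind of dimensionality reduction the abstract describes.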

Original language: English
Article number: 421642
Journal: Mathematical Problems in Engineering
Volume: 2015
DOI: 10.1155/2015/421642
Publication status: Published - 2015

Fingerprint

High-dimensional Data
Scoring
Learning systems
Optimization
Feature Space
Learning algorithms
Feature extraction
Machine Learning
Feature Selection
Optimization Techniques
Normalization
Learning Algorithm
High-dimensional
Discretization
Experiments
Experiment
Training

ASJC Scopus subject areas

  • Mathematics(all)
  • Engineering(all)

Cite this

The Effects of Feature Optimization on High-Dimensional Essay Data. / Yi, Bong Jun; Lee, Do Gil; Rim, Hae-Chang.

In: Mathematical Problems in Engineering, Vol. 2015, 421642, 2015.

@article{6bd2a6a4e3da422795867283d922f946,
title = "The Effects of Feature Optimization on High-Dimensional Essay Data",
abstract = "Current machine learning (ML) based automated essay scoring (AES) systems have employed various and vast numbers of features, which have been proven to be useful, in improving the performance of the AES. However, the high-dimensional feature space is not properly represented, due to the large volume of features extracted from the limited training data. As a result, this problem gives rise to poor performance and increased training time for the system. In this paper, we experiment and analyze the effects of feature optimization, including normalization, discretization, and feature selection techniques for different ML algorithms, while taking into consideration the size of the feature space and the performance of the AES. Accordingly, we show that the appropriate feature optimization techniques can reduce the dimensions of features, thus, contributing to the efficient training and performance improvement of AES.",
author = "Yi, {Bong Jun} and Lee, {Do Gil} and Rim, Hae-Chang",
year = "2015",
doi = "10.1155/2015/421642",
language = "English",
volume = "2015",
journal = "Mathematical Problems in Engineering",
issn = "1024-123X",
publisher = "Hindawi Publishing Corporation",
}

TY  - JOUR
T1  - The Effects of Feature Optimization on High-Dimensional Essay Data
AU  - Yi, Bong Jun
AU  - Lee, Do Gil
AU  - Rim, Hae-Chang
PY  - 2015
Y1  - 2015
AB  - Current machine learning (ML) based automated essay scoring (AES) systems have employed various and vast numbers of features, which have been proven to be useful, in improving the performance of the AES. However, the high-dimensional feature space is not properly represented, due to the large volume of features extracted from the limited training data. As a result, this problem gives rise to poor performance and increased training time for the system. In this paper, we experiment and analyze the effects of feature optimization, including normalization, discretization, and feature selection techniques for different ML algorithms, while taking into consideration the size of the feature space and the performance of the AES. Accordingly, we show that the appropriate feature optimization techniques can reduce the dimensions of features, thus, contributing to the efficient training and performance improvement of AES.
UR  - http://www.scopus.com/inward/record.url?scp=84945906056&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84945906056&partnerID=8YFLogxK
U2  - 10.1155/2015/421642
DO  - 10.1155/2015/421642
M3  - Article
AN  - SCOPUS:84945906056
VL  - 2015
JO  - Mathematical Problems in Engineering
JF  - Mathematical Problems in Engineering
SN  - 1024-123X
M1  - 421642
ER  -