An efficient grid-based k-prototypes algorithm for sustainable decision-making on spatial objects

Hong Jun Jang, Byoungwook Kim, Jongwan Kim, Soon Young Jung

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Data mining plays a critical role in sustainable decision-making. Although the k-prototypes algorithm is one of the best-known algorithms for clustering data with both numeric and categorical attributes, clustering a large number of spatial objects with mixed attributes is still inefficient because of its computational complexity. In this paper, we propose an efficient grid-based k-prototypes algorithm, GK-prototypes, which achieves high performance for clustering spatial objects. The first proposed algorithm uses both the maximum and the minimum distance between cluster centers and a cell to avoid unnecessary distance calculations. The second proposed algorithm, an extension of the first, exploits spatial dependence: spatial objects that are close to each other tend to be similar. Each cell has a bitmap index that stores, for each attribute, the categorical values of all objects within the cell. This bitmap index can improve performance when the categorical data is skewed. Experimental results show that the proposed algorithms achieve better performance than the existing pruning techniques for the k-prototypes algorithm.
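
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the axis-aligned rectangular cell, and the example land-use values are assumptions made here for illustration. The pruning step covers only the numeric (spatial) part of the k-prototypes distance; a complete bound would also have to account for the categorical mismatch cost.

```python
import math


def min_max_dist(center, cell):
    """Return (min, max) Euclidean distance from a point to a rectangular cell.

    `center` is (x, y); `cell` is ((x_lo, y_lo), (x_hi, y_hi)).
    """
    (x_lo, y_lo), (x_hi, y_hi) = cell
    cx, cy = center
    # Nearest point of the cell: per-axis gap, zero if the center lies inside.
    dx_min = max(x_lo - cx, 0.0, cx - x_hi)
    dy_min = max(y_lo - cy, 0.0, cy - y_hi)
    # Farthest point of the cell is always one of its corners.
    dx_max = max(abs(cx - x_lo), abs(cx - x_hi))
    dy_max = max(abs(cy - y_lo), abs(cy - y_hi))
    return math.hypot(dx_min, dy_min), math.hypot(dx_max, dy_max)


def candidate_centers(centers, cell):
    """Indices of centers that could be nearest to some object in `cell`.

    A center is skipped when even its minimum distance to the cell exceeds
    the smallest maximum distance achieved by any other center.
    """
    bounds = [min_max_dist(c, cell) for c in centers]
    best_max = min(hi for _, hi in bounds)
    return [i for i, (lo, _) in enumerate(bounds) if lo <= best_max]


def cell_bitmap(values, domain):
    """Bitmap with one bit per categorical value that occurs in the cell."""
    bits = 0
    for v in values:
        bits |= 1 << domain.index(v)
    return bits


def may_match(bitmap, value, domain):
    """False means no object in the cell has `value` for this attribute."""
    return bool((bitmap >> domain.index(value)) & 1)


cell = ((0.0, 0.0), (1.0, 1.0))
centers = [(0.5, 0.5), (10.0, 10.0), (1.5, 0.5)]
kept = candidate_centers(centers, cell)  # the far center at (10, 10) is pruned

domain = ["forest", "urban", "water"]  # hypothetical categorical domain
bitmap = cell_bitmap(["urban", "urban", "forest"], domain)
```

When `may_match` returns False for a cluster center's categorical value, every object in the cell mismatches on that attribute, so the categorical cost for the whole cell is known without inspecting individual objects. This pays off most when the values are skewed and many cells contain only a few distinct values.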

Original language: English
Article number: 2614
Journal: Sustainability (Switzerland)
Volume: 10
Issue number: 8
DOI: 10.3390/su10082614
Publication status: Published - 2018 Jul 25

Keywords

  • Clustering
  • Data mining
  • Grid-based k-prototypes
  • Spatial data
  • Sustainability

ASJC Scopus subject areas

  • Geography, Planning and Development
  • Renewable Energy, Sustainability and the Environment
  • Management, Monitoring, Policy and Law

Cite this

An efficient grid-based k-prototypes algorithm for sustainable decision-making on spatial objects. / Jang, Hong Jun; Kim, Byoungwook; Kim, Jongwan; Jung, Soon Young.

In: Sustainability (Switzerland), Vol. 10, No. 8, 2614, 25.07.2018.

@article{2b688829927448a0975ae080c2115771,
title = "An efficient grid-based k-prototypes algorithm for sustainable decision-making on spatial objects",
abstract = "Data mining plays a critical role in sustainable decision-making. Although the k-prototypes algorithm is one of the best-known algorithms for clustering both numeric and categorical data, clustering a large number of spatial objects with mixed numeric and categorical attributes is still inefficient due to complexity. In this paper, we propose an efficient grid-based k-prototypes algorithm, GK-prototypes, which achieves high performance for clustering spatial objects. The first proposed algorithm utilizes both maximum and minimum distance between cluster centers and a cell, which can reduce unnecessary distance calculation. The second proposed algorithm as an extension of the first proposed algorithm, utilizes spatial dependence; spatial data tends to be similar to objects that are close. Each cell has a bitmap index which stores the categorical values of all objects within the same cell for each attribute. This bitmap index can improve performance if the categorical data is skewed. Experimental results show that the proposed algorithms can achieve better performance than the existing pruning techniques of the k-prototypes algorithm.",
keywords = "Clustering, Data mining, Grid-based k-prototypes, Spatial data, Sustainability",
author = "Jang, {Hong Jun} and Kim, Byoungwook and Kim, Jongwan and Jung, {Soon Young}",
year = "2018",
month = "7",
day = "25",
doi = "10.3390/su10082614",
language = "English",
volume = "10",
journal = "Sustainability",
issn = "2071-1050",
publisher = "MDPI AG",
number = "8",
}

TY - JOUR
T1 - An efficient grid-based k-prototypes algorithm for sustainable decision-making on spatial objects
AU - Jang, Hong Jun
AU - Kim, Byoungwook
AU - Kim, Jongwan
AU - Jung, Soon Young
PY - 2018/7/25
Y1 - 2018/7/25
AB - Data mining plays a critical role in sustainable decision-making. Although the k-prototypes algorithm is one of the best-known algorithms for clustering both numeric and categorical data, clustering a large number of spatial objects with mixed numeric and categorical attributes is still inefficient due to complexity. In this paper, we propose an efficient grid-based k-prototypes algorithm, GK-prototypes, which achieves high performance for clustering spatial objects. The first proposed algorithm utilizes both maximum and minimum distance between cluster centers and a cell, which can reduce unnecessary distance calculation. The second proposed algorithm as an extension of the first proposed algorithm, utilizes spatial dependence; spatial data tends to be similar to objects that are close. Each cell has a bitmap index which stores the categorical values of all objects within the same cell for each attribute. This bitmap index can improve performance if the categorical data is skewed. Experimental results show that the proposed algorithms can achieve better performance than the existing pruning techniques of the k-prototypes algorithm.
KW - Clustering
KW - Data mining
KW - Grid-based k-prototypes
KW - Spatial data
KW - Sustainability
UR - http://www.scopus.com/inward/record.url?scp=85050472794&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050472794&partnerID=8YFLogxK
U2 - 10.3390/su10082614
DO - 10.3390/su10082614
M3 - Article
AN - SCOPUS:85050472794
VL - 10
JO - Sustainability
JF - Sustainability
SN - 2071-1050
IS - 8
M1 - 2614
ER -