TY - GEN
T1 - A non-parametric method for data clustering with optimal variable weighting
AU - Chung, Ji Won
AU - Choi, In Chan
PY - 2006
Y1 - 2006
AB - Since cluster analysis in data mining often deals with large-scale, high-dimensional data containing masking variables, it is important to remove non-contributing variables for accurate cluster recovery and for proper interpretation of clustering results. Although the weights obtained by variable weighting methods can be used for variable selection (or elimination), the weights alone rarely provide clear guidance for selecting variables for subsequent analysis. In addition, variable selection and variable weighting are highly interrelated with the choice of the number of clusters. In this paper, we propose a non-parametric data clustering method, based on W-k-means-type clustering, that makes an automated, joint decision on selecting variables, determining variable weights, and deciding the number of clusters. Conclusions are drawn from computational experiments with random data and real-life data.
UR - http://www.scopus.com/inward/record.url?scp=33750545329&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750545329&partnerID=8YFLogxK
U2 - 10.1007/11875581_97
DO - 10.1007/11875581_97
M3 - Conference contribution
AN - SCOPUS:33750545329
SN - 3540454853
SN - 9783540454854
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 807
EP - 814
BT - Intelligent Data Engineering and Automated Learning, IDEAL 2006 - 7th International Conference, Proceedings
PB - Springer Verlag
T2 - 7th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2006
Y2 - 20 September 2006 through 23 September 2006
ER -