TY - GEN
T1 - Two-stage framework for visualization of clustered high dimensional data
AU - Choo, Jaegul
AU - Bohn, Shawn
AU - Park, Haesun
PY - 2009
Y1 - 2009
N2 - In this paper, we discuss dimension reduction methods for 2D visualization of high-dimensional clustered data. We propose a two-stage framework for visualizing such data based on dimension reduction methods. In the first stage, we obtain the reduced-dimensional data by applying a supervised dimension reduction method such as linear discriminant analysis, which preserves the original cluster structure in terms of its criteria. The resulting optimal reduced dimension depends on the optimization criteria and is often larger than 2. In the second stage, the dimension is further reduced to 2 for visualization purposes by another dimension reduction method such as principal component analysis. The role of the second stage is to minimize the loss of information due to reducing the dimension all the way to 2. Using this framework, we propose several two-stage methods and present their theoretical characteristics as well as experimental comparisons on both artificial and real-world text data sets.
AB - In this paper, we discuss dimension reduction methods for 2D visualization of high-dimensional clustered data. We propose a two-stage framework for visualizing such data based on dimension reduction methods. In the first stage, we obtain the reduced-dimensional data by applying a supervised dimension reduction method such as linear discriminant analysis, which preserves the original cluster structure in terms of its criteria. The resulting optimal reduced dimension depends on the optimization criteria and is often larger than 2. In the second stage, the dimension is further reduced to 2 for visualization purposes by another dimension reduction method such as principal component analysis. The role of the second stage is to minimize the loss of information due to reducing the dimension all the way to 2. Using this framework, we propose several two-stage methods and present their theoretical characteristics as well as experimental comparisons on both artificial and real-world text data sets.
KW - 2D projection
KW - Clustered data
KW - Dimension reduction
KW - Generalized singular value decomposition
KW - Linear discriminant analysis
KW - Orthogonal centroid method
KW - Principal component analysis
KW - Regularization
UR - http://www.scopus.com/inward/record.url?scp=72849129962&partnerID=8YFLogxK
U2 - 10.1109/VAST.2009.5332629
DO - 10.1109/VAST.2009.5332629
M3 - Conference contribution
AN - SCOPUS:72849129962
SN - 9781424452835
T3 - VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings
SP - 67
EP - 74
BT - VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings
T2 - VAST 09 - IEEE Symposium on Visual Analytics Science and Technology
Y2 - 12 October 2009 through 13 October 2009
ER -