Two-stage framework for visualization of clustered high dimensional data

Jaegul Choo, Shawn Bohn, Haesun Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

In this paper, we discuss dimension reduction methods for 2D visualization of high dimensional clustered data. We propose a two-stage framework for visualizing such data based on dimension reduction methods. In the first stage, we obtain the reduced dimensional data by applying a supervised dimension reduction method, such as linear discriminant analysis, which preserves the original cluster structure in terms of its criteria. The resulting optimal reduced dimension depends on the optimization criteria and is often larger than 2. In the second stage, the dimension is further reduced to 2 for visualization purposes by another dimension reduction method, such as principal component analysis. The role of the second stage is to minimize the loss of information due to reducing the dimension all the way to 2. Using this framework, we propose several two-stage methods and present their theoretical characteristics as well as experimental comparisons on both artificial and real-world text data sets.
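The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`lda_stage`, `pca_stage`, `two_stage_projection`), the regularization value `reg`, and the synthetic data are all assumptions made for illustration, and the paper's own methods (for example those based on the generalized singular value decomposition or the orthogonal centroid method) may differ in detail. Stage 1 applies a regularized linear discriminant analysis that reduces k clusters to at most k - 1 dimensions; stage 2 applies principal component analysis to bring that result down to 2.

```python
import numpy as np


def lda_stage(X, y, reg=1e-3):
    """Stage 1 (sketch): regularized LDA down to at most k - 1 dimensions."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-cluster scatter
    Sb = np.zeros((d, d))  # between-cluster scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    # Regularize Sw (assumed value) so the Cholesky factorization exists.
    L = np.linalg.cholesky(Sw + reg * np.eye(d))
    # Symmetric form of the generalized eigenproblem: L^{-1} Sb L^{-T}.
    M = np.linalg.solve(L, np.linalg.solve(L, Sb).T)
    evals, V = np.linalg.eigh((M + M.T) / 2)
    top = np.argsort(evals)[::-1][: len(classes) - 1]
    G = np.linalg.solve(L.T, V[:, top])  # map back: G = L^{-T} V
    return X @ G


def pca_stage(Z, n_components=2):
    """Stage 2 (sketch): PCA on the stage-1 output."""
    Zc = Z - Z.mean(axis=0)
    # Rows of Vt are principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt[:n_components].T


def two_stage_projection(X, y, reg=1e-3):
    """Chain both stages to obtain 2D coordinates for plotting."""
    return pca_stage(lda_stage(X, y, reg=reg), n_components=2)


# Synthetic example (hypothetical data): 4 clusters in 10 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(4, 10))
y = np.repeat(np.arange(4), 30)
X = centers[y] + rng.normal(size=(120, 10))
Y = two_stage_projection(X, y)
print(Y.shape)  # (120, 2)
```

With 4 clusters, stage 1 yields a 3-dimensional representation, which is why a second reduction is needed before plotting; the rows of `Y` can be passed to any scatter-plot routine and colored by `y`.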

Original language: English
Title of host publication: VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings
Pages: 67-74
Number of pages: 8
DOIs: https://doi.org/10.1109/VAST.2009.5332629
Publication status: Published - 2009 Dec 1
Externally published: Yes
Event: VAST 09 - IEEE Symposium on Visual Analytics Science and Technology - Atlantic City, NJ, United States
Duration: 2009 Oct 12 → 2009 Oct 13

Publication series

Name: VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings

Conference

Conference: VAST 09 - IEEE Symposium on Visual Analytics Science and Technology
Country: United States
City: Atlantic City, NJ
Period: 09/10/12 → 09/10/13

Fingerprint

Visualization
Discriminant analysis
Principal component analysis

Keywords

  • 2D projection
  • Clustered data
  • Dimension reduction
  • Generalized singular value decomposition
  • Linear discriminant analysis
  • Orthogonal centroid method
  • Principal component analysis
  • Regularization

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Information Systems

Cite this

Choo, J., Bohn, S., & Park, H. (2009). Two-stage framework for visualization of clustered high dimensional data. In VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings (pp. 67-74). [5332629] (VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings). https://doi.org/10.1109/VAST.2009.5332629

@inproceedings{246cfcbe154d4ec5b31e45681fb605d2,
title = "Two-stage framework for visualization of clustered high dimensional data",
keywords = "2D projection, Clustered data, Dimension reduction, Generalized singular value decomposition, Linear discriminant analysis, Orthogonal centroid method, Principal component analysis, Regularization",
author = "Jaegul Choo and Shawn Bohn and Haesun Park",
year = "2009",
month = "12",
day = "1",
doi = "10.1109/VAST.2009.5332629",
language = "English",
isbn = "9781424452835",
series = "VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings",
pages = "67--74",
booktitle = "VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings",

}

TY - GEN
T1 - Two-stage framework for visualization of clustered high dimensional data
AU - Choo, Jaegul
AU - Bohn, Shawn
AU - Park, Haesun
PY - 2009/12/1
Y1 - 2009/12/1
KW - 2D projection
KW - Clustered data
KW - Dimension reduction
KW - Generalized singular value decomposition
KW - Linear discriminant analysis
KW - Orthogonal centroid method
KW - Principal component analysis
KW - Regularization
UR - http://www.scopus.com/inward/record.url?scp=72849129962&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72849129962&partnerID=8YFLogxK
U2 - 10.1109/VAST.2009.5332629
DO - 10.1109/VAST.2009.5332629
M3 - Conference contribution
AN - SCOPUS:72849129962
SN - 9781424452835
T3 - VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings
SP - 67
EP - 74
BT - VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings
ER -