Local feature frequency profile: A method to measure structural similarity in proteins

In-Geol Choi, Jaimyoung Kwon, Sung Hou Kim

Research output: Contribution to journal › Article

64 Citations (Scopus)

Abstract

Measures of structural similarity between known protein structures provide an objective basis for classifying protein folds and for revealing a global view of the protein structure universe. Here, we describe a rapid method to measure structural similarity based on the profiles of representative local features of Cα distance matrices of compared protein structures. We first extract a finite number of representative local feature (LF) patterns from the distance matrices of all protein fold families by medoid analysis. Then, each Cα distance matrix of a protein structure is encoded by labeling all its submatrices by the index of the nearest representative LF patterns. Finally, the structure is represented by the frequency distribution of these indices, which we call the LF frequency (LFF) profile of the protein. The LFF profile allows one to calculate structural similarity scores among a large number of protein structures quickly, and also to construct and update the "map" of the protein structure universe easily. The LFF profile method efficiently maps complex protein structures into a common Euclidean space without prior assignment of secondary structure information or structural alignment.
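The encoding steps described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation. The representative LF patterns are taken as given rather than derived by medoid analysis. The submatrix size, the use of every overlapping submatrix, the normalization of counts to frequencies, and the Euclidean distance between profiles are all assumptions. Every function name here (`distance_matrix`, `submatrices`, `nearest_index`, `lff_profile`, `profile_distance`) is hypothetical.

```python
import math
from collections import Counter


def distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def submatrices(dm, size):
    """Yield every overlapping size x size submatrix, flattened row by row.

    Assumption: a sliding window over all positions; the abstract does not
    state the window size or stride.
    """
    n = len(dm)
    for i in range(n - size + 1):
        for j in range(n - size + 1):
            yield [dm[i + r][j + c] for r in range(size) for c in range(size)]


def nearest_index(vec, patterns):
    """Index of the representative LF pattern closest to vec."""
    return min(range(len(patterns)), key=lambda k: math.dist(vec, patterns[k]))


def lff_profile(coords, patterns, size):
    """Frequency of nearest-pattern indices over all submatrices.

    Assumption: counts are normalized to sum to 1 so that structures of
    different lengths can be compared.
    """
    dm = distance_matrix(coords)
    counts = Counter(nearest_index(v, patterns) for v in submatrices(dm, size))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("structure is shorter than the submatrix size")
    return [counts[k] / total for k in range(len(patterns))]


def profile_distance(p, q):
    """Euclidean distance between two LFF profiles (assumed score)."""
    return math.dist(p, q)


if __name__ == "__main__":
    # Toy input: five points on a straight line, two made-up 2 x 2 patterns.
    toy_coords = [(float(i), 0.0, 0.0) for i in range(5)]
    toy_patterns = [[0.0, 1.0, 1.0, 0.0], [3.0, 3.0, 3.0, 3.0]]
    print(lff_profile(toy_coords, toy_patterns, size=2))
```

Because each structure is reduced to a fixed-length vector, comparing two structures costs one vector distance regardless of chain length, which is consistent with the abstract's claim that similarity scores can be computed quickly without structural alignment.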

Original language: English
Pages (from-to): 3797-3802
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 101
Issue number: 11
DOI: 10.1073/pnas.0308656100
Publication status: Published - 2004 Mar 16
Externally published: Yes

Keywords

  • Local protein structural features profile
  • Protein distance matrix
  • Protein fold
  • Protein fold space
  • Protein structural similarity

ASJC Scopus subject areas

  • Genetics
  • General

Cite this

Local feature frequency profile: A method to measure structural similarity in proteins. / Choi, In-Geol; Kwon, Jaimyoung; Kim, Sung Hou.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 101, No. 11, 16.03.2004, p. 3797-3802.

@article{f45b7228bf49425bbb5b7bb224ad5cb8,
title = "Local feature frequency profile: A method to measure structural similarity in proteins",
abstract = "Measures of structural similarity between known protein structures provide an objective basis for classifying protein folds and for revealing a global view of the protein structure universe. Here, we describe a rapid method to measure structural similarity based on the profiles of representative local features of Cα distance matrices of compared protein structures. We first extract a finite number of representative local feature (LF) patterns from the distance matrices of all protein fold families by medoid analysis. Then, each Cα distance matrix of a protein structure is encoded by labeling all its submatrices by the index of the nearest representative LF patterns. Finally, the structure is represented by the frequency distribution of these indices, which we call the LF frequency (LFF) profile of the protein. The LFF profile allows one to calculate structural similarity scores among a large number of protein structures quickly, and also to construct and update the {"}map{"} of the protein structure universe easily. The LFF profile method efficiently maps complex protein structures into a common Euclidean space without prior assignment of secondary structure information or structural alignment.",
keywords = "Local protein structural features profile, Protein distance matrix, Protein fold, Protein fold space, Protein structural similarity",
author = "In-Geol Choi and Jaimyoung Kwon and Kim, {Sung Hou}",
year = "2004",
month = "3",
day = "16",
doi = "10.1073/pnas.0308656100",
language = "English",
volume = "101",
pages = "3797--3802",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "11",

}

TY - JOUR

T1 - Local feature frequency profile

T2 - A method to measure structural similarity in proteins

AU - Choi, In-Geol

AU - Kwon, Jaimyoung

AU - Kim, Sung Hou

PY - 2004/3/16

Y1 - 2004/3/16

N2 - Measures of structural similarity between known protein structures provide an objective basis for classifying protein folds and for revealing a global view of the protein structure universe. Here, we describe a rapid method to measure structural similarity based on the profiles of representative local features of Cα distance matrices of compared protein structures. We first extract a finite number of representative local feature (LF) patterns from the distance matrices of all protein fold families by medoid analysis. Then, each Cα distance matrix of a protein structure is encoded by labeling all its submatrices by the index of the nearest representative LF patterns. Finally, the structure is represented by the frequency distribution of these indices, which we call the LF frequency (LFF) profile of the protein. The LFF profile allows one to calculate structural similarity scores among a large number of protein structures quickly, and also to construct and update the "map" of the protein structure universe easily. The LFF profile method efficiently maps complex protein structures into a common Euclidean space without prior assignment of secondary structure information or structural alignment.

KW - Local protein structural features profile

KW - Protein distance matrix

KW - Protein fold

KW - Protein fold space

KW - Protein structural similarity

UR - http://www.scopus.com/inward/record.url?scp=1642389927&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=1642389927&partnerID=8YFLogxK

U2 - 10.1073/pnas.0308656100

DO - 10.1073/pnas.0308656100

M3 - Article

C2 - 14985506

AN - SCOPUS:1642389927

VL - 101

SP - 3797

EP - 3802

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 11

ER -