Evolution of protein structural classes and protein sequence families

In-Geol Choi, Sung Hou Kim

Research output: Contribution to journal › Article

54 Citations (Scopus)

Abstract

In protein structure space, protein structures cluster into four elongated regions when mapped based solely on similarity among the 3D structures. These four regions correspond to the four major classes of present-day proteins defined by the contents of secondary structure types and their topological arrangement. Evolution of and restriction to these four classes suggest that, in most cases, the evolution of genes may have been constrained or selected to those genetic changes that result in structurally stable proteins occupying one of the four "allowed" regions of the protein structure space, "structural selection," an important component of natural selection in gene evolution. Our studies on tracing the "common structural ancestor" for each protein sequence family of known structure suggest that: (i) recently emerged proteins belong mostly to three classes; (ii) the proteins that emerged earlier evolved to gain a new class; and (iii) the proteins that emerged earliest evolved to become the present-day proteins in the four major classes, with the fourth-class proteins becoming the most dominant population. Furthermore, our studies also show that not all present-day proteins evolved from one single set of proteins in the last common ancestral organism, but new common ancestral proteins were "born" at different evolutionary times, not traceable to one or two ancestral proteins: "the multiple birth model" for the evolution of protein sequence families.

Original language: English
Pages (from-to): 14056-14061
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 103
Issue number: 38
DOIs: 10.1073/pnas.0606239103
Publication status: Published - 2006 Sep 19
Externally published: Yes

Fingerprint

Proteins
Multiple Birth Offspring
Genetic Selection
Genes
Population

Keywords

  • Common structural ancestor
  • Evolutionary age
  • Protein fold classes
  • Protein structure universe

ASJC Scopus subject areas

  • Genetics
  • General

Cite this

Evolution of protein structural classes and protein sequence families. / Choi, In-Geol; Kim, Sung Hou.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 103, No. 38, 19.09.2006, p. 14056-14061.

@article{957e7cc078d842e6ba830c6110d148da,
title = "Evolution of protein structural classes and protein sequence families",
abstract = "In protein structure space, protein structures cluster into four elongated regions when mapped based solely on similarity among the 3D structures. These four regions correspond to the four major classes of present-day proteins defined by the contents of secondary structure types and their topological arrangement. Evolution of and restriction to these four classes suggest that, in most cases, the evolution of genes may have been constrained or selected to those genetic changes that result in structurally stable proteins occupying one of the four {"}allowed{"} regions of the protein structure space, {"}structural selection,{"} an important component of natural selection in gene evolution. Our studies on tracing the {"}common structural ancestor{"} for each protein sequence family of known structure suggest that: (i) recently emerged proteins belong mostly to three classes; (ii) the proteins that emerged earlier evolved to gain a new class; and (iii) the proteins that emerged earliest evolved to become the present-day proteins in the four major classes, with the fourth-class proteins becoming the most dominant population. Furthermore, our studies also show that not all present-day proteins evolved from one single set of proteins in the last common ancestral organism, but new common ancestral proteins were {"}born{"} at different evolutionary times, not traceable to one or two ancestral proteins: {"}the multiple birth model{"} for the evolution of protein sequence families.",
keywords = "Common structural ancestor, Evolutionary age, Protein fold classes, Protein structure universe",
author = "Choi, In-Geol and Kim, {Sung Hou}",
year = "2006",
month = "9",
day = "19",
doi = "10.1073/pnas.0606239103",
language = "English",
volume = "103",
pages = "14056--14061",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "38",
}

TY - JOUR

T1 - Evolution of protein structural classes and protein sequence families

AU - Choi, In-Geol

AU - Kim, Sung Hou

PY - 2006/9/19

Y1 - 2006/9/19

N2 - In protein structure space, protein structures cluster into four elongated regions when mapped based solely on similarity among the 3D structures. These four regions correspond to the four major classes of present-day proteins defined by the contents of secondary structure types and their topological arrangement. Evolution of and restriction to these four classes suggest that, in most cases, the evolution of genes may have been constrained or selected to those genetic changes that result in structurally stable proteins occupying one of the four "allowed" regions of the protein structure space, "structural selection," an important component of natural selection in gene evolution. Our studies on tracing the "common structural ancestor" for each protein sequence family of known structure suggest that: (i) recently emerged proteins belong mostly to three classes; (ii) the proteins that emerged earlier evolved to gain a new class; and (iii) the proteins that emerged earliest evolved to become the present-day proteins in the four major classes, with the fourth-class proteins becoming the most dominant population. Furthermore, our studies also show that not all present-day proteins evolved from one single set of proteins in the last common ancestral organism, but new common ancestral proteins were "born" at different evolutionary times, not traceable to one or two ancestral proteins: "the multiple birth model" for the evolution of protein sequence families.

AB - In protein structure space, protein structures cluster into four elongated regions when mapped based solely on similarity among the 3D structures. These four regions correspond to the four major classes of present-day proteins defined by the contents of secondary structure types and their topological arrangement. Evolution of and restriction to these four classes suggest that, in most cases, the evolution of genes may have been constrained or selected to those genetic changes that result in structurally stable proteins occupying one of the four "allowed" regions of the protein structure space, "structural selection," an important component of natural selection in gene evolution. Our studies on tracing the "common structural ancestor" for each protein sequence family of known structure suggest that: (i) recently emerged proteins belong mostly to three classes; (ii) the proteins that emerged earlier evolved to gain a new class; and (iii) the proteins that emerged earliest evolved to become the present-day proteins in the four major classes, with the fourth-class proteins becoming the most dominant population. Furthermore, our studies also show that not all present-day proteins evolved from one single set of proteins in the last common ancestral organism, but new common ancestral proteins were "born" at different evolutionary times, not traceable to one or two ancestral proteins: "the multiple birth model" for the evolution of protein sequence families.

KW - Common structural ancestor

KW - Evolutionary age

KW - Protein fold classes

KW - Protein structure universe

UR - http://www.scopus.com/inward/record.url?scp=33749014910&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33749014910&partnerID=8YFLogxK

U2 - 10.1073/pnas.0606239103

DO - 10.1073/pnas.0606239103

M3 - Article

C2 - 16959887

AN - SCOPUS:33749014910

VL - 103

SP - 14056

EP - 14061

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 38

ER -