Structure-based functional inference in structural genomics

Sung Hou Kim, Dong Hae Shin, In-Geol Choi, Ursula Schulze-Gahmen, Shengfeng Chen, Rosalind Kim

Research output: Contribution to journal › Article

49 Citations (Scopus)

Abstract

The dramatically increasing number of new protein sequences arising from genomics and proteomics creates a need for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins, and to infer the molecular functions of the proteins from the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies and protocols for macromolecular structure determination, which have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at the Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information complements sequence information in inferring the functions of proteins with unknown molecular functions. Two of the major aims of structural genomics are to discover a complete repertoire of protein folds in nature and to find the molecular functions of proteins whose functions cannot be predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html).
Although infrastructure building and technology development are still the main focus of structural genomics programs [1-6], a considerable number of protein structures have already been produced, some of them coming directly out of semi-automated structure determination pipelines [6-10]. The Berkeley Structural Genomics Center (BSGC) has chosen the proteins of Mycoplasma, or their homologues from other organisms, as its structural genomics targets because of the minimal genome size of the Mycoplasmas and their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several example proteins, covering five situations, that span the range of functional inferences obtainable from three-dimensional structures; in each case the inferences are new, testable, and not predictable from protein sequence information alone.

Original language: English
Pages (from-to): 129-135
Number of pages: 7
Journal: Journal of Structural and Functional Genomics
Volume: 4
Issue number: 2-3
DOIs: 10.1023/A:1026200610644
Publication status: Published - 2003 Nov 13
Externally published: Yes

Fingerprint

Genomics
Proteins
Mycoplasma
Technology
Genome Size
Proteomics
Virulence
Animals
Pipelines
Genes
Throughput

Keywords

  • Berkeley Structural Genomics Center
  • Molecular function
  • Protein function
  • Structural genomics

ASJC Scopus subject areas

  • Genetics
  • Structural Biology
  • Biochemistry

Cite this

Structure-based functional inference in structural genomics. / Kim, Sung Hou; Shin, Dong Hae; Choi, In-Geol; Schulze-Gahmen, Ursula; Chen, Shengfeng; Kim, Rosalind.

In: Journal of Structural and Functional Genomics, Vol. 4, No. 2-3, 13.11.2003, p. 129-135.

Kim, Sung Hou ; Shin, Dong Hae ; Choi, In-Geol ; Schulze-Gahmen, Ursula ; Chen, Shengfeng ; Kim, Rosalind. / Structure-based functional inference in structural genomics. In: Journal of Structural and Functional Genomics. 2003 ; Vol. 4, No. 2-3. pp. 129-135.
@article{71f8aed90a2349e9888e4a7d343db403,
title = "Structure-based functional inference in structural genomics",
keywords = "Berkeley Structural Genomics Center, Molecular function, Protein function, Structural genomics",
author = "Kim, {Sung Hou} and Shin, {Dong Hae} and In-Geol Choi and Ursula Schulze-Gahmen and Shengfeng Chen and Rosalind Kim",
year = "2003",
month = "11",
day = "13",
doi = "10.1023/A:1026200610644",
language = "English",
volume = "4",
pages = "129--135",
journal = "Journal of Structural and Functional Genomics",
issn = "1345-711X",
publisher = "Springer Netherlands",
number = "2-3",

}

TY - JOUR

T1 - Structure-based functional inference in structural genomics

AU - Kim, Sung Hou

AU - Shin, Dong Hae

AU - Choi, In-Geol

AU - Schulze-Gahmen, Ursula

AU - Chen, Shengfeng

AU - Kim, Rosalind

PY - 2003/11/13

Y1 - 2003/11/13

KW - Berkeley Structural Genomics Center

KW - Molecular function

KW - Protein function

KW - Structural genomics

UR - http://www.scopus.com/inward/record.url?scp=0242355016&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0242355016&partnerID=8YFLogxK

U2 - 10.1023/A:1026200610644

DO - 10.1023/A:1026200610644

M3 - Article

C2 - 14649297

AN - SCOPUS:0242355016

VL - 4

SP - 129

EP - 135

JO - Journal of Structural and Functional Genomics

JF - Journal of Structural and Functional Genomics

SN - 1345-711X

IS - 2-3

ER -