Codon and amino-acid distribution in DNA

J. K. Kim, S. I. Yang, Y. H. Kwon, Eun Il Lee

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

According to Zipf's law, the rank-ordered frequency distribution of words in natural language can be modelled by a power law. In this paper, we examine the frequency distribution of the 64 codons over the coding and non-coding regions of 88 DNA sequences from the EMBL and GenBank databases, using exponential fitting. We also regard the 20 amino acids as a vocabulary, apply the same frequency analysis to the same data, and show that amino acids can be used as biologically meaningful words for Zipf's approach. Our analysis suggests that a natural-language structure may exist not only in the coding regions of DNA but also in the non-coding regions.
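The kind of rank-frequency analysis described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not specify the reading frame, the handling of stop codons, or how the fits were performed, so the sketch assumes non-overlapping triplets in frame 0, drops stop codons when pooling into amino acids, and fits both a power law f(r) = C r^(-alpha) and an exponential f(r) = A exp(-beta r) by least squares on log-transformed frequencies. All function names are hypothetical; the codon table is the standard genetic code.

```python
from collections import Counter
from math import exp, log

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def codon_counts(seq):
    """Count non-overlapping triplets in frame 0 (assumption); skip ambiguous ones."""
    seq = seq.upper()
    triplets = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return Counter(t for t in triplets if t in CODON_TABLE)

def amino_acid_counts(codons):
    """Pool codon counts by encoded amino acid, dropping stop codons (assumption)."""
    pooled = Counter()
    for codon, n in codons.items():
        aa = CODON_TABLE[codon]
        if aa != "*":
            pooled[aa] += n
    return pooled

def rank_frequency(counts):
    """Relative frequencies of observed symbols, sorted so that rank 1 comes first."""
    total = sum(counts.values())
    return [n / total for n in sorted(counts.values(), reverse=True)]

def _linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two ranks to fit a line")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, my - slope * mx, r2

def fit_power_law(freqs):
    """Fit f(r) = C * r**(-alpha) by regressing log f on log r; returns (alpha, C, r2)."""
    ranks = range(1, len(freqs) + 1)
    slope, intercept, r2 = _linear_fit([log(r) for r in ranks], [log(f) for f in freqs])
    return -slope, exp(intercept), r2

def fit_exponential(freqs):
    """Fit f(r) = A * exp(-beta * r) by regressing log f on r; returns (beta, A, r2)."""
    ranks = list(range(1, len(freqs) + 1))
    slope, intercept, r2 = _linear_fit(ranks, [log(f) for f in freqs])
    return -slope, exp(intercept), r2

if __name__ == "__main__":
    toy = "ATGATGATGGCTGCTTAA"  # toy sequence for illustration only, not real data
    codons = codon_counts(toy)
    print("codon power-law fit:", fit_power_law(rank_frequency(codons)))
    print("codon exponential fit:", fit_exponential(rank_frequency(codons)))
    print("amino-acid counts:", dict(amino_acid_counts(codons)))
```

Comparing the two r2 values (both computed in log space) gives a rough indication of which functional form describes a given rank-frequency curve better. With only 64 codons or 20 amino acids the ranks span less than two decades, so such comparisons should be read with caution.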

Original language: English
Pages (from-to): 1795-1807
Number of pages: 13
Journal: Chaos, Solitons and Fractals
Volume: 23
Issue number: 5
DOI: 10.1016/j.chaos.2004.07.027
Publication status: Published - 2005 Mar 1

Fingerprint

Natural Language
Amino Acids
Deoxyribonucleic Acid
Coding
Exponential Fitting
Zipf's Law
Frequency Analysis
Power Law
Frequency Distribution

ASJC Scopus subject areas

  • Statistical and Nonlinear Physics

Cite this

Codon and amino-acid distribution in DNA. / Kim, J. K.; Yang, S. I.; Kwon, Y. H.; Lee, Eun Il.

In: Chaos, Solitons and Fractals, Vol. 23, No. 5, 01.03.2005, p. 1795-1807.

@article{5a288e910cab43ec8bafef87e34965fa,
title = "Codon and amino-acid distribution in DNA",
author = "Kim, {J. K.} and Yang, {S. I.} and Kwon, {Y. H.} and Lee, {Eun Il}",
year = "2005",
month = "3",
day = "1",
doi = "10.1016/j.chaos.2004.07.027",
language = "English",
volume = "23",
pages = "1795--1807",
journal = "Chaos, Solitons and Fractals",
issn = "0960-0779",
publisher = "Elsevier Limited",
number = "5",
}

TY  - JOUR
T1  - Codon and amino-acid distribution in DNA
AU  - Kim, J. K.
AU  - Yang, S. I.
AU  - Kwon, Y. H.
AU  - Lee, Eun Il
PY  - 2005/3/1
Y1  - 2005/3/1
UR  - http://www.scopus.com/inward/record.url?scp=9544250390&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=9544250390&partnerID=8YFLogxK
U2  - 10.1016/j.chaos.2004.07.027
DO  - 10.1016/j.chaos.2004.07.027
M3  - Article
VL  - 23
SP  - 1795
EP  - 1807
JO  - Chaos, Solitons and Fractals
JF  - Chaos, Solitons and Fractals
SN  - 0960-0779
IS  - 5
ER  -