Journal of Zhejiang University SCIENCE
(ISSN 1009-3095, Monthly)

2005   Vol. 6B   No. 5   p.408-412


On-line Access Date: Apr. 10, 2005

Statistical properties of nucleotide clusters in DNA sequences

CHENG Jun†¹, ZHANG Lin-xi²

(¹Department of Physics, Jinhua University, Jinhua 321017, China)
(²Department of Physics, Zhejiang University, Hangzhou 310027, China)
E-mail: Jh_Chengjun@163.com
Received Oct. 29, 2004; revision accepted Jan. 27, 2005

Abstract: Using the complete genome of Plasmodium falciparum 3D7, which has 14 chromosomes, as an example, we examined the distribution functions for the amount of C or G and of A or T in consecutive, non-overlapping blocks of m bases. The distribution function P(S) of consecutive C-G or A-T clusters of size S conforms to the relation P(S) ∝ e^{αS}. The values of the scaling exponent α_CG are much larger than those of α_AT; α_AT hardly changes across the 14 chromosomes, whereas α_CG fluctuates considerably. The maximum A-T cluster size is much larger than the maximum C-G cluster size, which implies the existence of large A-T clusters. The width function ξ(m) of the C-G content follows a good power law, ξ(m) ∝ m^γ, with an average γ of 0.931 over the 14 chromosomes. These investigations provide some insight into the nucleotide clusters of DNA sequences and help us understand other properties of DNA sequences.

Key words: DNA sequence, Plasmodium falciparum 3D7, Nucleotide clusters, Power law
doi:10.1631/jzus.2005.B0408             CLC number: Q615
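
The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and the definitions are assumptions made for illustration: a "cluster" is taken to be a maximal run of consecutive bases from one class (C/G or A/T), and the width ξ(m) is taken to be the standard deviation of the C-G count over consecutive, non-overlapping blocks of m bases. The function names and the toy sequence are hypothetical.

```python
import math
from collections import Counter
from itertools import groupby


def cluster_sizes(seq, letters):
    """Lengths of maximal runs of bases drawn from `letters` (e.g. "CG").

    Assumed definition of a cluster: an unbroken run of bases from one class.
    """
    member = set(letters.upper())
    return [
        sum(1 for _ in run)
        for in_class, run in groupby(seq.upper(), key=lambda b: b in member)
        if in_class
    ]


def size_distribution(sizes):
    """Empirical P(S): fraction of clusters that have size S."""
    counts = Counter(sizes)
    total = len(sizes)
    return {s: c / total for s, c in sorted(counts.items())}


def block_width(seq, m, letters="CG"):
    """Assumed xi(m): standard deviation of the number of `letters` bases
    over consecutive, non-overlapping blocks of m bases."""
    member = set(letters.upper())
    seq = seq.upper()
    n_blocks = len(seq) // m  # a trailing partial block is ignored
    if n_blocks == 0:
        raise ValueError("sequence is shorter than one block")
    counts = [
        sum(b in member for b in seq[i * m:(i + 1) * m])
        for i in range(n_blocks)
    ]
    mean = sum(counts) / n_blocks
    variance = sum((c - mean) ** 2 for c in counts) / n_blocks
    return math.sqrt(variance)


# Toy sequence, for illustration only (not genome data).
toy = "ATTACGGATCCCGAAT"
cg_sizes = cluster_sizes(toy, "CG")
at_sizes = cluster_sizes(toy, "AT")
```

On a real chromosome, the exponent α could then be estimated from a straight-line fit of ln P(S) against S, and γ from a straight-line fit of ln ξ(m) against ln m over a range of block sizes m. Both fits are left out here because the source does not specify the fitting procedure.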