Journal of Zhejiang University SCIENCE
(ISSN 1009-3095, Monthly)
2005 Vol. 6B No. 5 p.408-412
Statistical properties of nucleotide clusters in DNA sequences
CHENG Jun†¹, ZHANG Lin-xi²
(¹Department of Physics, Jinhua University, Jinhua 321017, China)
(²Department of Physics, Zhejiang University, Hangzhou 310027, China)
†E-mail: Jh_Chengjun@163.com
Received Oct. 29, 2004; revision accepted Jan. 27, 2005
Abstract: Using the complete genome of Plasmodium falciparum 3D7, which has 14 chromosomes, as an example, we examined the distribution functions for the amount of C or G and of A or T in consecutive, non-overlapping blocks of m bases. The distribution function P(S) of the size S of consecutive C-G or A-T content clusters conforms to the relation $P(S) \propto e^{-\alpha S}$. The values of the scaling exponent $\alpha_{CG}$ are much larger than those of $\alpha_{AT}$; $\alpha_{AT}$ hardly changes across the 14 chromosomes, whereas $\alpha_{CG}$ fluctuates from chromosome to chromosome. We found that the maximum A-T cluster size is much larger than the maximum C-G cluster size, which implies the existence of large A-T clusters. Our study of the width function $\xi(m)$ of the cluster C-G content showed that it follows a good power law, $\xi(m) \propto m^{-\gamma}$, with an average $\gamma$ of 0.931 over the 14 chromosomes. These investigations provide some insight into the nucleotide clusters of DNA sequences and help us understand other properties of DNA sequences.
Key words: DNA sequence, Plasmodium falciparum 3D7, Nucleotide clusters, Power law
doi:10.1631/jzus.2005.B0408
CLC number: Q615
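The cluster-size analysis summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that a "cluster" is a maximal run of consecutive C/G (or A/T) bases, that any other symbol (such as N) breaks a run, and that the exponent α is estimated by an ordinary least-squares fit of ln P(S) against S. The function names `cluster_sizes`, `size_distribution`, and `fit_alpha` are hypothetical, and the width function ξ(m) is not covered because its definition is not given here.

```python
import math
from collections import Counter
from itertools import groupby


def base_class(base):
    """Map a base to its class: 'CG', 'AT', or None for any other symbol."""
    if base in "CG":
        return "CG"
    if base in "AT":
        return "AT"
    return None


def cluster_sizes(seq):
    """Return (cg_sizes, at_sizes), the lengths of maximal C/G and A/T runs.

    Assumption: a cluster is a maximal run of bases of the same class;
    symbols outside A, C, G, T end the current run and are not counted.
    """
    sizes = {"CG": [], "AT": []}
    for cls, run in groupby(seq.upper(), key=base_class):
        if cls is not None:
            sizes[cls].append(sum(1 for _ in run))
    return sizes["CG"], sizes["AT"]


def size_distribution(sizes):
    """Normalized histogram P(S) of cluster sizes, as a dict {S: P(S)}."""
    counts = Counter(sizes)
    total = len(sizes)
    return {s: counts[s] / total for s in sorted(counts)}


def fit_alpha(dist):
    """Estimate alpha in P(S) ~ exp(-alpha * S) by least squares on ln P(S)."""
    xs = list(dist)
    ys = [math.log(p) for p in dist.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct cluster sizes to fit")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # slope of ln P(S) versus S is -alpha
```

For one chromosome, one would call `cluster_sizes` on the sequence string, pass each list to `size_distribution`, and compare the two fitted exponents. A plain fit on ln P(S) gives the sparsely populated tail as much weight as the well-sampled small sizes, so binning or a weighted fit may be preferable for real data.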