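Abstract
With the completion of the Human Genome Project and the genome projects of several model organisms, biological data in public databases are growing at an ever faster rate. How to interpret, extract, and obtain useful biological information from this mass of data has become the next urgent problem for the genome projects. Bioinformatics, as a new interdisciplinary field, covers a very rich range of research topics. This thesis consists mainly of the following two parts:
In Chapter 2, we survey graphical representation methods for DNA sequences and protein sequences. We first briefly introduce 2-D, 3-D, and other graphical representations of DNA sequences; we then introduce graphical representations of protein sequences; finally, we summarize the numerical characterization methods used when graphical representations are applied in bioinformatics.
In Chapter 3, we generalize the “four horizontal lines” graphical representation of DNA sequences and, based on the 5-letter model of the 20 amino acids, propose a new graphical representation of protein sequences. This method simplifies the complex operations otherwise required when working with protein sequences built from 20 amino acids.
Keywords: bioinformatics; graphical representation; numerical characterization; DNA sequence; protein sequence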
A Study of Graphical Representation Methods for Biological Sequences
Abstract
The main contents of this thesis are as follows:
In Chapter two, we survey graphical representation methods for DNA sequences and protein sequences. First, we briefly introduce 2-D, 3-D, and other graphical representations of DNA sequences. Then we introduce graphical representations of protein sequences. Finally, we summarize the numerical characterization methods used when graphical representations are applied in bioinformatics.
In Chapter three, we focus on the graphical representation of protein sequences. A protein sequence differs from a DNA sequence: it is a string over twenty kinds of characters (the amino acids), so it is more complex than a DNA sequence, which is a string over only four. To reduce this complexity, we use the five-letter model introduced in the previous chapter to simplify the protein sequence. By generalizing the “four horizontal lines” representation of DNA sequences on this basis, we obtain a new graphical representation of protein sequences.
Keywords: DNA sequences, quotient matrices, normalized leading eigenvalue, similarity, protein sequence, 5-letter model
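As a rough illustration of the Chapter three idea, the sketch below reduces a protein sequence to a five-letter alphabet and places each residue on one of five horizontal lines, in the spirit of the “four horizontal lines” plot for DNA. The abstract does not state which five-letter grouping or line ordering the thesis uses, so the grouping in GROUPS (one commonly cited reduction), the line assignment in LINE_OF, and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: one commonly cited 5-letter reduction of the 20 amino acids.
# The thesis may use a different grouping.
GROUPS: Dict[str, str] = {
    "I": "CMFILVWY",
    "A": "ATH",
    "G": "GP",
    "E": "DE",
    "K": "SNQRK",
}

# ASSUMPTION: hypothetical assignment of each class letter to a line y = 1..5.
LINE_OF: Dict[str, int] = {"I": 1, "A": 2, "G": 3, "E": 4, "K": 5}

# Lookup table from each of the 20 residues to its class letter.
RESIDUE_TO_CLASS: Dict[str, str] = {
    aa: rep for rep, members in GROUPS.items() for aa in members
}


def reduce_sequence(protein: str) -> str:
    """Rewrite a protein sequence over the reduced five-letter alphabet."""
    return "".join(RESIDUE_TO_CLASS[aa] for aa in protein.upper())


def horizontal_line_points(protein: str) -> List[Tuple[int, int]]:
    """Return (position, line) pairs; joining them in order gives the zigzag curve."""
    reduced = reduce_sequence(protein)
    return [(i, LINE_OF[c]) for i, c in enumerate(reduced, start=1)]
```

For example, reduce_sequence("MKGDA") gives "IKGEA", and horizontal_line_points("MKGDA") gives the points (1, 1), (2, 5), (3, 3), (4, 4), (5, 2). In this sketch, any character outside the 20 standard amino acids raises a KeyError.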
Table of Contents