High-Accuracy Disulfide Bond Network Prediction with HHblits and Its Application
Date: 2022-03-06 10:17 Source: graduation thesis Author: graduation thesis
Abstract

It is well known that a protein's three-dimensional (3D) structure is closely tied to its biological function, and accurate prediction of protein structure from sequence is urgently needed. Cysteine-containing proteins occupy an important place in nature, yet no effective method exists that specifically targets structure prediction for these proteins. Because cysteine-containing proteins often form disulfide bonds, a distinctive structural element, an effective method for predicting disulfide connectivity would help predict the 3D structure of these proteins and would also provide information about their function.

In this study, building on earlier work, we explore a new method for predicting the disulfide bonding pattern of a protein. The core approach is machine learning on extracted features: a model predicts a bonding probability or regression value, and this value serves as the edge weight in a maximum-weight matching, which yields a computational model for predicting the disulfide bonding pattern. The main components are: (1) Modeller is used to predict 3D structures from homologous proteins, and a separate model is trained on the result; (2) HHblits is used to obtain the likely disulfide bonds, forming a second model; (3) a DNN is trained on conventionally extracted features as a third model. The three models are fused to give the final bonding score, and maximum-weight matching is then applied. Cross-validation and independent tests on several benchmark datasets show that the method achieves good predictive performance and outperforms many existing sequence-based predictors.

Keywords: disulfide bond; deep neural network; protein sequence; protein tertiary structure; hidden Markov model; feature extraction

Graduation Design Report: Foreign-Language Abstract

Title: The Prediction and Application of High Accurate Disulfide Network

Abstract

As is well known, the three-dimensional (3D) structure of a protein is closely related to its biological function. There is an urgent need to predict protein structure from a single sequence, bridging protein sequence and structure. Cysteine-containing proteins occupy an important position in nature, but there is no effective method aimed specifically at predicting these proteins. Because cysteine-containing proteins often form disulfide bonds, a distinctive structural fragment, an effective way to predict the disulfide connectivity pattern can help predict the three-dimensional structure of cysteine-containing proteins and also provides information about their function.

In this study, based on earlier research, we propose a new approach to predicting the disulfide bonding pattern of a protein. The main method uses machine learning on extracted features to predict the probability, or a regression value, of disulfide bonding. This value is then used as the edge weight for maximum-weight matching, which gives the predicted disulfide bonding pattern. We first use Modeller to predict the three-dimensional structure from homologous proteins and train an individual model. We use HHblits to obtain the corresponding possible disulfide bonds. Based on traditional feature extraction, we adopt a DNN trained on the features to give a predicted probability. The three models are integrated to give the final predicted bonding value, followed by maximum-weight matching. In cross-validation and independent tests on different benchmarks, the method achieves good predictive performance, better than many existing sequence-based predictors.

Keywords: disulfide, DNN, protein sequence, three-dimensional protein structure, HMM, MSAs, feature extraction

Contents
1 Introduction
1.1 Definition of the disulfide bond
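As an illustration of the fusion and matching steps described in the abstract, the following is a minimal Python sketch, not the thesis's implementation. It assumes each of the three models gives a score per cysteine pair, fuses them by an equal-weight average (the abstract does not state the fusion rule), and finds the pairing by exhaustive search rather than a dedicated maximum-weight matching algorithm, which is adequate only for a small number of cysteines. The names fuse_scores and best_pairing, the residue positions, and all scores are hypothetical.

```python
def fuse_scores(model_scores, weights=None):
    """Combine per-pair scores from several models into one edge weight.

    model_scores: list of dicts mapping (i, j) cysteine positions (i < j)
    to a score. Equal weighting is an assumption made for this sketch.
    A pair missing from a model contributes 0 for that model.
    """
    if weights is None:
        weights = [1.0 / len(model_scores)] * len(model_scores)
    pairs = set().union(*model_scores)
    return {
        pair: sum(w * scores.get(pair, 0.0)
                  for w, scores in zip(weights, model_scores))
        for pair in pairs
    }


def best_pairing(cysteines, edge_weight):
    """Return (total_weight, pairs) for the heaviest perfect pairing.

    Exhaustive recursion: pair the lowest-numbered cysteine with every
    other one and recurse on the remainder. The number of pairings grows
    as (2n - 1)!!, so this only suits proteins with few bonded cysteines.
    """
    cysteines = sorted(cysteines)
    if len(cysteines) % 2:
        raise ValueError("expected an even number of bonded cysteines")
    if not cysteines:
        return 0.0, []
    first, rest = cysteines[0], cysteines[1:]
    best_total, best_pairs = float("-inf"), []
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        sub_total, sub_pairs = best_pairing(remaining, edge_weight)
        total = edge_weight.get((first, partner), 0.0) + sub_total
        if total > best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs


# Hypothetical scores for four cysteines at positions 3, 15, 22 and 40.
structure_model = {(3, 15): 0.2, (3, 22): 0.9, (3, 40): 0.1,
                   (15, 22): 0.1, (15, 40): 0.8, (22, 40): 0.3}
homology_model = {(3, 22): 0.7, (15, 40): 0.6}
dnn_model = {(3, 15): 0.4, (3, 22): 0.6, (3, 40): 0.2,
             (15, 22): 0.3, (15, 40): 0.7, (22, 40): 0.2}

fused = fuse_scores([structure_model, homology_model, dnn_model])
total, pattern = best_pairing([3, 15, 22, 40], fused)
print(pattern)  # [(3, 22), (15, 40)]
```

For proteins with many cysteines, a polynomial-time maximum-weight matching routine would replace the exhaustive search; the fused edge weights would be used in the same way.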