
Date: 2024-07-09 22:15 Source: 95770



Abstract: Clustering analysis is a very important step in data mining. A variety of clustering methods can speed up the data mining process and improve the quality of its results. This paper introduces the classifications and algorithms of clustering analysis, including hierarchical clustering, K-means clustering, fuzzy clustering, ordered sample clustering, and the PAM and CLARA variants of K-medoids clustering, and implements these algorithms in the open-source software R 3.3.3. The paper focuses on how to carry out clustering analysis for mixed data: how to calculate an integrated distance for mixed data, how to determine the best number of clusters, which methods to choose for the clustering itself, and how to apply the software R. First, the Gower method is used to calculate the distance between mixed-data observations. Second, the optimal number of clusters is determined according to the silhouette width. Third, the PAM algorithm and the CLARA algorithm are used to cluster the mixed data, and their results are compared. Finally, the Byar prostate cancer data set is used for an empirical analysis. The empirical analysis finds that both clustering methods cluster mixed data well. However, the two methods give somewhat different clustering results on the Byar data set. Analyzing these differences suggests some possible reasons for them, which provides a basis for further study.

Keywords: Clustering Analysis, Mixed Data, PAM Algorithm, CLARA Algorithm, R
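
The workflow described in the abstract (Gower distance, then choosing the number of clusters by silhouette width, then K-medoids clustering) can be illustrated with a minimal sketch. This is not the thesis's R code, which is not shown here; it is a plain Python illustration under several assumptions. The toy rows are hypothetical and are not the Byar data, the function names are made up for this example, all variables are weighted equally with no missing values, and the medoid search uses only a naive start followed by PAM-style swaps rather than the full PAM BUILD phase. CLARA is not covered.

```python
from itertools import combinations

def gower_matrix(rows, is_numeric):
    """Pairwise Gower dissimilarities: range-scaled absolute difference for
    numeric columns, simple mismatch (0/1) for categorical columns."""
    n, p = len(rows), len(is_numeric)
    ranges = [max(r[j] for r in rows) - min(r[j] for r in rows) if is_numeric[j] else None
              for j in range(p)]
    d = [[0.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        total = 0.0
        for j in range(p):
            if is_numeric[j]:
                if ranges[j] > 0:  # a constant column contributes nothing
                    total += abs(rows[a][j] - rows[b][j]) / ranges[j]
            elif rows[a][j] != rows[b][j]:
                total += 1.0
        d[a][b] = d[b][a] = total / p  # equal weights (assumption)
    return d

def total_cost(d, medoids):
    """Sum of each point's dissimilarity to its nearest medoid."""
    return sum(min(d[i][m] for m in medoids) for i in range(len(d)))

def pam_swap(d, k):
    """PAM-style swap search from a naive start (first k points)."""
    n = len(d)
    medoids = list(range(k))
    best = total_cost(d, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(d, trial)
                if cost < best - 1e-12:  # accept only strict improvements
                    medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]
    return medoids, labels

def avg_silhouette(d, labels):
    """Average silhouette width; assumes at least two non-empty clusters."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    widths = []
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            widths.append(0.0)  # convention for singleton clusters
            continue
        a = sum(d[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(d[i][j] for j in m) / len(m)
                for other, m in clusters.items() if other != c)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)

# Hypothetical toy rows: (age, tumour stage, a numeric lab score).
rows = [(60, "III", 1.0), (62, "III", 1.2), (61, "III", 0.9),
        (75, "IV", 3.0), (78, "IV", 3.2), (77, "IV", 2.9)]
is_numeric = [True, False, True]

dist = gower_matrix(rows, is_numeric)
scores = {}
for k in range(2, 5):
    _, labels = pam_swap(dist, k)
    scores[k] = (avg_silhouette(dist, labels), labels)
best_k = max(scores, key=lambda k: scores[k][0])
best_labels = scores[best_k][1]
print(best_k, best_labels)
```

Because every step works on the dissimilarity matrix alone, the same silhouette and medoid code applies to any distance, which is what makes the Gower distance a convenient entry point for mixed data.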


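Chapter 1 Introduction

1.1 Research Background and Significance

1.1.1 Research Background

1.1.2 Research Significance

1.2 Clustering Methods for Mixed Data and the State of Research

1.2.1 Clustering Methods for Mixed Data

1.2.2 State of Research on the K-medoids Algorithm

1.2.3 Problems with the K-medoids Algorithm

1.3 Main Research Content and Structure of This Thesis

Chapter 2 Classification and Algorithms of Clustering Analysis

2.1 Concepts, Data Types, and Clustering Statistics of Clustering Analysis

2.1.1 The Concept of Clustering Analysis

2.1.2 Dissimilarity Measures in Clustering Analysis

Clustering Analysis of Mixed Data and Its Implementation in R: http://www.youerw.com/shuxue/lunwen_204234.html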
