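Summary
OpenStack is a private cloud platform built on virtualization technology, and Hadoop is a distributed framework for big data processing. This graduation project first builds an OpenStack platform, then runs a Hadoop cluster on it for big data processing and analysis, and maintains that cluster. The virtualization that OpenStack provides, together with its Sahara project, supplies the environment for Hadoop, so that the Hadoop cluster is deployed on the OpenStack platform. Hadoop's HDFS distributed file system, the MapReduce model, and HBase's storage strategy are then used to process and store big data. Complex, obscure low-level logic is encapsulated and integrated, and exposed through an upper layer that is easier to work with. Once the whole system works correctly, routine maintenance is carried out, and server and cluster status are monitored in real time and reported promptly to the user in an intuitive, easily understood form.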
This thesis contains 9 figures, 2 tables, and 11 references.
Keywords: big data; cloud computing platform; OpenStack; Hadoop; operations and maintenance
Deployment and Maintenance of Hadoop on OpenStack
Abstract
OpenStack is an open-source cloud management platform, and Hadoop is a distributed system infrastructure. This graduation project builds a Hadoop cluster on the OpenStack platform for distributed data processing and storage, and maintains that cluster. OpenStack's virtualization technology and its Sahara project provide the environment that Hadoop needs, so the Hadoop cluster is deployed on the OpenStack platform. The project then uses Hadoop's HDFS distributed file system, the MapReduce model, and HBase, a distributed store for large data sets, to analyze, process, and store data. The complex low-level logic is encapsulated and integrated, and exposed through an upper layer that is easier to reach. Provided the whole system works properly, routine maintenance is performed, and server status and cluster operation are monitored in real time, with feedback presented to the user in a plain, visual form.
Keywords: big data; cloud computing platform; OpenStack; Hadoop; deployment and maintenance
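Contents
Summary
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 Current State of OpenStack Development
1.2 Current State of Hadoop Development
1.3 Research Content and Main Contributions
2 OpenStack Platform Preparation
2.1 Deployment of OpenStack Components
2.2 Chapter Summary
3 Hadoop Deployment and Maintenance
3.1 Distributed Hadoop Cluster Deployment
3.2 Overview and Deployment of Hadoop Components
3.3 Chapter Summary
4 Hadoop in Practice
4.1 Hadoop Deployment Practice
4.2 HBase Deployment Practice
4.3 Chapter Summary
5 Hadoop Operations and Maintenance
5.1 Configuration File Distribution Strategy
5.2 Minding Configured Memory Limits
5.3 Enabling the Trash Feature
5.4 Regularly Backing Up Usable Replicas
5.5 Chapter Summary
6 Conclusion
References
Acknowledgments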
List of Figures
Figure No. Figure Title