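Abstract: Following this principle, this project uses the Hadoop data processing framework to process data, which means configuring a Hadoop cluster. (Hadoop is written mainly in Java.) Processing data across multiple nodes requires several virtual nodes to meet the requirements of MapReduce, so VMware Workstation virtual machines were used to install Ubuntu, which runs alongside Windows 7 as a second system. (Ubuntu is an open-source Linux distribution.) Three Ubuntu systems were set up to form a small three-host Hadoop cluster. One host is the master node, which mainly runs Hadoop's NameNode, SecondaryNameNode, and JobTracker; the other two hosts are slave nodes, each running a DataNode and a TaskTracker. One of the slaves is there for redundancy, so a simulated Hadoop cluster needs at least three nodes; if the computer is powerful enough, more slave nodes can be added.
Keywords: Ubuntu; Hadoop; MapReduce; fully distributed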
Analyzing Customer Behavior Patterns
Based on a Big Data Analysis Framework
Abstract: For this project, the Hadoop data processing framework is used to process data, which means configuring a Hadoop cluster. (Hadoop is written mainly in Java.) Multi-node data processing needs several virtual nodes to satisfy the requirements of MapReduce, so Ubuntu was installed in VMware Workstation virtual machines, giving a second system that runs alongside Windows 7. (Ubuntu is an open-source Linux distribution.) Three Ubuntu systems together form a small three-machine Hadoop cluster. One machine is the master node, which mainly runs the NameNode, SecondaryNameNode, and JobTracker of Hadoop; the other two are slave nodes, each running a DataNode and a TaskTracker. One of the two slaves serves redundancy purposes, so a simulated Hadoop cluster should have at least three nodes; if the computer's configuration is high enough, further slave nodes can be added.
Keywords: Ubuntu; Hadoop; MapReduce; fully distributed system
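As a rough illustration of the three-node layout described in the abstract, the Python sketch below maps each host to the Hadoop daemons it runs and derives the host lists from that mapping. The hostnames `master`, `slave1`, and `slave2` and the two helper functions are hypothetical and introduced only for this example; the abstract names the roles but not the hosts. The sketch also assumes the Hadoop 1.x convention, in which `conf/masters` lists the SecondaryNameNode hosts and `conf/slaves` lists the hosts that run a DataNode and a TaskTracker.

```python
# Hypothetical hostnames: the abstract specifies the roles, not the host names.
CLUSTER = {
    "master": ["NameNode", "SecondaryNameNode", "JobTracker"],
    "slave1": ["DataNode", "TaskTracker"],
    "slave2": ["DataNode", "TaskTracker"],
}


def hosts_running(daemon, cluster=CLUSTER):
    """Return the sorted hostnames that run the given daemon."""
    return sorted(host for host, daemons in cluster.items() if daemon in daemons)


def render_host_file(hosts):
    """Render a file body with one hostname per line."""
    return "\n".join(hosts) + "\n"


# Assumed Hadoop 1.x convention: conf/masters holds the SecondaryNameNode
# hosts, conf/slaves holds the DataNode/TaskTracker hosts.
masters_file = render_host_file(hosts_running("SecondaryNameNode"))
slaves_file = render_host_file(hosts_running("DataNode"))

if __name__ == "__main__":
    print("conf/masters:\n" + masters_file)
    print("conf/slaves:\n" + slaves_file)
```

Adding a further slave node, as the abstract suggests for a more powerful computer, would only require one more entry in `CLUSTER` under these assumptions.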
Contents
1. Introduction
1.1 Research background and significance
1.1.1 Research background
1.1.2 Research significance
1.2 Thesis structure
1.3 Chapter summary
2. Related technologies
2.1 VMware Workstation virtual machine
2.1.1 Role of VMware
2.1.2 Installing and configuring VMware
2.1.3 VMware Tools, the VMware system utility plug-in
2.2 Linux
2.2.1 Role of Linux
2.2.2 Linux distributions
2.2.3 Installing and configuring Ubuntu
2.2.4 Ubuntu language settings and operating details
2.3 Java environment
2.3.1 Role of Java
2.3.2 Installing Java and configuring its environment
2.3.3 Java system configuration
2.4 SSH protocol
2.4.1 Role of SSH
2.4.2 Installing and configuring the SSH service
2.5 Hadoop framework