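Abstract With the development of information technology and computer networks, data is being generated at an astonishing rate. Faced with ever-growing data volumes, the capacity and processing speed of a single host can no longer meet the demands of big data processing. Information is also taking increasingly diverse forms: beyond plain text, media files such as images play an increasingly important role in daily life. As the economy develops, image processing techniques have been applied across many industries, and container number recognition is one such application. This thesis combines big data processing with image recognition and applies both to container number recognition.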
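The work of this thesis consists of two main parts: 1. analyzing the distribution characteristics of container numbers and digitally processing container images to extract the number characters; 2. using Spark to accelerate the recognition algorithm for the number characters.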
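The study finds that, although container images taken in natural environments vary widely, the arrangement of the container number characters shares common features, and these features can be used to filter out noise and extract the characters. Spark is suited to operations on large-scale data sets: the larger the data set, the more pronounced the speedup.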
Thesis keywords image processing; pattern recognition; Spark; big data
Graduation Project Report: Foreign-Language Abstract
Title Acceleration algorithm of image similarity comparison based on Spark
Abstract With the development of information technology and computer networks, data is being produced at an astonishing speed. As data sets grow ever larger, the capacity and speed of a single machine can no longer meet the needs of big data processing. The forms of information are also becoming more diversified: beyond plain text, media files such as pictures play an increasingly important role in people's lives. With economic development, image processing techniques have been applied in many industries, and container number recognition is one example. This thesis combines big data processing with image recognition and applies both to container number recognition.
The work of this thesis has two main points: 1. analyzing the distribution pattern of container numbers and applying digital processing to container images to obtain the number characters; 2. using Spark to accelerate the container number recognition algorithm.
The study found that, when processing container images taken in natural environments, the arrangement of the number characters shows strong regularity even though the images vary widely, and this regularity can be used to filter out noise and obtain the characters. It also shows that Spark is well suited to processing large data sets: the larger the data set, the more pronounced the acceleration.
Keywords image processing; pattern recognition; Spark; big data
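Contents
1 Introduction
1.1 Background and Significance of the Project
1.2 Overview of Related Knowledge
1.2.1 Containers and Container Numbers
1.2.2 Introduction to Image Processing Techniques
1.2.3 Introduction to the KNN Algorithm
1.2.4 Introduction to Spark
1.3 Structure of the Thesis
2 Current State of Research
2.1 Image Processing
2.2 Container Number Recognition
2.3 Pattern Recognition
2.4 Big Data Processing Technology
3 Algorithm Modules and Definitions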