Complex Road Image Recognition Based on Deep Learning
2022-10-26 10:59:12
Total paper length: 32,826 characters
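Abstract
As artificial intelligence takes on an increasingly important role, expectations for the breadth and efficiency of its applications grow by the day. Driving is an indispensable part of modern life: whatever the mode of travel, whether rail, car, airplane, or even motorcycle, people hope that automated driving can take over from manual operation. Automated operation has a long history in rail transport and aviation for a simple reason: both operate in relatively simple environments with a high tolerance for error, so a certain amount of human supervision suffices. Automated driving for cars is far harder by comparison: the vehicle must first identify the drivable area ahead and must then obey traffic rules. This paper describes an implementation of drivable road area recognition using deep learning.
Marking the drivable road is an image segmentation problem, so this paper uses the Mask R-CNN algorithm. Mask R-CNN is a deep learning algorithm: a network with a number of hidden layers and neurons is trained on the training images to extract image features. The algorithm first predicts a bounding box around the target region and then segments the image inside that box; these are two independent steps. The project covers environment configuration, dataset acquisition and annotation, dataset format conversion, training, hyperparameter tuning, export of the trained model, and use of the model to predict and annotate images.
The paper first introduces the research direction of the project, namely the meaning and significance of computer vision, presents the research background of computer vision, analyzes its applications in China and abroad, and then outlines the work plan of the paper. Using Mask R-CNN is an advanced step, and the final implementation depends on background knowledge of machine learning and deep learning, so the paper presents the fundamentals that the implementation requires. Image processing uses methods different from those in this paper, but it follows the same operating principles and is an indispensable part of the foundation; the paper therefore covers not only machine-learning-based computer vision theory but also a small amount of image processing. It then explains, evaluates, and analyzes the machine learning algorithms, functions, and networks relevant to computer vision. Chapter 4 details the internal structure of the Mask R-CNN algorithm and describes how the final dataset was obtained and annotated. The last part walks through the concrete steps that produce the results, including software environment configuration, dataset format conversion, hyperparameter tuning, and training. After the trained model is used for prediction, the results are visualized, and the paper presents and analyzes results for different road conditions.
Keywords: TensorFlow; drivable road recognition; deep learning; unstructured roads; Mask R-CNN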
Complex Road Image Detection Based on Machine Learning Algorithms
Abstract
Artificial intelligence now plays a prominent role in industrial manufacturing and in daily life, and expectations for the breadth and efficiency of its applications keep rising. Driving is an indispensable part of modern life. Whatever the mode of travel, such as rail, car, airplane, or motorcycle, people seek to replace manual operation with machines. Automated operation of trains and aircraft has been under development for a long time. The reason is clear: both operate in relatively simple environments with a high tolerance for error, so they require only a certain amount of human supervision. Automated driving for cars is much more difficult by comparison. The vehicle must first identify the drivable area and must then comply with traffic rules. This paper describes how to recognize drivable areas using deep learning algorithms.
Marking the drivable road is an image segmentation task, so this paper uses the Mask R-CNN algorithm. It is a deep learning algorithm that uses a number of hidden layers and neurons to extract image features. The algorithm first predicts a bounding box around the target area and then segments the image within that box. The two steps run independently. The project includes environment configuration, dataset acquisition and labeling, dataset format conversion, training, hyperparameter tuning, export of the trained model, and labeling of the predicted areas on test images.
This paper first introduces the research context of the subject: the meaning and significance of computer vision and its research background. It then analyzes applications of computer vision in China and abroad and outlines the work plan of the paper. Using Mask R-CNN is an advanced step, and a background in machine learning and deep learning is indispensable for the final implementation, so the paper also introduces the basic knowledge that the implementation requires. Although the methods used in image processing differ from those used in machine learning, the two follow the same operating principles, and image processing is an essential foundation. Hence this paper includes not only computer vision theory but also a small amount of image processing content. The paper then explains, evaluates, and analyzes the machine learning algorithms, functions, and networks relevant to computer vision. The fourth chapter explains the internal structure of the Mask R-CNN algorithm in detail and describes how the dataset was obtained and labeled. Finally, the paper describes the steps that produce the results, including software environment configuration, dataset format conversion, hyperparameter tuning, and training. After training, the network produces a model for prediction, which we use to perform road image detection. Last but not least, we present and analyze the detection results for different road conditions.
Keywords: TensorFlow, Drivable Road Detection, Deep Learning, Unstructured Roads, Mask R-CNN
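The box-then-mask idea described in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this paper and does not call TensorFlow or any Mask R-CNN library. The function names `paste_mask` and `drivable_fraction`, the `(y1, x1, y2, x2)` box layout, the 0.5 threshold, and the nearest-neighbour resizing are all assumptions made for illustration. The sketch takes one predicted box together with a small per-box probability mask and produces a full-size binary mask of the kind that would be overlaid on the road image.

```python
from typing import List, Tuple

# Hypothetical box layout: (y1, x1, y2, x2) in pixels, with y2 and x2 exclusive.
Box = Tuple[int, int, int, int]


def paste_mask(box: Box, box_mask: List[List[float]],
               height: int, width: int,
               threshold: float = 0.5) -> List[List[int]]:
    """Resize a small per-box probability mask to the box size and
    paste it into a full-size binary mask (1 = drivable, 0 = other)."""
    y1, x1, y2, x2 = box
    mask_h, mask_w = len(box_mask), len(box_mask[0])
    box_h, box_w = y2 - y1, x2 - x1
    full = [[0] * width for _ in range(height)]
    # Only visit pixels that lie inside both the box and the image.
    for y in range(max(y1, 0), min(y2, height)):
        # Nearest-neighbour lookup: map the image row to a mask row.
        my = (y - y1) * mask_h // box_h
        for x in range(max(x1, 0), min(x2, width)):
            mx = (x - x1) * mask_w // box_w
            if box_mask[my][mx] >= threshold:
                full[y][x] = 1
    return full


def drivable_fraction(full_mask: List[List[int]]) -> float:
    """Share of image pixels that the binary mask marks as drivable."""
    total = sum(len(row) for row in full_mask)
    return sum(map(sum, full_mask)) / total


# Toy input: a 6x6 image, one 4x4 box, and a 2x2 probability mask.
toy_mask = paste_mask((2, 1, 6, 5), [[0.9, 0.2], [0.8, 0.7]], 6, 6)
for row in toy_mask:
    print("".join(str(v) for v in row))
print(drivable_fraction(toy_mask))
```

In a real pipeline the box and the probability mask would come from the trained network, and the resizing would normally be bilinear rather than nearest-neighbour; the sketch only shows how the segmentation is confined to the predicted box.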
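Contents
Abstract (Chinese)
Abstract (English)
Chapter 1 Computer Vision
1.1 Application Background and History of Computer Vision
1.2 Applications of Computer Vision in China and Abroad
1.3 Work Plan of This Paper
Chapter 2 Digital Image Processing
2.1 Image Preprocessing: Spatial-Domain Analysis
2.1.1 Median Filtering
2.1.2 Sobel Operator
2.1.3 Laplacian Operator
2.2 Image Preprocessing: Frequency-Domain Analysis
2.3 Image Features and Their Description
2.3.1 Color Features
2.3.2 Geometric Features
2.3.3 Texture Features
2.3.4 Local Features
Chapter 3 Applications of Deep Learning in Computer Vision
3.1 Simple Neural Networks
3.1.1 Activation Functions
3.1.2 Cost Function
3.1.3 Gradient Descent
3.2 Convolutional Neural Networks (CNN)
3.2.1 Padding
3.2.2 Pooling
3.2.3 Comparison of Traditional Neural Networks and CNNs
3.2.4 Classic CNN Architectures
3.3 Recurrent Neural Networks (RNN)
3.3.1 LSTM: An Improved RNN
3.3.2 Comparison of LSTM Networks and CNNs
3.4 Faster R-CNN for Object Detection
Chapter 4 Implementing Image Segmentation with Mask R-CNN and TensorFlow
4.1 Image Segmentation
4.2 Mask R-CNN
4.3 Drivable Road Area Recognition Based on Mask R-CNN and BDD100k
4.3.1 Dataset Selection and Collection
4.3.2 Dataset Processing
4.3.3 The Python Language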