Research and Simulation of a Global Path Planning Method Based on Reinforcement Learning (Graduation Thesis)

2021-11-06 20:21:54

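Abstract

With the development of science and technology, ships are becoming increasingly intelligent, and intelligent ships are attracting growing attention. Global path planning occupies an important position in the intelligent ship field: an effective global path planning method can shorten a ship's voyage, reduce transportation costs, and improve navigation safety. This thesis studies the Q-learning algorithm from reinforcement learning to solve the global path planning problem for intelligent ships in static environments, and makes improvements that target the main problems in current global path planning. The main research content is as follows:

(1) A global path planning method based on the classical Q-learning algorithm is designed and analyzed through simulation, and its main problems are then addressed. To address the overly simple environment, the environment model is extended and the obstacle model refined to increase environmental complexity. To address the over-discretized ship action space, the number of actions is increased to reduce path length and total turning angle. To address slow convergence, the action selection policy is refined so that the greedy factor decreases as the number of episodes increases. To address the neglect of the ship's turning radius, the ship's turning radius at normal speed is required to be less than or equal to the radius of curvature of the transition at each path turning point, which satisfies the ship's maneuverability requirements at those points. To address the neglect of a safe distance between the ship and obstacles, a "bumper" model is introduced and obstacles are inflated so that a safe distance is kept between ship and obstacles. To address insufficiently smooth planned paths, B-spline curves are used to smooth the path.

(2) The global path planning method based on the improved Q-learning algorithm is simulated and the algorithm's evaluation indices are refined. The simulation results are analyzed and compared using a path length index, an algorithm runtime index, a fuel consumption index, a navigation time index, and a comprehensive index. The simulation experiments show that the improved method resolves the above problems of intelligent ship global path planning reasonably well and has engineering significance for realizing intelligent navigation of intelligent ships.

Keywords: intelligent ship; intelligent navigation; reinforcement learning; improved Q-learning algorithm; global path planning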
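The core idea summarized above, tabular Q-learning with a greedy factor that shrinks as episodes accumulate, can be sketched as follows. This is a minimal illustration and not the thesis implementation: the greedy factor is interpreted here as the exploration probability epsilon, and the 8-neighbour action set, the linear decay schedule, the reward values, and the names `epsilon_for`, `step`, `train`, and `greedy_path` are all assumptions made for the example.

```python
import random

# Hypothetical 8-neighbour action set (row, column offsets); the thesis only
# states that the number of actions is increased.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]


def epsilon_for(episode, total_episodes, eps_start=0.9, eps_end=0.05):
    """Linearly decay the exploration probability over episodes (assumed schedule)."""
    frac = min(episode / max(total_episodes - 1, 1), 1.0)
    return eps_start + (eps_end - eps_start) * frac


def step(grid, state, action, goal):
    """Apply an action; return (next_state, reward, done). Rewards are illustrative."""
    r, c = state[0] + action[0], state[1] + action[1]
    inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
    if not inside or grid[r][c] == 1:
        return state, -10.0, False   # blocked move: stay in place and penalise
    if (r, c) == goal:
        return (r, c), 100.0, True   # reaching the goal ends the episode
    return (r, c), -1.0, False       # small step cost favours short paths


def train(grid, start, goal, episodes=500, alpha=0.5, gamma=0.9,
          max_steps=200, seed=0):
    """Tabular Q-learning; returns a dict mapping (state, action_index) to value."""
    rng = random.Random(seed)
    q = {}
    n = len(ACTIONS)
    for ep in range(episodes):
        eps = epsilon_for(ep, episodes)
        state = start
        for _ in range(max_steps):
            if rng.random() < eps:   # explore
                a = rng.randrange(n)
            else:                    # exploit the current estimate
                a = max(range(n), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(grid, state, ACTIONS[a], goal)
            best_next = 0.0 if done else max(q.get((nxt, i), 0.0) for i in range(n))
            old = q.get((state, a), 0.0)
            # Q-learning update: move the estimate toward the bootstrapped target.
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(grid, q, start, goal, max_steps=200):
    """Follow the highest-valued action from each state until the goal or a step limit."""
    path, state = [start], start
    for _ in range(max_steps):
        if state == goal:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _, _ = step(grid, state, ACTIONS[a], goal)
        path.append(state)
    return path
```

On a small occupancy grid (1 marks an obstacle cell), `train(grid, start, goal)` followed by `greedy_path(grid, q, start, goal)` yields a sequence of grid cells. High exploration early on helps the table cover the map, and low exploration later lets the values along the best route settle.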

Abstract

With advances in science and technology, ships have become increasingly intelligent, and intelligent ships are attracting more and more attention. Global path planning plays an important role in the field of intelligent ships: an effective global path planning method can shorten a ship's voyage, reduce transportation costs, and improve navigation safety. This thesis studies the Q-learning algorithm from reinforcement learning to solve the global path planning problem for intelligent ships in static environments, and addresses the main problems in current global path planning. The main research contents are as follows.

(1) A global path planning method based on the classical Q-learning algorithm is designed and simulated, and its main problems are then addressed. Because the environment is too simple, the environment model is extended and the obstacle model refined to increase environmental complexity; because the ship's action space is too coarsely discretized, the number of actions is increased to reduce the path length and total turning angle; because convergence is slow, the action selection strategy is improved so that the greedy factor decreases as the number of episodes increases; because the ship's turning radius was not considered, the turning radius at normal speed is required to be less than or equal to the radius of curvature of the transition at each path turning point, which meets the ship's maneuverability requirements at those points; because the safe distance between the ship and obstacles was not considered, a "bumper" model is introduced and obstacles are inflated so that a safe distance is maintained between the ship and obstacles; because the planned path is not smooth enough, B-spline curves are used to smooth it.

(2) The global path planning method based on the improved Q-learning algorithm is simulated and the algorithm's evaluation indices are refined. The simulation results are analyzed and compared using the path length, algorithm runtime, fuel consumption, navigation time, and comprehensive evaluation indices. The simulation results show that the improved method resolves the above problems of global path planning for intelligent ships reasonably well, and that it has engineering significance for realizing intelligent navigation of intelligent ships.

Key words: intelligent ship; intelligent navigation; reinforcement learning; improved Q-learning algorithm; global path planning
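Two of the improvements listed in item (1), obstacle inflation for a safety margin and B-spline smoothing of the planned path, can be illustrated with a short pure-Python sketch. It rests on assumptions not stated in the abstract: obstacles are cells marked 1 on a grid, inflation is a square (Chebyshev) dilation by `margin` cells, and smoothing uses a uniform cubic B-spline whose control points are the path waypoints. The function names `inflate_obstacles` and `bspline_smooth` are hypothetical.

```python
def inflate_obstacles(grid, margin):
    """Return a copy of grid where every cell within `margin` cells
    (Chebyshev distance) of an obstacle is also marked as blocked."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            for dr in range(-margin, margin + 1):
                for dc in range(-margin, margin + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


def bspline_smooth(points, samples_per_segment=10):
    """Sample a uniform cubic B-spline that uses the waypoints as control points.
    The first and last waypoints are repeated so the curve starts and ends on them."""
    if len(points) < 2:
        return [tuple(map(float, p)) for p in points]
    ctrl = [points[0]] * 2 + list(points) + [points[-1]] * 2
    curve = []
    for i in range(len(ctrl) - 3):
        p0, p1, p2, p3 = ctrl[i:i + 4]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            # Uniform cubic B-spline basis functions; they sum to 1 for any t.
            b0 = (1 - t) ** 3 / 6
            b1 = (3 * t ** 3 - 6 * t ** 2 + 4) / 6
            b2 = (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6
            b3 = t ** 3 / 6
            x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
            y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
            curve.append((x, y))
    curve.append(tuple(map(float, points[-1])))  # close the curve on the final waypoint
    return curve
```

A B-spline of this kind passes near, not through, interior waypoints, so the smoothed curve cuts corners. Planning on the inflated map is one way to leave room for that deviation, although clearance of the smoothed curve would still need to be checked separately.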

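Chapter 1 Introduction

1.1 Background and Significance of the Topic

1.2 Research Status of Global Path Planning Methods

1.2.1 Traditional Methods

1.2.2 Reinforcement Learning Methods

1.3 Main Work and Organization of This Thesis

Chapter 2 Overview of Reinforcement Learning Algorithms

2.1 Basic Principles

2.2 Components of a Reinforcement Learning System

2.2.1 Action Selection Policy

2.2.2 Reward Signal

2.2.3 Value Function

2.2.4 Environment Model

2.3 Classical Reinforcement Learning Algorithms

2.3.1 Dynamic Programming

2.3.2 Monte Carlo Methods

2.3.3 Temporal-Difference Learning

2.4 Chapter Summary

Chapter 3 Global Path Planning Based on the Q-learning Algorithm

3.1 The Q-learning Algorithm

3.1.1 Overview of the Q-learning Algorithm

3.1.2 Convergence Analysis of the Q-learning Algorithm

3.2 Design of the Global Path Planning Method

3.2.1 Environment Model

3.2.2 State and Action Spaces

3.2.3 Reward Function

3.2.4 Action Selection Policy

3.2.5 Q-Table

3.3 Simulation Experiments

3.3.1 Simulation Conditions

3.3.2 Simulation Results and Analysis

3.4 Chapter Summary

Chapter 4 Global Path Planning Based on the Improved Q-learning Algorithm

4.1 Improvement and Optimization of the Global Path Planning Method

4.1.1 Environment Modeling and Extension

4.1.2 Action Space Optimization

4.1.3 Ship Turning Radius

4.1.4 Policy Optimization

4.1.5 Path Post-processing

4.2 Evaluation Indices of the Algorithm

4.2.1 Path Length Evaluation Index

4.2.2 Algorithm Runtime Evaluation Index

4.2.3 Fuel Consumption Evaluation Index

4.2.4 Navigation Time Evaluation Index

4.2.5 Comprehensive Evaluation Index

4.3 Simulation Experiments

4.3.1 Simulation Conditions

4.3.2 Simulation Results and Analysis

4.4 Chapter Summary

Chapter 5 Summary and Outlook

5.1 Summary of the Thesis

5.2 Outlook for Future Research

References

Acknowledgements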

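Chapter 1 Introduction

1.1 Background and Significance of the Topic

Driven by economic and technological development, intelligent ship technology, whose advantages are safety, environmental friendliness, economy, and efficiency, has received great attention from the shipbuilding and shipping industries of many countries. The goal of the intelligent ship is to combine traditional ship design and manufacturing technology with the latest information, communication, and artificial intelligence technology so that the ship, its supporting equipment, and the navigation environment become intelligent and autonomous^[1]. The system framework of an intelligent ship is shown in the figure below. Intelligent navigation is one of the key functions of an intelligent ship: the ship uses sensor technology to obtain the state information needed for navigation, and its route and speed are designed and optimized through computer processing and analysis^[2].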
