Cluster-Based Storage and Management Methods for Big Maritime Data (Graduation Thesis)
2021-11-01 21:10:30
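Abstract (translated from the Chinese)
As information technology advances and maritime informatization deepens, advanced shipborne service equipment has become commonplace, information systems for collision avoidance, scanning, cruising, and monitoring are widely used, and maritime information platforms of many kinds have been rolled out. Together these have produced the "Internet Plus maritime supervision" service model, which safeguards waterborne traffic safety and makes it easier for maritime authorities to supervise and serve vessels. The same developments also generate maritime data of enormous scale and variety: the data are growing more complex, their volume is rising sharply, and the field has entered the big data era. Most maritime information platforms currently deployed in China, however, do not fully meet the storage and management requirements of big maritime data, and use has exposed a series of problems, such as shallow data mining, insufficient storage capacity, and barriers between different kinds of data. Adopting foreign platforms raises network security concerns and high costs. To adapt to the changes in maritime data and make up for the shortcomings of existing platforms, maritime authorities urgently need to upgrade their data storage and management systems, which is a new problem that maritime informatization must solve at this stage.
This thesis approaches the problem from the perspective of database upgrading. It surveys the application and development status of maritime service platforms in China and abroad, analyzes the characteristics of big maritime data, and collates current domestic requirements for managing and applying maritime data. To meet these requirements, it applies suitable cluster technologies and architecture deployment schemes to design and implement a distributed MySQL database cluster based on a hybrid architecture, used to convert, store, and manage maritime data of all kinds. The cluster's storage model consists of a user layer, a data distribution layer, and a data storage layer. In the data distribution layer, HAProxy acts as a proxy and forwards requests from the user layer to MyCAT nodes by round robin; the MyCAT cluster then assigns each request to the appropriate cluster node in the data storage layer according to its configuration files, so that the storage model achieves load balancing and read/write splitting. The data storage layer deploys master-slave replication clusters and a PXC cluster, defines the table structures of the MySQL databases, and optimizes the management methods for maritime data, so that the storage model keeps maritime data consistent and stored in an orderly way.
The distributed MySQL database cluster has been built, runs stably, and has passed all functional and performance tests. The test results show that the cluster offers high throughput, high availability, and high scalability, and supports fast data queries. It therefore suits the current changes in big maritime data: it can store and manage such data efficiently, supports real-time storage, historical queries, and efficient management, and can be applied in practice to fields such as waterborne traffic, maritime supervision, freight logistics, and meteorology and hydrology, thereby advancing maritime informatization.
Keywords: maritime informatization; big maritime data; distributed MySQL database cluster; storage model; management methods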
Abstract
With the advance of information technology and the deepening of maritime informatization, advanced shipborne service equipment has become widespread, information systems for collision avoidance, scanning, cruising, and monitoring are in wide use, and many kinds of maritime information platforms have been promoted. The resulting service model of “Internet Plus maritime supervision” helps guarantee waterborne traffic safety and facilitates the supervision and service of vessels by maritime authorities. At the same time, it produces maritime data of large scale and wide variety, making the data more complex and their volume far larger than before, so that maritime data have entered the era of big data. However, most maritime information platforms currently promoted in China fail to fully meet the storage and management requirements of big maritime data, and their use has exposed a series of problems, such as insufficient depth of data mining, inadequate storage capacity, and barriers between different types of data. Using foreign platforms raises issues of network security and high cost. Therefore, to adapt to the changes in maritime data and make up for the defects of existing platforms, maritime authorities urgently need to upgrade their maritime data storage and management systems, which is a new problem to be solved in the construction of maritime informatization at this stage.
This thesis approaches the problem from the perspective of database upgrading. It first investigates the application and development status of maritime information platforms at home and abroad, then analyzes the characteristics of big maritime data and collates the current domestic requirements for managing and applying maritime data. In response to these requirements, it applies suitable cluster technologies and architecture deployment schemes to design and implement a distributed MySQL database cluster based on a hybrid architecture for converting, storing, and managing various maritime data. The storage model of this cluster consists of a user layer, a data distribution layer, and a data storage layer. In the data distribution layer, HAProxy acts as a proxy that forwards the user layer’s requests to MyCAT nodes by round robin, and the MyCAT cluster distributes these requests to the corresponding cluster nodes in the data storage layer according to its configuration files, so that the storage model achieves load balancing and read/write splitting. The data storage layer deploys master-slave replication clusters and a PXC cluster, defines the table structures of the MySQL databases, and optimizes the management methods for maritime data, so that the storage model keeps maritime data consistent and stored in an orderly way.
The distributed MySQL database cluster has been built, runs stably, and has passed all functional and performance tests. The test results show that the cluster offers high throughput, high availability, and high scalability, and supports fast data queries. It is therefore suited to the current changes in big maritime data: it can store and manage such data efficiently, supports real-time storage, historical queries, and efficient management, and can be applied in practice to fields such as waterborne traffic, maritime supervision, freight logistics, and meteorology and hydrology, thereby advancing maritime informatization.
Key Words: maritime informatization; big maritime data; distributed MySQL database cluster; storage model; management methods
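The request path summarized in the abstract (round-robin forwarding at the proxy, then read/write splitting at the middleware) can be illustrated with a minimal Python sketch. This is not the thesis's implementation and does not use HAProxy or MyCAT themselves. The class names, node names, the `ais_position` table, and the rule that only statements starting with SELECT count as reads are all assumptions made for illustration.

```python
from itertools import cycle


class StorageCluster:
    """Hypothetical storage-layer cluster: one writer plus read replicas."""

    def __init__(self, writer, readers):
        self.writer = writer
        self._readers = cycle(readers)  # rotate reads across the replicas

    def route(self, sql):
        # Simplifying assumption: a statement is a read only if it starts with SELECT.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.writer


class Middleware:
    """Stand-in for one MyCAT node: picks the storage node for a statement."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster

    def handle(self, sql):
        return self.name, self.cluster.route(sql)


class RoundRobinProxy:
    """Stand-in for HAProxy: hands each request to the next middleware node in turn."""

    def __init__(self, middlewares):
        self._middlewares = cycle(middlewares)

    def forward(self, sql):
        return next(self._middlewares).handle(sql)


# Hypothetical topology: two middleware nodes sharing one master-slave cluster.
cluster = StorageCluster("master", ["replica-1", "replica-2"])
proxy = RoundRobinProxy([Middleware("mycat-1", cluster), Middleware("mycat-2", cluster)])

statements = [
    "INSERT INTO ais_position (mmsi) VALUES (413000001)",
    "SELECT * FROM ais_position WHERE mmsi = 413000001",
    "SELECT * FROM ais_position WHERE mmsi = 413000002",
    "UPDATE ais_position SET mmsi = 413000003 WHERE mmsi = 413000001",
]
routes = [proxy.forward(sql) for sql in statements]
for sql, (middleware, node) in zip(statements, routes):
    print(f"{middleware} -> {node}: {sql}")
```

In this sketch the four statements alternate between the two middleware stand-ins, the INSERT and UPDATE reach the writer, and the two SELECTs are spread over the two replicas. A real deployment would express the same routing in HAProxy and MyCAT configuration rather than in application code.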
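Contents
Chapter 1 Introduction
1.1 Research Background
1.2 Research Status in China and Abroad
Chapter 2 Cluster Architecture Deployment and Technology Application
2.1 Concepts Related to Distributed MySQL Database Clusters
2.1.1 MySQL Database
2.1.2 Database Cluster
2.1.3 Distributed Database
2.1.4 Distributed MySQL Database Cluster
2.2 Basic Principles and Technology Application
2.2.1 Read/Write Splitting
2.2.2 Database Middleware
2.2.3 Load Balancing
2.2.4 Data Sharding
2.3 Basic Architecture of the Distributed MySQL Database Cluster
2.3.1 Design Purpose of the Architecture
2.3.2 Deployment of the Basic Architecture
2.4 Chapter Summary
Chapter 3 Optimization and Implementation of Storage and Management Methods for Big Maritime Data
3.1 Domestic Management and Application Requirements for Big Maritime Data
3.1.1 Analysis of Big Maritime Data
3.1.2 Analysis of Management and Application Requirements
3.2 Construction and Implementation of the Storage Model for Big Maritime Data
3.2.1 Construction Goals
3.2.2 Design of the Storage Model
3.2.3 Implementation of the Storage Model
3.3 Optimization and Implementation of Management Methods for Big Maritime Data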