


 2020-08-20 20:01:20  

Abstract








With the spread of the Internet, and especially with the rise of online forums and social networking services such as microblogging, WeChat, and Baidu Post Bar (Baidu Tieba), hot topics that emerge online are increasingly picked up and investigated in depth by traditional media. These services and forums are interactive and propagate information very quickly, reaching much further than traditional media. Online interaction is changing the landscape of public opinion, and its influence on society keeps growing.

However, because online speech is open and the quality of Internet users varies, the network has become a breeding ground for rumors and, in the worst cases, for social discord. It is therefore necessary to build a system that monitors public opinion on the network in order to keep the "network society" stable. Students are the main driving force of future society and the largest and most active group online today, so monitoring student public opinion online is essential.

Python is a high-level programming language whose code is concise and clear, so it can be modified quickly as circumstances change. This suits the constantly changing content and page layouts found on the web, and gives Python an advantage over other languages for rapidly developing a public opinion monitoring system.

Because students are mostly active on large social platforms and forums such as microblogging services and Baidu Post Bar, this paper designs a crawler program in Python that uses the urllib module to crawl the pages of the Wuhan University of Technology bar on Baidu Post Bar. The tool uses regular-expression keyword matching to find posts that meet the specified criteria, fetches their content, and sorts the results in descending order of each post's participant count, so that student public opinion can be monitored and analyzed.
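The match-and-sort step described above can be sketched roughly as follows. This is a minimal illustration and not the program developed in the paper: the HTML markup, the `data-replies` attribute, the sample posts, and the helper names `fetch` and `match_posts` are all assumptions made for the example, and it uses Python 3's `urllib.request` where the paper refers only to urllib. The real page structure of Baidu Post Bar is not given in the text, so the regular expression would need to be adapted to it.

```python
import re
import urllib.request

# Hypothetical markup: each post is assumed to look like
#   <li class="post" data-replies="12"><a href="/p/1">Title</a></li>
# The real Baidu Post Bar page structure is not given in the text.
POST_RE = re.compile(
    r'<li class="post" data-replies="(\d+)">\s*<a href="([^"]+)">([^<]+)</a>\s*</li>'
)


def fetch(url, timeout=10):
    """Download a page with urllib and return its text (not called below)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def match_posts(html, keywords):
    """Return posts whose title contains any keyword, highest reply count first."""
    keyword_re = re.compile("|".join(re.escape(k) for k in keywords))
    posts = []
    for replies, href, title in POST_RE.findall(html):
        if keyword_re.search(title):
            posts.append(
                {"title": title.strip(), "url": href, "replies": int(replies)}
            )
    # Descending order, so the most discussed posts come first.
    return sorted(posts, key=lambda p: p["replies"], reverse=True)


# Made-up sample page so the sketch runs without network access.
SAMPLE_HTML = """
<li class="post" data-replies="3"><a href="/p/1">Exam schedule released</a></li>
<li class="post" data-replies="27"><a href="/p/2">Canteen prices questioned</a></li>
<li class="post" data-replies="9"><a href="/p/3">Exam room changes</a></li>
"""

ranked = match_posts(SAMPLE_HTML, ["Exam", "Canteen"])
for post in ranked:
    print(post["replies"], post["title"])
```

In a real run, the string returned by `fetch` would replace `SAMPLE_HTML`. The reply count stands in here for the paper's ranking measure, which the abstract does not define precisely.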

The experimental results show that the program is highly targeted, acquires data quickly, is easy to embed in other development work, and is simple to maintain. It gives researchers who are not proficient in programming quick access to student public opinion on the Wuhan University of Technology bar, and supports the monitoring of student opinion as well as follow-up data mining and research.

Key Words: Python; crawler; Baidu Post Bar; student public opinion


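Chapter 1 Introduction

1.1 Value of the Topic

1.2 Significance of Monitoring Student Public Opinion

Chapter 2 Python

2.1 Current State of the Python Language

2.2 Features of the Python Language

2.2.1 Concise and Easy to Understand

2.2.2 Open Source and Free

2.2.3 Portability

2.2.4 Object Orientation

2.2.5 Interpreted

2.2.6 Extensibility

2.2.7 Embeddability

2.2.8 Rich Libraries

Chapter 3 Design and Implementation of a Simple Crawler System

3.1 Setting Up the Development Environment

3.1.1 System Development Environment

3.1.2 Setting Up the Python Development Environment

3.1.3 Installing the scrapy Framework

3.1.4 Installing the Third-Party requests Module

3.1.5 Installing the Third-Party BeautifulSoup Module

3.1.6 Installing the Third-Party lxml Parser

3.1.7 Setting Up the pydev Environment in Eclipse

3.2 Crawler Design Approach

3.2.1 Simple Crawler Architecture

3.2.2 Simple Crawler Execution Flow

3.3 Implementation of the Crawler System

3.3.1 Crawler Scheduler Module

3.3.2 URL Manager Module

3.3.3 Web Page Downloader Module

3.3.4 Web Page Parser Module

3.3.5 Data Output Module

Chapter 4 Crawling Data and Storing It in a Database

4.1 Setting Up the Database Environment

4.1.1 System Environment

4.1.2 Installing and Configuring MySql

4.1.3 Installing the MySql-python Package

4.2 Storing the Crawled Data in the Database

4.2.1 Importing the Database Module

4.2.2 Storing the Crawled Data in the Database

4.2.3 Sorting the Database Records in Descending Order

Chapter 5 Running the Public Opinion Analysis System

5.1 Running the System

5.2 Analysis of Results and Improvements

Chapter 6 Conclusion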

Chapter 1 Introduction


1.1 Value of the Topic


1.2 Significance of Monitoring Student Public Opinion


Chapter 2 Python

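Python is an object-oriented, interpreted programming language created by the Dutch programmer Guido van Rossum. Since its "birth" in 1989, it has been welcomed by a wide range of programmers and has kept evolving.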

2.1 Current State of the Python Language

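As early as the 1990s, when the first search engines appeared, the earliest web crawlers emerged to help them search data across the Internet at scale. These early crawlers traversed pages in depth-first or breadth-first order. The most important step in a search engine is crawling the vast Internet to collect the information users need. The crawler is the core of this work, so its performance, crawl efficiency, and crawl coverage directly affect the quality and quantity of the pages the search engine can return, and crawler capability has become an important measure of a search engine's quality. This led to the later development of distributed web crawlers. Distributed crawlers differ considerably from their predecessors: they greatly improve crawl efficiency and are also simpler to write and build. Distributed crawlers now have a number of mature applications, including the well-known Google and AltaVista search engines.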
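The two traversal orders used by early crawlers can be illustrated with a short sketch. It walks a small in-memory link graph, not real web pages. The graph `LINKS` and the function name `crawl_order` are hypothetical and introduced only for this example. The sketch assumes that the sole difference between the two strategies is whether the frontier of pending pages is treated as a queue or as a stack.

```python
from collections import deque


def crawl_order(links, start, breadth_first=True):
    """Return the order in which pages reachable from `start` are visited."""
    frontier = deque([start])
    visited = set()
    order = []
    while frontier:
        # A queue (popleft) gives breadth-first order;
        # a stack (pop) gives depth-first order.
        page = frontier.popleft() if breadth_first else frontier.pop()
        if page in visited:
            continue
        visited.add(page)
        order.append(page)
        for nxt in links.get(page, []):
            if nxt not in visited:
                frontier.append(nxt)
    return order


# Made-up link graph: each key is a page, each value lists the pages it links to.
LINKS = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "D": [],
    "E": [],
}

print(crawl_order(LINKS, "A"))                       # breadth-first
print(crawl_order(LINKS, "A", breadth_first=False))  # depth-first
```

Breadth-first order visits every page one link away from the start before any page two links away, whereas depth-first order follows a single chain of links as far as it goes before backtracking.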


2.2 Features of the Python Language
