2021-11-06 20:11:33
Abstract
This thesis first describes and analyzes the closing prices of the 50 constituent stocks of the SSE 50 Index. The stocks are grouped by industry, and the overall weight and contribution points of each industry are analyzed. The financial industry has the largest weight and the most contribution points, so the dispersion of the individual stocks within the financial industry is then calculated.
The closing price of SPD Bank stock is selected as the research object. For the time series analysis model, I first performed a stationarity test on the data, and the series was found to be non-stationary, so I differenced it. Because the first-order differenced series was identified as white noise by the white-noise test, I applied second-order differencing to the original data and obtained a stationary, non-white-noise series. I then built an ARIMA model. Using a table of AIC values, I determined the final model and obtained its equation, model variance, and Theil coefficient. The prediction effect of the model is close to ideal, so the model is used to predict the test set and obtain the predicted values.
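The differencing and white-noise check described above can be sketched as follows. This is a minimal illustration in plain Python, not the procedure actually used in the thesis: the function names and the toy price list are hypothetical, and the Ljung-Box Q statistic is assumed as the white-noise test, which the abstract does not name.

```python
def difference(series, order=1):
    """Apply simple differencing `order` times (order=2 gives the second difference)."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def autocorr(series, lag):
    """Sample autocorrelation at the given lag (undefined for a constant series)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(series, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorr(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Hypothetical closing prices, for illustration only (not data from the thesis).
prices = [10.0, 10.4, 10.1, 10.9, 10.6, 11.2, 11.0, 11.5]
second_diff = difference(prices, order=2)
q_stat = ljung_box_q(second_diff, max_lag=2)
print(len(second_diff), q_stat)
```

Under the white-noise hypothesis, Q is compared with a chi-square critical value whose degrees of freedom equal the number of lags; a large Q suggests that the series still carries autocorrelation worth modeling.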
For the support vector regression model, I set the parameters of four kernel functions (linear, polynomial, sigmoid, and Gaussian) empirically and then compared their prediction performance. The SVM with the Gaussian kernel achieved a mean square error of MSE = 0.0043 and the best goodness of fit among the four, so the Gaussian kernel was used for prediction.
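For reference, the four kernels compared above can be written in a few lines. The sketch below uses their common textbook forms; the parameter names `gamma`, `coef0`, and `degree` and their default values are illustrative assumptions, not the empirically chosen values of the thesis.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """K(x, y) = <x, y>"""
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """K(x, y) = (gamma * <x, y> + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """K(x, y) = tanh(gamma * <x, y> + coef0)"""
    return math.tanh(gamma * dot(x, y) + coef0)


def gaussian_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2), also called the RBF kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

The Gaussian kernel depends only on the distance between the two inputs, so it equals 1 when they coincide and decays toward 0 as they move apart; `gamma` controls how quickly that decay happens.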
The mean square error of the time series prediction is MSE = 0.066, while that of the support vector regression prediction is MSE = 0.0032, so in this study the support vector machine predicts more accurately. I also examined, in light of actual events, why the predictions of both methods deviate substantially from the original data in certain periods. I conclude that time series analysis is suited to static, that is short-term, prediction, in which previous data are used to predict the values of the next few days, which is consistent with the characteristics of time series analysis, whereas support vector regression is better suited to long-term prediction.
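The two accuracy measures mentioned in this abstract can be illustrated with a short sketch. It assumes the Theil inequality coefficient in its form bounded between 0 and 1 (the abstract does not state which variant was used), and the function names are hypothetical.

```python
import math


def mse(actual, predicted):
    """Mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def theil_u(actual, predicted):
    """Theil inequality coefficient: RMSE / (RMS(actual) + RMS(predicted)).

    Assumed variant; 0 means a perfect forecast, values near 1 a poor one.
    """
    rmse = math.sqrt(mse(actual, predicted))
    rms_actual = math.sqrt(sum(a * a for a in actual) / len(actual))
    rms_predicted = math.sqrt(sum(p * p for p in predicted) / len(predicted))
    return rmse / (rms_actual + rms_predicted)
```

Because MSE depends on the scale of the data, the two models are comparable only when both are evaluated on the same test series in the same units, whereas the Theil coefficient is scale-free.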
However, because only one stock is studied here, the generality of the model requires further research, and other stock indicators, such as the price-earnings ratio, could be included in the quantitative analysis. In addition, cross-validation could be used to determine the optimal kernel parameters of the support vector machine.
Keywords: Big data; Stock price prediction; Time series analysis; SVR; Mean square error
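Chapter 1 Introduction
1.1 Research Background and Significance
1.1.1 Significance of Studying Stock Closing Prices
1.1.2 Development Status and Significance of Machine Learning in Finance
1.2 Literature Review
1.2.1 Quantitative Analysis
1.2.2 Time Series Analysis
1.2.3 Support Vector Machines
1.3 Structure and Innovations of This Thesis
1.3.1 Structure of This Thesis
1.3.2 Innovations of This Thesis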
Chapter 2 Stock Selection
2.1 Description of the SSE 50 Index
2.2 Analysis of Individual Stocks in the Financial Industry
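Chapter 3 Time Series Analysis Theory
3.1 Overview of Time Series
3.2 Common Time Series Models and Basic Modeling Steps
3.3 Data Preprocessing
3.3.1 Series Tests
3.3.2 Trend Stationarity and Difference Stationarity
3.4 Principles and Steps of ARMA and ARIMA
3.4.1 General Autoregressive Model AR(p)
3.4.2 Moving Average Model MA(q)
3.4.3 Autoregressive Moving Average Model ARMA(p, q)
3.4.4 Autoregressive Integrated Moving Average Model ARIMA(p, d, q)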
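Chapter 4 Support Vector Machine Theory
4.1 Support Vector Machine (SVM)
4.2 Support Vector Classification (SVC)
4.2.1 Linearly Separable SVC
4.2.2 Non-Linearly Separable SVC
4.2.3 Common Kernel Functions
4.3 Support Vector Regression (SVR)
4.4 Building an SVM-Based Stock Price Prediction Model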
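Chapter 5 Empirical Analysis
5.1 Time Series Analysis Model
5.1.1 Stationarity Test
5.1.2 Difference Stationarity
5.1.3 Model Estimation and Diagnosis
5.1.4 Time Series Analysis Data
5.2 SVM Model
5.2.1 Kernel Function Selection
5.2.2 Support Vector Regression Prediction
5.3 Comparative Analysis of Results
Chapter 6 Summary and Outlook
6.1 Summary
6.2 Limitations and Outlook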
References
Acknowledgements
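Chapter 1 Introduction
1.1 Research Background and Significance
1.1.1 Significance of Studying Stock Closing Prices
1.1.2 Development Status and Significance of Machine Learning in Finance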