A Study of Stable-Return Alpha Strategies Based on Quantitative Investment: Can the "Holy Grail" Exist? (Graduation Thesis)
2021-10-22 21:52:12
Abstract
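As the stock market has developed, strategies on the A-share market have become increasingly diverse. How to profit steadily regardless of market fluctuations, how to earn positive excess returns with a small maximum drawdown even when the market falls broadly, and how to stand out among the many investment strategies and find the "Holy Grail" of investing: achieving these is every investor's dream. Although quantitative investment has a history of less than twenty years in China, its high stability and high returns have attracted the attention of many investors. This paper proposes a concrete multi-factor stock selection framework comprising factor screening, factor data preprocessing, factor validity testing, and multi-factor stock selection.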
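First, this paper lays out the full factor-testing workflow. Taking the BP factor as an example, missing values are first filled with the previous value; extreme values are then removed from the factor matrix using the MAD (median absolute deviation) method; the factor is then standardized to reduce its dispersion; finally, a multiple regression is run on the factor and the regression residuals are taken as the new factor, which neutralizes it with respect to market capitalization and industry.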
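After preprocessing, the factor is tested for validity over a backtest period from June 2013 to March 2020. The test has three steps: the long-short portfolio method, the IC and IR values, and the cumulative return. In the long-short portfolio method, the factor value of every stock is computed at each cross-section, stocks are ranked by factor value, the top-ranked stocks are held long and the bottom-ranked stocks are sold short, and the annualized return, Sharpe ratio, and maximum drawdown of this long-short portfolio are computed. For the BP factor, the long-short return curve oscillates and every metric is poor: the annualized return is negative in 2013, 2015, 2019, and 2020, with the worst year losing 35%; the maximum drawdown is large, peaking at 36%; the Sharpe ratio is negative in several cases; and the IC values are mostly negative, which this paper takes to mean that the factor value is unrelated to stock returns. The BP factor is therefore judged not to be a good factor to use on its own over this backtest period.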
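After the BP factor, this paper presents two examples of effective factors: the purely technical factor Alpha_3 and the combined fundamental and price-volume factor ROE_return_decay. Under the same validity tests, the cumulative return of their long-short portfolios and their cumulative IC rise steadily; the annualized return is strong, the maximum drawdown is below 10%, and the Sharpe ratio is almost always positive.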
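Next, using this backtesting method, this paper selects seven effective factors and builds a multi-factor stock selection model with the scoring method. Between June 2013 and March 2020, the portfolio chosen by the model achieved an annualized return above 20% in many years, with a maximum drawdown as low as 2% or 3% in a number of years and a strong Sharpe ratio. This paper takes these results as evidence that the seven selected factors and the resulting multi-factor stock selection model have investment value.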
Keywords: quantitative investment; multi-factor stock selection; long-short portfolio method
Abstract
With the continuous development of the stock market, strategies in the A-share market have become more and more diversified. How to make a stable profit without being affected by market fluctuations, how to obtain positive excess returns with a small maximum drawdown when the market generally falls, and how to stand out among many investment strategies and find the "holy grail" of the investment community: these are the dream of every investor. Although quantitative investment has a history of less than 20 years in China, its high stability and high profitability have attracted the attention of many investors. This paper proposes a specific multi-factor stock selection framework, including factor screening, factor data preprocessing, factor validity testing, and multi-factor combined stock selection.
First, this paper gives the whole process of factor testing. Taking the BP factor as an example, missing values are first filled with the previous value, extreme values in the factor matrix are then removed by the MAD (median absolute deviation) method, and the factor is standardized to reduce its dispersion. Finally, a multiple regression is run on the factor and the regression residuals are used as the new factor, which neutralizes it with respect to market capitalization and industry.
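The preprocessing steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the thesis's actual code: the function names, the 3-MAD clipping threshold, the 1.4826 scaling constant, the choice to clip rather than drop extreme values, and the use of log market capitalization plus industry dummy variables as regressors are all assumptions made for the example.

```python
import numpy as np

def forward_fill(panel):
    """Fill NaNs in a (dates x stocks) array with the previous date's value."""
    filled = panel.copy()
    for t in range(1, filled.shape[0]):
        missing = np.isnan(filled[t])
        filled[t, missing] = filled[t - 1, missing]
    return filled

def winsorize_mad(x, n_mad=3.0):
    """Clip one cross-section to median +/- n_mad * scaled MAD (assumed threshold)."""
    med = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - med))
    bound = n_mad * 1.4826 * mad  # 1.4826 makes the MAD comparable to a std dev
    return np.clip(x, med - bound, med + bound)

def standardize(x):
    """Z-score one cross-section."""
    return (x - np.nanmean(x)) / np.nanstd(x)

def neutralize(x, log_mktcap, industry):
    """Residuals of x regressed on log market cap and industry dummies."""
    labels = np.unique(industry)
    dummies = (industry[:, None] == labels[None, :]).astype(float)
    design = np.column_stack([log_mktcap, dummies])  # dummies absorb the intercept
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ coef

# Hypothetical toy data: 3 dates, 8 stocks, one gap and one outlier.
rng = np.random.default_rng(0)
raw = rng.normal(size=(3, 8))
raw[1, 2] = np.nan
raw[2, 5] = 50.0
log_cap = rng.normal(10.0, 1.0, size=8)
industry = np.array(["bank", "bank", "tech", "tech", "tech", "food", "food", "food"])

filled = forward_fill(raw)
factor = neutralize(standardize(winsorize_mad(filled[-1])), log_cap, industry)
```

Because the final factor is a least-squares residual, it is orthogonal to log market capitalization and averages to zero within every industry, which is what "neutralized" means in this sketch.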
After preprocessing, this paper tests the validity of the factor. The backtest period runs from June 2013 to March 2020. The validity test is divided into three steps: the long-short portfolio method, the IC and IR values, and the cumulative rate of return. In the long-short portfolio method, the factor value of each stock is first calculated at each cross-section, the stocks are then sorted by factor value, the top-ranked stocks are held long and the bottom-ranked stocks are sold short, and the annualized rate of return, Sharpe ratio, and maximum drawdown of the long-short portfolio are calculated. The return chart of the BP factor's long-short portfolio fluctuates up and down, and every indicator performs poorly. The annualized rate of return is negative in 2013, 2015, 2019, and 2020, and the worst year loses 35%. The maximum drawdown is also poor, reaching 36% at its worst; the Sharpe ratio takes a number of negative values; and most of the IC values are negative, which this paper takes to mean that the factor value is not related to stock returns. It is inferred that the BP factor is not a good factor to use alone in this backtest period.
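A minimal sketch of the first two checks follows. It is written under assumptions the abstract does not state: the IC is taken to be a rank (Spearman-style) correlation between factor values and next-period returns, the IR is taken to be the mean IC divided by its standard deviation, and the long and short legs are each an equal-weighted 20% of the stocks. All function names are hypothetical.

```python
import numpy as np

def rank(x):
    """Ordinal ranks 0..n-1 (ties broken by position, which is fine for a sketch)."""
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(len(x))
    return ranks

def rank_ic(factor, next_ret):
    """Correlation between factor ranks and next-period return ranks."""
    return np.corrcoef(rank(factor), rank(next_ret))[0, 1]

def long_short_return(factor, next_ret, frac=0.2):
    """Equal-weighted return of the top `frac` of stocks minus the bottom `frac`."""
    k = max(1, int(len(factor) * frac))
    order = np.argsort(factor, kind="stable")
    return next_ret[order[-k:]].mean() - next_ret[order[:k]].mean()

def information_ratio(ic_series):
    """Mean IC divided by the sample standard deviation of IC."""
    ic = np.asarray(ic_series, dtype=float)
    return ic.mean() / ic.std(ddof=1)

# Hypothetical cross-section in which the factor orders returns perfectly.
factor = np.array([0.1, 0.5, -0.3, 0.9, 0.0])
next_ret = np.array([0.01, 0.03, -0.02, 0.05, 0.00])

ic = rank_ic(factor, next_ret)            # perfect ordering, so the IC is 1
ls = long_short_return(factor, next_ret)  # 0.05 - (-0.02) = 0.07
ir = information_ratio([0.05, 0.02, -0.01, 0.04])
```

In a real backtest these functions would be applied at every rebalancing date, and the resulting series of long-short returns and IC values would then be accumulated over the whole period.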
After testing the BP factor, this paper gives two examples of effective factors, namely the pure technical factor Alpha_3 and the combined fundamental and price-volume factor ROE_return_decay. When the above validity tests are applied to them, the cumulative rate of return of their long-short portfolios and the cumulative IC value rise steadily; the annualized rate of return is excellent, the maximum drawdown is less than 10%, and the Sharpe ratio is almost always positive.
Next, this paper uses the factor backtest method to select seven effective factors, uses the scoring method to construct a multi-factor stock selection model, and uses this model to select the investment portfolio. Between June 2013 and March 2020, the annualized rate of return of the selected portfolio exceeded 20% in many years, the maximum drawdown was as low as 2% or 3% in a number of years, and the Sharpe ratio is considerable. This paper takes these results as evidence that the seven selected factors and the multi-factor stock selection model constructed here have investment value.
Key words: quantitative investment; multi-factor stock selection; long-short portfolio method
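Contents
Chapter 1 Introduction
1.1 Quantitative Investment and the "Holy Grail"
1.2 Research Methods
1.3 Research Significance
Chapter 2 Literature Review
2.1 Classic Theories of Quantitative Investment
2.2 Review of the International Literature
2.3 Review of the Chinese Literature
Chapter 3 Factor Data Processing
3.1 Data Selection
3.2 Data Preprocessing
3.2.1 Missing-Value Handling
3.2.2 Extreme-Value Handling
3.2.3 Standardization
3.2.4 Neutralization
Chapter 4 Factor Validity Testing
4.1 Long-Short Portfolio Method
4.2 IC and IR Values
4.3 Cumulative Return
4.4 Examples of Effective Factors
4.4.1 Price-Volume Factor Alpha_3
4.4.2 Combined Fundamental and Price-Volume Factor ROE_return_decay
Chapter 5 Multi-Factor Stock Selection Model
5.1 Building the Factor Pool
5.2 Regression-Based Multi-Factor Model
5.3 Scoring-Based Multi-Factor Model
5.4 Test Case: A Seven-Factor Model
Chapter 6 Conclusion
6.1 Main Conclusions
6.2 Future Work
References
Acknowledgments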
Chapter 1 Introduction
1.1 Quantitative Investment and the "Holy Grail"
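Quantitative investment refers to selecting stocks by means of mathematical statistics and computer-based analysis. In recent years, with the ongoing liberalization of financial markets and major advances in computing, quantitative investment has entered a golden age. The launch of stock index futures in 2010 ended the era in which short selling was impossible, which greatly advanced hedging strategies and the growth of quantitative hedge funds. The prolonged weakness of the stock market in recent years has also led more and more investors to turn their attention to quantitative investment as a rising star.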
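The "Holy Grail" refers to a strategy that always wins and never loses. In terms of performance metrics, this means an extremely low maximum drawdown together with a positive annualized return and a high Sharpe ratio. The alpha strategy in quantitative investment pursues exactly this kind of extreme hedging: hedge away all risk and earn a stable alpha return. Backtesting and live trading of quantitative strategies run into all kinds of problems. Some arise from flaws in how the strategy's model was built; others are risks outside the model, such as the current epidemic or the Wenchuan earthquake, which are unforeseeable disasters. How to cope with such risks and still earn stable returns is a question every quant must consider, and the golden "Holy Grail" hanging on the horizon is the dream that everyone pursues for a lifetime.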
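The three metrics used above to characterize the "Holy Grail" can be computed from a series of per-period returns roughly as follows. This is a sketch under assumed conventions that the text does not specify: simple (not log) returns, geometric annualization, a per-period risk-free rate defaulting to zero, and maximum drawdown reported as a positive fraction of the running peak.

```python
import numpy as np

def annualized_return(returns, periods_per_year=252):
    """Geometric annualized return from per-period simple returns."""
    growth = np.prod(1.0 + np.asarray(returns))
    return growth ** (periods_per_year / len(returns)) - 1.0

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio; `risk_free` is a per-period rate."""
    excess = np.asarray(returns) - risk_free
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

def max_drawdown(returns):
    """Largest peak-to-trough decline of the net value, as a positive fraction."""
    nav = np.cumprod(1.0 + np.asarray(returns))
    nav = np.concatenate([[1.0], nav])  # start from a net value of 1
    running_peak = np.maximum.accumulate(nav)
    return np.max(1.0 - nav / running_peak)

# Hypothetical quarterly returns: net value goes 1 -> 1.1 -> 0.88 -> 0.924 -> 1.0164.
quarterly = [0.10, -0.20, 0.05, 0.10]

ann = annualized_return(quarterly, periods_per_year=4)  # one full year: 1.64%
mdd = max_drawdown(quarterly)                           # fall from 1.1 to 0.88: 20%
sr = sharpe_ratio(quarterly, periods_per_year=4)
```

The toy series shows why the three numbers have to be read together: the year ends with a small positive return, yet an investor who entered at the peak would have been down 20% at the trough.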
1.2 Research Background
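Portfolio management can be active or passive. Active management means deliberately selecting stocks, through fundamental analysis, technical analysis, and similar methods, so that the portfolio's return exceeds the benchmark return. Active management is further divided into qualitative and quantitative management: in qualitative management, the portfolio manager builds the portfolio from experience-based judgments about the market and company fundamentals; in quantitative management, the portfolio is built with mathematical and computational methods from publicly available price-volume, fundamental, and macroeconomic data. Passive management means that the portfolio tracks an index benchmark as closely as possible: the passive manager adjusts the portfolio according to the index's constituents and weights so that its performance is essentially the same as the index's, which means a passively managed portfolio can hardly beat the market.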