Graduation Thesis: Predicting Purchase Trends for Convergent Commodities
2021-04-05 00:26:31
Abstract
As internet technology has developed, new ways of living have emerged, and online shopping has gradually been displacing brick-and-mortar retail as a primary mode of consumption. The largest and most popular online shopping platforms, such as Taobao, Amazon, and JD.com (Jingdong), have attracted large numbers of merchants and users. In such a virtual marketplace it becomes necessary to understand the characteristics of the users who buy a product, to study how the sales of convergent commodities (products with high sales volume and a broad buyer base) relate to those characteristics, and from that relationship to predict the product's future purchase trend.
This experiment takes White Rabbit (大白兔) milk candy, a recently viral product with many buyers, as its object of study. Seven characteristics were surveyed for 5,000 users who purchased it: age, gender, marital status, education level, housing status, monthly income, and Ant credit score. The relationship between the candy's sales and these buyer characteristics was then studied by training and predicting with a gradient boosting decision tree (GBDT) model, which this thesis calls the iterative regression tree model, in order to derive the candy's purchase trend.
The model was trained in two ways: directly on a randomly drawn training sample, and with cross-validation. Using only the seven customer indicators to predict product sales, the goodness of fit exceeded 0.5, which indicates that the GBDT model performs reasonably well. Comparing the random-sample training results with the cross-validation results, the differences in both feature importance and goodness of fit are very small, so the model is stable. The study gives a preliminary account of the relationship between user characteristics and purchasing behavior and meets its intended goals.
Keywords: convergent commodities; purchase trend; gradient boosting decision tree (GBDT); prediction
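Contents
Chapter 1 Introduction
1.1 Research Purpose and Significance
1.2 Research Status at Home and Abroad
1.3 Research Content
Chapter 2 Required Environment and Technologies
2.1 Introduction to Machine Learning
2.2 Introduction to the SK-learn (scikit-learn) Library
2.3 Introduction to the Iterative Regression Tree Model (GBDT)
Chapter 3 Design of the Prediction System
3.1 Basic Steps for Predicting Purchase Trends of Convergent Commodities
3.1.1 Selecting and Defining the Prediction Target
3.1.2 Collecting and Organizing Data
3.1.3 Choosing the Algorithm and Building the Model
3.1.4 Validation and Evaluation
3.2 Comparison and Evaluation of Methods
3.2.1 Direct Prediction Results
3.2.2 3-Fold Cross-Validation Prediction Results
3.3 Analysis of Prediction Results
Chapter 4 Code Implementation of Selected Functions
4.1 Implementation of Training and Prediction
4.2 Implementation of Validation
4.3 Basic Parameter Tuning
Chapter 5 Conclusion
References
Acknowledgments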
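Chapter 1 Introduction
Online shopping is a new mode of consumption, and since the start of the 21st century online shopping platforms have sprung up around the world. Because online shopping is not tied to a location and takes place in a virtual space, these platforms reach far more buyers than an ordinary physical store. For a product that is especially popular with users, that is, a convergent commodity, it is worth analyzing the similarities and differences among its buyers alongside its sales: which kinds of users are more inclined to buy the product, or, put differently, what specific effect user characteristics have on sales volume. Adjusting product listings in light of that analysis can raise merchants' profits and make shopping more convenient for consumers. Using computing techniques to analyze the relationship between user characteristics and product sales, and to predict future sales, has therefore become an essential research topic.
Machine learning is at the core of artificial intelligence. Unlike earlier manual forecasting, analyzing historical data with machine learning can produce predictions more conveniently and more accurately. The gradient boosting decision tree (GBDT), referred to in this thesis as the iterative regression tree model, is an important machine learning algorithm [1]. In this study we use it to analyze the personal characteristics of 5,000 buyers of White Rabbit milk candy together with sales data, in order to explore and predict the purchase trend of a convergent commodity.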
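As a rough illustration of the workflow this chapter introduces, the following Python sketch fits a GBDT regressor on seven buyer features and scores it both on a held-out split and with 3-fold cross-validation. It is not the thesis's code or data. The 5,000 records are synthetic, and the column names, the integer encodings, the target formula, the 70/30 split, and the helper names `make_synthetic_buyers` and `fit_and_evaluate` are all assumptions made for illustration. Only scikit-learn's `GradientBoostingRegressor`, `train_test_split`, and `cross_val_score` are real library calls.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score, train_test_split

# Hypothetical column names for the seven buyer characteristics.
FEATURES = [
    "age", "gender", "marital_status", "education",
    "housing", "monthly_income", "ant_credit_score",
]


def make_synthetic_buyers(n=5000, seed=0):
    """Generate placeholder data; the real survey data is not available here."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.integers(18, 70, n),       # age in years
        rng.integers(0, 2, n),         # gender, assumed 0/1 encoding
        rng.integers(0, 2, n),         # marital status, assumed 0/1 encoding
        rng.integers(0, 4, n),         # education level, assumed 4 codes
        rng.integers(0, 3, n),         # housing status, assumed 3 codes
        rng.normal(8000, 2500, n),     # monthly income, assumed distribution
        rng.integers(350, 951, n),     # credit score, assumed range
    ]).astype(float)
    # Made-up target standing in for a sales figure, with added noise.
    y = 0.002 * X[:, 5] + 0.01 * X[:, 6] - 0.05 * X[:, 0] + rng.normal(0, 2, n)
    return X, y


def fit_and_evaluate(X, y, seed=0):
    """Train once on a random split, then run 3-fold cross-validation."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = GradientBoostingRegressor(random_state=seed)
    model.fit(X_train, y_train)
    holdout_r2 = model.score(X_test, y_test)  # R^2 on the held-out 30%
    cv_r2 = cross_val_score(
        GradientBoostingRegressor(random_state=seed), X, y, cv=3, scoring="r2"
    )
    return model, holdout_r2, cv_r2


if __name__ == "__main__":
    X, y = make_synthetic_buyers()
    model, holdout_r2, cv_r2 = fit_and_evaluate(X, y)
    print("hold-out R^2:", round(holdout_r2, 3))
    print("3-fold CV R^2:", np.round(cv_r2, 3))
    for name, importance in zip(FEATURES, model.feature_importances_):
        print(f"{name}: {importance:.3f}")
```

Comparing `holdout_r2` with the mean of `cv_r2`, and the feature importances across runs, mirrors the stability check described in the abstract. The numbers this sketch prints describe only the synthetic data and say nothing about the real candy data.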