Abstract
To address the non-stationarity of daily water supply time series and the complexity of their coupled features, a random forest model based on scale feature fusion (SF-RF) was constructed by combining wavelet decomposition with random forest regression. First, the discrete wavelet transform was used to decompose the original single-scale time series into low- and high-frequency feature subsequences, extracting multi-scale information from the coupled features. Second, a random forest regression model was fitted to the features at each scale. Finally, the fitted results at all scales were fused linearly to obtain the overall predicted value. The features at the highest-frequency scale were excluded from the forecast. Compared with the single RF model, a feed-forward neural network (FFNN), and the fusion model SF-FFNN, the SF-RF model achieved the highest correlation coefficient (0.913) and the lowest normalized root mean square error (0.056), indicating the highest forecasting accuracy among the four models; it can be used for urban daily water supply forecasting.
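The three steps described in the abstract (decompose, fit per scale, fuse linearly while skipping the highest-frequency scale) can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption made for illustration, not the authors' implementation: the Haar wavelet, the two decomposition levels, the three lagged inputs, the toy random forest (bootstrap samples, one random feature per split), the unweighted sum used as the linear fusion, and the normalisation of the RMSE by the observed range. All function names are hypothetical. For brevity the decomposition is applied to the whole series, so the block means see test-period values; a real forecast would have to avoid that.

```python
import math
import random

def haar_components(x, levels):
    """Additive Haar multi-resolution components [A_L, D_L, ..., D_1]; they sum to x."""
    approx, details = list(x), []
    for j in range(1, levels + 1):
        width, smooth = 2 ** j, []
        for start in range(0, len(x), width):
            seg = x[start:start + width]
            smooth.extend([sum(seg) / len(seg)] * len(seg))  # block mean = Haar approximation
        details.append([a - s for a, s in zip(approx, smooth)])  # D_j = A_(j-1) - A_j
        approx = smooth
    return [approx] + details[::-1]

def lagged(series, lags):
    """Inputs are the previous `lags` values; the target is the next value."""
    return [series[t - lags:t] for t in range(lags, len(series))], series[lags:]

def sse(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def grow_tree(X, y, depth, rng):
    """Toy regression tree: one randomly chosen feature per split."""
    if depth == 0 or len(y) < 4 or max(y) == min(y):
        return sum(y) / len(y)  # leaf holds the mean target
    f, best = rng.randrange(len(X[0])), None
    for thr in sorted({row[f] for row in X})[1:]:
        lo = [i for i, row in enumerate(X) if row[f] < thr]
        hi = [i for i, row in enumerate(X) if row[f] >= thr]
        cost = sse([y[i] for i in lo]) + sse([y[i] for i in hi])
        if best is None or cost < best[0]:
            best = (cost, thr, lo, hi)
    if best is None:  # the chosen feature is constant in this node
        return sum(y) / len(y)
    _, thr, lo, hi = best
    return (f, thr,
            grow_tree([X[i] for i in lo], [y[i] for i in lo], depth - 1, rng),
            grow_tree([X[i] for i in hi], [y[i] for i in hi], depth - 1, rng))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, thr, lo, hi = node
        node = lo if row[f] < thr else hi
    return node

def fit_forest(X, y, n_trees=10, depth=4, seed=0):
    rng, trees = random.Random(seed), []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]  # bootstrap sample
        trees.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], depth, rng))
    return trees

def predict_forest(trees, row):
    return sum(predict_tree(t, row) for t in trees) / len(trees)

def sf_rf_forecast(series, levels=2, lags=3, n_test=16):
    """One-step-ahead forecasts for the last n_test points, fused across scales."""
    comps = haar_components(series, levels)[:-1]  # drop D_1, the highest-frequency scale
    fused = [0.0] * n_test
    for comp in comps:
        X, y = lagged(comp, lags)
        trees = fit_forest(X[:-n_test], y[:-n_test])
        for k, row in enumerate(X[-n_test:]):
            fused[k] += predict_forest(trees, row)  # linear fusion as a plain sum (assumption)
    return fused, series[-n_test:]

def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))

def nrmse(obs, pred):
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return rmse / (max(obs) - min(obs))  # normalised by the observed range (assumption)

if __name__ == "__main__":
    noise = random.Random(1)
    # Synthetic stand-in for a daily supply series: weekly cycle, slow trend, noise.
    demand = [100 + 10 * math.sin(2 * math.pi * t / 7) + 0.05 * t + noise.gauss(0, 1)
              for t in range(128)]
    pred, obs = sf_rf_forecast(demand)
    print("r =", round(pearson_r(obs, pred), 3), "NRMSE =", round(nrmse(obs, pred), 3))
```

In practice the hand-rolled pieces would likely be replaced by library routines, for example a discrete wavelet transform from PyWavelets and `RandomForestRegressor` from scikit-learn; the numbers printed on the synthetic series say nothing about the results reported in the abstract.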
Key words
daily water supply /
wavelet transform /
random forest /
forecasting model /
scale feature
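Funding
National Natural Science Foundation of China (71801044); Humanities and Social Sciences Research Project of the Ministry of Education (17YJC630003); Natural Science Foundation of Chongqing (cstc2018jcyjAX0436)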