Journal (院报) ›› 2025, Vol. 42 ›› Issue (6): 60-70. DOI: 10.11988/ckyyb.20240254

• Water Environment and Water Ecology •

Application of a Hybrid Model Based on CEEMDAN and IMSA in Water Quality Prediction

GUO Li-jin1,2, WU Hao-tian1,2

  1 School of Control Science and Engineering, Tiangong University, Tianjin 300387, China
  2 Tianjin Key Laboratory of Intelligent Control of Electrical Equipment, Tiangong University, Tianjin 300387, China
  • Received: 2024-03-15 Revised: 2024-05-22 Published: 2025-06-01 Online: 2025-06-01
  • Corresponding author:
    WU Hao-tian (b. 2000), male, from Tangshan, Hebei; master's student; research interest: control engineering. E-mail:
  • About the author:
    GUO Li-jin (b. 1970), male, from Huanggang, Hubei; professor, Ph.D., master's supervisor; research interest: detection and control of process parameters. E-mail:
  • Funding:
    National Natural Science Foundation of China (52077155)

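Summary:

Water quality prediction is an important part of water pollution prevention and control, but water quality series are strongly random and non-stationary. To further improve the accuracy of surface water quality prediction, a new hybrid prediction model is proposed. First, Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) decomposes the original water quality series. Fuzzy Dispersion Entropy (FuzzDE) then divides the components into high-, medium-, and low-complexity parts. Next, a Bidirectional Long Short-Term Memory network (BiLSTM), Least Squares Support Vector Regression (LSSVR), and an Extreme Learning Machine (ELM), each optimized by the Improved Mantis Search Algorithm (IMSA), predict the high-, medium-, and low-complexity parts respectively, and the predictions are combined and reconstructed. Finally, a BiLSTM error correction model corrects the remaining error to give the final prediction. Simulations use dissolved oxygen concentrations from two sections of the Youshui River, a tributary of the Yuanjiang River, and pH values from one section in the Xiangjiang River Basin. R² exceeds 90%, and the results show that the hybrid model predicts more accurately than the other comparison models.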


Abstract:

[Objectives] To improve the accuracy of water quality prediction, this study addresses four challenges: (1) traditional prediction methods often rely on elementary decomposition techniques, which limits their ability to extract meaningful data features; (2) single models and basic optimization algorithms yield low prediction accuracy; (3) most approaches do not exploit the strengths of different networks for components of varying complexity, so models are used inefficiently; and (4) few studies apply error correction after prediction. To address these challenges, the study proposes a novel hybrid model for water quality prediction. [Methods] First, the original water quality sequence was decomposed using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN). Next, Fuzzy Dispersion Entropy (FuzzDE) was used to categorize the components into high-, medium-, and low-complexity subsequences. Then, an Improved Mantis Search Algorithm (IMSA) optimized three distinct models: Bidirectional Long Short-Term Memory (BiLSTM) for high-complexity components, Least Squares Support Vector Regression (LSSVR) for medium-complexity components, and Extreme Learning Machine (ELM) for low-complexity components. The three sets of predictions were combined and reconstructed, and a BiLSTM-based error correction model then corrected the remaining errors to yield the final prediction results.
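The grouping and reconstruction steps described under [Methods] can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the entropy scores, the cut-off values `low_cut` and `high_cut`, the summing of components within a group, and the `persistence_forecast` stand-in (used here in place of BiLSTM, LSSVR, and ELM) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence

Series = List[float]


def group_by_complexity(
    components: Sequence[Series],
    entropies: Sequence[float],
    low_cut: float,
    high_cut: float,
) -> Dict[str, Series]:
    """Sum components into high/medium/low subsequences by complexity score."""
    length = len(components[0])
    groups = {name: [0.0] * length for name in ("high", "medium", "low")}
    for comp, ent in zip(components, entropies):
        if ent >= high_cut:
            name = "high"
        elif ent >= low_cut:
            name = "medium"
        else:
            name = "low"
        groups[name] = [g + c for g, c in zip(groups[name], comp)]
    return groups


def persistence_forecast(history: Series) -> float:
    """Stand-in predictor (assumption): repeat the last observed value."""
    return history[-1]


def hybrid_one_step(
    groups: Dict[str, Series],
    predictors: Dict[str, Callable[[Series], float]],
) -> float:
    """Forecast each subsequence with its own model and add the forecasts."""
    return sum(predictors[name](series) for name, series in groups.items())


# Hypothetical decomposition output: three components of one short series.
components = [
    [0.3, -0.2, 0.1, -0.4],  # fast oscillation
    [0.5, 0.4, 0.2, 0.1],    # slower oscillation
    [7.0, 7.1, 7.2, 7.3],    # trend / residual
]
entropies = [0.9, 0.5, 0.1]  # hypothetical complexity scores, one per component

groups = group_by_complexity(components, entropies, low_cut=0.3, high_cut=0.7)
predictors = {name: persistence_forecast for name in groups}
forecast = hybrid_one_step(groups, predictors)
```

In a fuller version, each entry of `predictors` would be a separately trained model, and a further model fitted on the forecast errors would add its correction to `forecast`.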
[Results] The study introduced four improvements to the original Mantis Search Algorithm (MSA): (1) Logistic-Tent chaotic mapping for population initialization, which distributes the initial solutions uniformly and randomly to enhance global search capability and convergence speed; (2) a nonlinear acceleration factor, which refines MSA's core update formula so that the search shifts from global exploration to local exploitation and is less prone to entrapment in local optima; (3) an elite-guided adaptive update strategy, which reduces the excessive randomness of the position update when a mantis attack fails, improving late-stage search efficiency while preserving some randomness; and (4) opposition-based learning, which generates the opposite of the current individual to enhance global optimization. IMSA's performance was validated on benchmark functions (Rosenbrock as the unimodal case, Michalewicz as the multimodal case), confirming improved global search and convergence precision. After the network hyperparameters were determined, ablation experiments were conducted to quantify how much each strategy contributes to prediction performance. Finally, the assignment of models to components was validated: FuzzDE was used to calculate the complexity of each component and to form high-, medium-, and low-complexity subsequences, and the learning capabilities of the different networks on these subsequences were verified, with BiLSTM used for high-complexity components, LSSVR for medium-complexity components, and ELM for low-complexity components. [Conclusions] This study performed a simulation verification using dissolved oxygen (DO) concentrations from two sections of the Youshui River (a tributary of the Yuanjiang River) and pH values from one station in the Xiangjiang River Basin. Missing values were addressed via linear interpolation.
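Two of the strategies listed under [Results], chaotic initialization and opposition-based learning, can be illustrated with a short sketch. The abstract does not give the formulas, so the code assumes one commonly published form of the Logistic-Tent map and the standard opposite point `lb + ub - x`; the parameter values (`x0`, `r`, the bounds, the population size) are hypothetical, and the nonlinear acceleration factor and elite-guided update are not shown.

```python
from typing import Callable, List


def logistic_tent_sequence(x0: float, r: float, n: int) -> List[float]:
    """Iterate an assumed form of the Logistic-Tent map; values lie in [0, 1)."""
    xs, x = [], x0
    for _ in range(n):
        if x < 0.5:
            x = (r * x * (1.0 - x) + (4.0 - r) * x / 2.0) % 1.0
        else:
            x = (r * x * (1.0 - x) + (4.0 - r) * (1.0 - x) / 2.0) % 1.0
        xs.append(x)
    return xs


def init_population(
    pop_size: int, dim: int, lb: float, ub: float, x0: float = 0.37, r: float = 3.7
) -> List[List[float]]:
    """Scale the chaotic sequence into the search box [lb, ub]."""
    chaos = logistic_tent_sequence(x0, r, pop_size * dim)
    return [
        [lb + chaos[i * dim + j] * (ub - lb) for j in range(dim)]
        for i in range(pop_size)
    ]


def opposite(individual: List[float], lb: float, ub: float) -> List[float]:
    """Opposition-based learning: reflect each coordinate within [lb, ub]."""
    return [lb + ub - v for v in individual]


def rosenbrock(x: List[float]) -> float:
    """Rosenbrock benchmark; its minimum is 0 at x = (1, ..., 1)."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


def obl_select(
    pop: List[List[float]],
    lb: float,
    ub: float,
    fitness: Callable[[List[float]], float],
) -> List[List[float]]:
    """Keep the better of each individual and its opposite (minimization)."""
    return [min(ind, opposite(ind, lb, ub), key=fitness) for ind in pop]


pop = init_population(pop_size=6, dim=3, lb=-5.0, ub=5.0)
improved = obl_select(pop, -5.0, 5.0, rosenbrock)
```

Because `obl_select` keeps whichever of the two candidates scores lower, no individual in `improved` is worse than its counterpart in `pop`.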
For outlier treatment, the study considered that outliers in the data might be caused by sudden pollution events or discontinuous non-point source pollution. Because removing them directly could lose information, the outliers were retained. Eleven comparative experiments covering decomposition, entropy-based grouping, algorithm optimization, and error correction were set up to evaluate the effectiveness of each optimization method. The hybrid model was evaluated using the RMSE, R², and MAPE metrics. R² exceeded 90%, demonstrating that the hybrid model predicts more accurately than the other comparative models.
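The preprocessing and evaluation steps mentioned above use standard definitions, which a minimal sketch can make concrete. The function names are hypothetical, the sample values are invented for illustration, and the handling of gaps at the start or end of a series (copying the nearest observation) is an assumption, since the abstract does not say how such gaps were treated.

```python
import math
from typing import List, Optional, Sequence


def fill_linear(values: Sequence[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest observations."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    out: List[float] = []
    for i, v in enumerate(values):
        if v is not None:
            out.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:       # leading gap: copy the first observation
            out.append(float(values[right]))
        elif right is None:    # trailing gap: copy the last observation
            out.append(float(values[left]))
        else:
            w = (i - left) / (right - left)
            out.append(values[left] + w * (values[right] - values[left]))
    return out


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r2(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def mape(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; assumes no zero in y."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)


filled = fill_linear([8.0, None, 7.0, None])  # invented sample values
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
scores = (rmse(observed, predicted), r2(observed, predicted), mape(observed, predicted))
```

An R² of 0.9 on this scale corresponds to the "90%" quoted in the abstract.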

Key words: water quality prediction, CEEMDAN decomposition, fuzzy dispersion entropy, Mantis Search Algorithm, hybrid model

CLC Number:
