21 July 2024 Air quality prediction and early warning based on Prophet-XGBoost combined model
Zihao Cui, Chao Huang, Zhiyu Huang, Aojie Chen
Proceedings Volume 13219, Fourth International Conference on Applied Mathematics, Modelling, and Intelligent Computing (CAMMIC 2024); 1321926 (2024) https://doi.org/10.1117/12.3036677
Event: 4th International Conference on Applied Mathematics, Modelling and Intelligent Computing (CAMMIC 2024), 2024, Kaifeng, China
Abstract
Air pollution poses significant risks to human health, ecosystems, and socio-economic stability. Accurately predicting PM2.5 concentration and the Air Quality Index (AQI) is crucial for understanding pollution factors and devising effective control measures. This paper addresses Question C of the 2023 Huazhong Cup, predicting air quality with a multivariate hybrid model, Prophet-XGBoost, which combines the Prophet time series decomposition algorithm with the XGBoost machine learning model. For Problem 1, the study applied KNN interpolation and IQR outlier removal to the data in Annex 1 and Annex 2 to eliminate missing values and outliers in the meteorological data, and then standardised the data. Random Forest Regression (RFR) was then used to select the features related to changes in PM2.5 concentration. The Random Forest model was first trained to determine its decision trees and the optimal number of leaves, and regression analysis then gave the importance of each feature to PM2.5 concentration. The three main features selected were PM10, average temperature, and CO, with importance scores of 0.7742, 0.1075, and 0.0910, respectively. In the model evaluation, comparison with a multiple linear regression model demonstrated the accuracy of the Random Forest model, and Pearson's correlation coefficients confirmed its reasonableness. For Problems 2 and 3, when constructing the multi-step model, the study split the data into training and test sets at a ratio of 8:2 and first produced predictions for the known data with the Prophet time series decomposition algorithm and the XGBoost machine learning model separately.
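The outlier-removal and standardisation steps described for Problem 1 can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the 1.5 x IQR fence, and the toy readings are assumptions made for illustration and are not taken from the paper. The KNN interpolation step is not shown.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional fence; the paper does not state its value.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def standardise(values):
    """Z-score standardisation: subtract the mean, divide by the std. deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardise a constant series")
    return [(v - mean) / std for v in values]


# Made-up readings for illustration only; 95 is an obvious outlier.
readings = [10, 12, 11, 13, 12, 95]
cleaned = iqr_filter(readings)
scaled = standardise(cleaned)
```

After filtering, `cleaned` no longer contains the extreme reading, and `scaled` has zero mean and unit standard deviation, so features measured in different units become comparable before feature selection.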
The results show that each of the two models has its own advantages and disadvantages. To obtain predictions that capture both the cyclical and seasonal changes of the meteorological features and their unstable, nonlinear characteristics, the study combined the two into the Prophet-XGBoost combined prediction model. The evaluation indexes of the combined model are greatly improved compared with either single model, which supports the reasonableness of the hybrid prediction. Finally, the Prophet-XGBoost model was used to predict the PM2.5 concentration and AQI at the times given in Annex 3, and the air quality warning levels were determined from the prediction results, providing a useful reference for formulating more effective air quality management strategies.[1]
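The 8:2 split and the comparison between single and combined models can be sketched as follows. This is a hedged illustration only: the abstract does not say how the Prophet and XGBoost outputs are combined or which evaluation indexes are used, so the equal-weight average, the choice of RMSE and MAE, and all numbers below are placeholder assumptions rather than the authors' method or results.

```python
import math


def chronological_split(series, train_ratio=0.8):
    """Split a time-ordered series into train/test without shuffling."""
    cut = int(round(len(series) * train_ratio))
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def combine(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two forecasts (placeholder combination rule)."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Ten time-ordered observations: eight for training, two held out.
train, test = chronological_split(list(range(10)))

# Made-up held-out values and forecasts from two hypothetical models.
actual = [10, 20, 30, 40]
model_a = [12, 18, 33, 37]   # stand-in for a seasonal/trend model
model_b = [8, 23, 28, 44]    # stand-in for a tree-based model
hybrid = combine(model_a, model_b)

for name, pred in [("A", model_a), ("B", model_b), ("hybrid", hybrid)]:
    print(name, round(rmse(actual, pred), 4), round(mae(actual, pred), 4))
```

The split is chronological because shuffling a time series would leak future observations into the training set. In this toy case the two models err in opposite directions, so the average has lower error than either one; real data carries no such guarantee.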
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Zihao Cui, Chao Huang, Zhiyu Huang, and Aojie Chen "Air quality prediction and early warning based on Prophet-XGBoost combined model", Proc. SPIE 13219, Fourth International Conference on Applied Mathematics, Modelling, and Intelligent Computing (CAMMIC 2024), 1321926 (21 July 2024); https://doi.org/10.1117/12.3036677
KEYWORDS
Data modeling
Atmospheric modeling
Random forests
Decision trees
Education and training
Air quality
Correlation coefficients