Assessing and predicting water quality index with key water parameters by machine learning models in coastal cities, China
Xu, Jing ; Mo, Yuming ; Zhu, Senlin ; Wu, Jinran ; Jin, Guangqiu ; Wang, You-Gan ; Ji, Qingfeng ; Li, Ling
Abstract
The water quality index (WQI) is a widely used tool for comprehensive assessment of river environments. However, its calculation involves numerous water quality parameters, making sample collection and laboratory analysis time-consuming and costly. This study aimed to identify key water parameters and the most reliable prediction models that could provide maximum accuracy using minimal indicators. Water quality data from 2020 to 2023, covering nine biophysical and chemical indicators, were collected from seventeen rivers in Yancheng and Nantong, two coastal cities in Jiangsu Province, China, adjacent to the Yellow Sea. Linear regression and seven machine learning models (Artificial Neural Network (ANN), Self-Organizing Maps (SOM), K-Nearest Neighbor (KNN), Support Vector Machines (SVM), Random Forest (RF), Extreme Gradient Boosting (XGB) and Stochastic Gradient Boosting (SGB)) were developed to predict WQI using different groups of input variables based on correlation analysis. The results indicated that water quality improved from 2020 to 2022 but deteriorated in 2023, with inland stations exhibiting better conditions than coastal ones, particularly in terms of turbidity and nutrients. The water environment was comparatively better in Nantong than in Yancheng, with mean WQI values of approximately 55.3–72.0 and 56.4–67.3, respectively. The classifications "Good" and "Medium" accounted for 80 % of the records, with no instances of "Excellent" and 2 % classified as "Bad". The performance of all prediction models, except for SOM, improved with the addition of input variables, achieving R² values higher than 0.99 in models such as SVM, RF, XGB, and SGB. The most reliable models were RF and XGB with key parameters of total phosphorus (TP), ammonia nitrogen (AN), and dissolved oxygen (DO) (R² = 0.98 and 0.91 for the training and testing phases, respectively) for predicting WQI values, and RF using TP and AN (accuracy higher than 85 %) for WQI grades. The prediction accuracy for "Medium" and "Low" water quality grades was highest at 90 %, followed by the "Good" level at 70 %. The model results could contribute to efficient water quality evaluation by identifying key water parameters and facilitating effective water quality management in river basins.
Keywords
Water quality, Key water parameters, Water quality index (WQI), Machine learning models, Coastal cities
Date
2024
Type
Journal article
Journal
Book
Volume
10
Issue
13
Page Range
1-19
Article Number
ACU Department
Institute for Positive Psychology and Education
Faculty of Education and Arts
Institute for Learning Sciences and Teacher Education (ILSTE)
Relation URI
Event URL
Open Access Status
Published as ‘gold’ (paid) open access
License
File Access
Open
Notes
© 2024 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Funding: This research was supported by the Belt and Road Special Foundation of The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China (2021491811, 2022491411).
