Predictive Modeling of Iron Concentration in Groundwater Using Machine Learning Techniques: A Case Study in Part of Yenagoa, Bayelsa State
DOI:
https://doi.org/10.63623/ngc6ah56
Keywords:
Iron concentration, Groundwater, Machine learning, Spatial analysis, Regression
Abstract
This study aimed to model and predict iron concentrations in groundwater within Yenagoa, Bayelsa State, Nigeria, using machine learning (ML) techniques. It focused on evaluating spatial variability and determining the most influential predictors to support groundwater quality management. A total of 50 groundwater samples were collected from spatially distributed boreholes across multiple towns in Yenagoa, and the geolocation and iron concentration of each sample were recorded. Two supervised ML models, Multiple Linear Regression (MLR) and Random Forest Regression (RFR), were implemented. One-hot encoding was applied to the categorical town data, and the models were evaluated using the coefficient of determination (R²), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). Feature importance was assessed to identify key predictors. A geospatial heatmap was developed using Inverse Distance Weighting (IDW) to visualize spatial trends. The MLR model slightly outperformed the RFR model, achieving an R² of 0.92, an MAE of 0.13 mg/L, and an RMSE of 0.15 mg/L. Longitude and specific towns (notably Beta and Opolo) emerged as dominant predictors, confirming spatial clustering of high iron concentrations in the eastern region of the study area. Cross-validation confirmed the models' robustness. The findings support the use of ML techniques for cost-effective water quality prediction and spatial monitoring. This study introduces a hybrid geo-categorical modeling approach that integrates both spatial coordinates and administrative town identifiers into ML frameworks. It demonstrates the feasibility of lightweight, interpretable models such as MLR for real-time deployment in low-resource settings, offering a replicable solution for groundwater quality assessment in data-scarce regions. Future research should expand the dataset and explore additional hydrogeological variables to enhance model robustness.
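As a rough illustration of two steps named in the abstract, the sketch below computes R², MAE, and RMSE for a set of predictions and estimates a value at an unsampled location with IDW. It is not the authors' implementation. The function names, the IDW power of 2, and the use of plain planar distance on the coordinates are all assumptions, since the abstract does not specify them, and the sample values in the demo are made up for illustration only.

```python
import math


def regression_metrics(observed, predicted):
    """Return R², MAE and RMSE for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares used by R² and RMSE.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


def idw_interpolate(samples, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    samples: list of (x, y, value) tuples for the sampled boreholes.
    power:   distance exponent; 2.0 is an assumed, commonly used default.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)  # planar distance (assumption)
        if dist == 0.0:
            # The query point coincides with a sample: return it directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical values, NOT data from the study.
    observed = [0.4, 0.9, 1.5, 2.1]
    predicted = [0.5, 0.8, 1.4, 2.3]
    print(regression_metrics(observed, predicted))

    boreholes = [(0.0, 0.0, 0.4), (1.0, 0.0, 0.9), (0.0, 1.0, 1.5)]
    print(idw_interpolate(boreholes, 0.5, 0.5))
```

Evaluating `idw_interpolate` over a regular grid of points would give the kind of surface a heatmap is drawn from. A larger `power` makes nearby boreholes dominate each estimate, so the choice of exponent affects how sharply the mapped hotspots appear.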
License
Copyright (c) 2025 Charles U. Akajiaku, Comfort Oyindamola Agbabiaka, Okes Imoni, Prince Chukwuemeka, Desmond R. Eteh, Meremu Dogiye Amos (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.