Insights into the comparison of machine learning models on rice grain arsenic prediction: Interplay of rice cultivation systems and soil environmental factors
Article Type
Research Article
Publication Title
Environmental Pollution
Abstract
Arsenic (As) accumulation in rice threatens food safety, and the transfer of As from paddy soils to rice is a major contributor to elevated As levels in the grain. This study aims to establish an efficient model for predicting As accumulation in rice grain using ensemble machine learning algorithms. Paddy-field samples, including soil and rice grain, were collected during the post-monsoon season from sixty plot locations in a rice-producing district of West Bengal (India). Field screening covered two rice cultivation systems: conventional flooding and intermittent watering. Model validation for rice grain As prediction was performed as a machine learning-based binary classification based on rice grown under the two cultivation systems. Random forest (RF), artificial neural network (ANN), and logistic regression (LR) models were used, with measured soil characteristics, including pH, organic carbon, and As-bound geochemical fractions, as predictor variables. The models were evaluated using the confusion matrix and the area under the receiver operating characteristic curve (ROC-AUC). Results indicate that RF exhibited the highest AUC (0.89), followed by ANN (0.86) and LR (0.75), on test data. ANN and RF demonstrated the best performance in terms of accuracy, kappa, F1 score, Matthews correlation coefficient, and log loss. Among the predictors, exchangeable As and amorphous iron-bound As were the most important variables influencing rice grain As accumulation. The limiting values estimated from the cut-off accuracy plot ranged from 162 μg kg−1 to 303 μg kg−1 for exchangeable As and from 1490 mg kg−1 to 1690 mg kg−1 for amorphous iron-bound As across RF and ANN.
This study demonstrates the usefulness of machine learning models for differentiating rice grain As availability based on soil environmental factors, which can help reduce As exposure in rice, with emphasis on soil environmental health and cultivation practices.
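The evaluation metrics named in the abstract (confusion matrix, accuracy, kappa, F1 score, Matthews correlation coefficient, log loss, and ROC-AUC) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the labels and predicted probabilities below are hypothetical toy values, the 0.5 decision threshold is an assumption, and the function names are introduced only for this illustration.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Threshold the probabilities (assumed cut-off) and compute summary metrics."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Log loss on clipped probabilities to avoid log(0).
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / n
    return {"confusion": (tp, tn, fp, fn), "accuracy": accuracy,
            "kappa": kappa, "f1": f1, "mcc": mcc, "log_loss": log_loss}

def roc_auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data, NOT values from the study:
# 1 = one class of the binary outcome, 0 = the other.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

metrics = classification_metrics(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(metrics)
print("AUC:", auc)
```

AUC is computed from the ranking of predicted probabilities and does not depend on the chosen threshold, whereas accuracy, kappa, F1 score, and the Matthews correlation coefficient all change with the cut-off.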
DOI
10.1016/j.envpol.2025.126646
Publication Date
9-15-2025
Recommended Citation
Majumder, Supriya and Banik, Pabitra, "Insights into the comparison of machine learning models on rice grain arsenic prediction: Interplay of rice cultivation systems and soil environmental factors" (2025). Journal Articles. 5432.
https://digitalcommons.isical.ac.in/journal-articles/5432