| Field | Value |
|---|---|
| Issue | E3S Web Conf., Volume 708, 2026. 7th International Conference on Smart Applications and Water Information Systems: “Intelligent Systems, Geospatial Technologies and Modeling for the Sustainable Management of Water Resources” (SAWIS 2025) |
| Article Number | 02002 |
| Number of page(s) | 10 |
| Section | Water Quality, Treatment, and Environmental Processes |
| DOI | https://doi.org/10.1051/e3sconf/202670802002 |
| Published online | 30 April 2026 |
Handling Multicollinearity in Hydrochemical Data: A Comparison of Penalized Regression Models
ENSA of Kenitra, Engineering Sciences Laboratory, Data Analysis, Mathematical Modeling and Optimization Team, Ibn Tofail University, 14000 Kenitra, Morocco
Abstract
This study examines the surface waters of the Inaouen watershed, where iron dynamics are governed by interrelated physico-chemical parameters. The resulting multicollinearity makes traditional regression models unreliable. To address this, we compare three penalized regression techniques: Ridge, Lasso, and Elastic Net. The techniques are applied to a dataset comprising the concentrations of HCO₃⁻, CaCO₃, Mg²⁺, Na⁺, K⁺, Cl⁻, Ca²⁺, SO₄²⁻, and Fe. Model performance is assessed using the root mean squared error (RMSE), the coefficient of determination (R²), the mean absolute error (MAE), and the AIC and BIC information criteria. The Lasso model performs best, with an RMSE of 2.52, an R² of 0.84, an MAE of 1.98, an AIC of 285.12, and a BIC of 293.84. It retains the important variables, such as Ca²⁺, Na⁺, and K⁺, and eliminates those that are not needed. The Elastic Net model performs similarly to the Lasso model, whereas the Ridge model performs worse than the other two.
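The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the data are synthetic (a shared latent factor is used to make the predictors collinear), the response is a hypothetical function of three predictors, and the use of scikit-learn, the cross-validated tuning, the train/test split, and the Gaussian AIC/BIC formula with the number of nonzero coefficients as the parameter count are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV, LassoCV, RidgeCV
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the hydrochemical table (NOT the Inaouen data).
rng = np.random.default_rng(0)
n = 120
names = ["HCO3", "CaCO3", "Mg", "Na", "K", "Cl", "Ca", "SO4"]
latent = rng.normal(size=(n, 1))  # shared factor that induces collinearity
X = 0.9 * latent + 0.3 * rng.normal(size=(n, len(names)))
# Hypothetical response driven by three predictors only (Ca, Na, K).
y = 2.0 * X[:, 6] - 1.5 * X[:, 3] + 1.0 * X[:, 4] + rng.normal(scale=0.5, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Penalty strengths are chosen by cross-validation (an assumption).
models = {
    "Ridge": RidgeCV(alphas=np.logspace(-3, 3, 50)),
    "Lasso": LassoCV(cv=5, random_state=0),
    "ElasticNet": ElasticNetCV(l1_ratio=[0.2, 0.5, 0.8], cv=5, random_state=0),
}


def information_criteria(y_true, y_pred, k):
    """Gaussian AIC and BIC from the residual sum of squares.

    k is the assumed number of fitted parameters; counting nonzero
    coefficients is only a rough proxy, especially for Ridge.
    """
    n_obs = len(y_true)
    rss = float(np.sum((y_true - y_pred) ** 2))
    aic = n_obs * np.log(rss / n_obs) + 2 * k
    bic = n_obs * np.log(rss / n_obs) + k * np.log(n_obs)
    return float(aic), float(bic)


results = {}
for label, estimator in models.items():
    # Standardize first so the penalty treats all predictors on one scale.
    pipe = make_pipeline(StandardScaler(), estimator).fit(X_tr, y_tr)
    pred = pipe.predict(X_te)
    coef = pipe[-1].coef_
    k = int(np.count_nonzero(coef)) + 1  # nonzero coefficients + intercept
    aic, bic = information_criteria(y_te, pred, k)
    results[label] = {
        "rmse": float(np.sqrt(mean_squared_error(y_te, pred))),
        "r2": float(r2_score(y_te, pred)),
        "mae": float(mean_absolute_error(y_te, pred)),
        "aic": aic,
        "bic": bic,
        "kept": [nm for nm, c in zip(names, coef) if c != 0.0],
    }
    print(label, results[label])
```

The `kept` list shows which predictors each model retains. Ridge only shrinks coefficients and so keeps every predictor, while Lasso and Elastic Net can set some coefficients exactly to zero, which is the variable-elimination behavior the abstract attributes to Lasso. The numbers produced here come from synthetic data and are not comparable to the values reported in the abstract.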
© The Authors, published by EDP Sciences, 2026
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

