Abstract: The soil water characteristic curve (SWCC) is fundamental for studying the permeability, strength prediction, and constitutive relationships of unsaturated soils. Machine learning algorithms are efficient at processing large datasets and extracting features. This study used six machine learning algorithms (four ensemble learning algorithms and two traditional machine learning algorithms) to simulate 154 SWCCs comprising 1976 data points from the United States Unsaturated Soil Database. Four performance evaluation indicators (R², EVS, MAE, and RMSE) were used to assess the algorithms' performance. Two data input methods were compared: one with logarithmic transformation of matric suction, and the other without any transformation. The results showed that the choice of input method had minimal effect on the four ensemble algorithms (LightGBM, XGB, RF, and AdaBoost). However, the two traditional machine learning algorithms, GPR and SVM, were significantly affected: without logarithmic transformation, their R² decreased noticeably, and in some cases the SWCC could not be simulated. Additionally, LightGBM outperformed the other models in simulating the SWCC for the test set, with higher trend evaluation indicators (R² and EVS) and lower error measurement indicators (MAE and RMSE). The six algorithms ranked as follows in SWCC simulation performance, from best to worst: LightGBM, GPR, XGB, RF, AdaBoost, and SVM. Finally, the trained LightGBM model was used to predict 9 SWCC datasets not included in the original database. The results showed that LightGBM could effectively predict the soil water characteristics of unsaturated soils. The findings provide important guidance for improving SWCC predictions for different types of soils.
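
For readers unfamiliar with the two input methods and the four evaluation indicators named above, the following is a minimal sketch of how they could be computed. It is not the code used in this study: the function names `log_suction` and `evaluate` and the `floor` guard against non-positive suction values are hypothetical, and the indicators follow their standard textbook definitions.

```python
import math

def log_suction(suction, floor=1e-3):
    """Return log10 of each matric suction value.

    `floor` is a hypothetical guard (not from the study) that keeps
    zero or negative readings from breaking the logarithm.
    """
    return [math.log10(max(s, floor)) for s in suction]

def evaluate(y_true, y_pred):
    """Compute R2, EVS, MAE and RMSE from measured and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_res = sum(residuals) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    ss_res = sum(r ** 2 for r in residuals)              # residual sum of squares
    ss_var = sum((r - mean_res) ** 2 for r in residuals) # residual variance (times n)
    return {
        "R2": 1 - ss_res / ss_tot,      # penalizes both scatter and bias
        "EVS": 1 - ss_var / ss_tot,     # ignores a constant bias in the residuals
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": math.sqrt(ss_res / n),
    }
```

Under these definitions, R² and EVS coincide when the mean residual is zero, so a gap between them indicates a systematic offset in the predictions rather than random scatter.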