Abstract: Studying the spatial variability and distribution of soil properties is important for understanding ecosystems, formulating agricultural policy, managing soils, and monitoring environmental changes caused by land use. This paper evaluates how accurately the Random Forest (RF) model predicts the spatial distribution of soil properties at the provincial scale. Anhui Province in East China was selected as the study area. Soil data came from the Second National Soil Survey and from sampling in 2010-2011, environmental variables were derived with GIS spatial analysis, and the relationships between environmental factors and soil properties were analyzed with the RF model. During RF modeling, the soil organic carbon (SOC) prediction model was most robust and most accurate with mtry = 1 and ntree = 1,000; the soil bulk density (BD) model performed best with mtry = 1 and ntree = 1,000, and the clay content model performed best with mtry = 1 and ntree = 100. Elevation, normalized difference vegetation index (NDVI), landform, multi-resolution index of valley bottom flatness (MrVBF), and soil type were the most important predictors of SOC content; landform, mean annual precipitation (MAP), MrVBF, elevation, and soil type were the most important predictors of soil BD; and elevation, MAP, MrVBF, and plan curvature were the most important predictors of clay content. The RF model can be used for spatial prediction of soil properties and has an advantage in handling qualitative variables such as soil type and landform. The combinations of multi-source environmental variables explained 26% of the variation in SOC content, 23% in soil BD, and 22% in clay content. Machine learning is more efficient than traditional methods for predicting soil properties and for digital soil mapping, so the RF model is a useful tool for spatial prediction of soil properties over large areas.
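The "explained" percentages reported above can be read as a proportion of variance explained. The abstract does not state how they were computed, so the sketch below assumes a common convention for regression forests: one minus the mean squared prediction error divided by the population variance of the observations. The function name `variance_explained` and the sample numbers are hypothetical and serve only as illustration; they are not data from the study.

```python
from statistics import fmean


def variance_explained(observed, predicted):
    """Return 1 - MSE / Var(observed), using the population variance.

    Assumed definition; the study may have used a different one.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    # Mean squared error between observations and model predictions.
    mse = fmean((o - p) ** 2 for o, p in zip(observed, predicted))
    # Population variance of the observations (divide by n, not n - 1).
    var = fmean((o - mean_obs) ** 2 for o in observed)
    if var == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - mse / var


# Made-up values for illustration only, not measurements from the paper.
toy_observed = [10.0, 12.0, 14.0, 16.0]
toy_predicted = [11.0, 12.0, 14.0, 15.0]
toy_score = variance_explained(toy_observed, toy_predicted)  # about 0.9
```

Under this definition, a value of 0.26 would mean that the model's prediction error is 26% smaller than that of always predicting the mean of the observations. A model that predicts worse than the mean yields a negative value.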