TROPICAL GEOGRAPHY ›› 2018, Vol. 38 ›› Issue (3): 432-439. doi: 10.13284/j.cnki.rddl.003044

Saltwater Intrusion Forecasting Based on Random Forest

SU Chengjia, CHEN Sha and CHEN Xiaohong

  1. (Center for Water Resources and Environment//Key Laboratory of Water Cycle and Water Security in Southern China of Guangdong High Education Institute//Guangdong Engineering Technology Research Center of Water Security Regulation and Control for Southern China, Sun Yat-sen University, Guangzhou 510275, China)
  • Online: 2018-05-05 Published: 2018-05-05

Abstract: Saltwater intrusion forecasting is a key measure for ensuring the safety of water supply in coastal areas that are seriously affected by saltwater intrusion. The purpose of this study was to test and verify the feasibility of random forest (RF) for saltwater intrusion forecasting. The predictors used for model construction were selected from a set of candidate predictors using the Pearson correlation coefficient, and the lag time between salinity and each predictor was identified using the RF variable importance index. Taking Dachongkou, a station in the lower reaches of the Modaomen waterway, as a case study, an RF-based saltwater intrusion forecast model was constructed with the selected predictors and applied to dry-season saltwater intrusion forecasting. The results showed that, among the 9 candidate predictors, 6 were significantly correlated with salinity: low tide level at Denglongshan, flow at Wuzhou, water level at Wuzhou, tidal range at Denglongshan, water level at Shijiao, and flow of Shijiao and Wuzhou; all of them passed the significance test at the 0.05 level. Of the 9 candidate predictors, the low tide level at Denglongshan had the highest correlation coefficient with salinity. The remaining candidate predictors, i.e., rainfall, high tide level at Denglongshan, and flow at Shijiao, did not pass the significance test at the 95% confidence level; that is, their correlations with salinity were not significant. The RF importance index showed a lag time of 1 to 3 days between salinity and the 6 predictors that passed the significance test at the 0.05 level.
The RF model that accounted for the lag time between salinity and the predictors not only satisfied the precision requirements of saltwater intrusion forecasting but also outperformed the model that ignored the lag time: the Nash efficiency coefficient increased by 0.55, the coefficient of determination increased by 0.33, and the average relative error decreased by 26.7%. Considering the lag effects between salinity and the predictors therefore significantly improves model performance and makes the model more practical. In addition, the RF model with lag effects also outperformed the traditional Markov chain: the average relative error decreased by 80.4%, and the coefficient of determination and the Nash efficiency coefficient increased by 0.03 and 0.88, respectively. The advantage of the RF model in mid-term and long-term salinity forecasting arises because the correlation between salinity and the predictors is stronger than the correlation within the salinity series itself. This study can provide further technical support for saltwater intrusion forecasting in coastal areas.

Key words: saltwater intrusion forecast, predictor, lag time, random forest, Modaomen waterway
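
The abstract compares models by the Nash efficiency coefficient, the coefficient of determination, and the average relative error, but does not state their formulas. The following is a minimal sketch of how these three scores are commonly computed, assuming the usual definitions (Nash-Sutcliffe efficiency, squared Pearson correlation, and mean absolute relative error in percent); the paper's exact definitions may differ. The function names and the salinity values are hypothetical and are not taken from the study.

```python
def nash_efficiency(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared forecast
    errors to the squared deviations of observations from their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def determination_coefficient(observed, simulated):
    """Squared Pearson correlation between observed and simulated series
    (an assumed definition; other variants of R^2 exist)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


def mean_relative_error(observed, simulated):
    """Mean of |simulated - observed| / |observed|, in percent.
    Assumes no observation is exactly zero."""
    total = sum(abs(s - o) / abs(o) for o, s in zip(observed, simulated))
    return 100.0 * total / len(observed)


# Hypothetical daily salinity values, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 190.0, 310.0, 390.0]

print("Nash efficiency:", nash_efficiency(observed, simulated))
print("Coefficient of determination:", determination_coefficient(observed, simulated))
print("Mean relative error (%):", mean_relative_error(observed, simulated))
```

Under these definitions a perfect forecast gives a Nash efficiency of 1, a coefficient of determination of 1, and a relative error of 0, so the reported gains (for example, +0.55 in Nash efficiency) can be read as movement toward those limits.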