Tropical Geography, 2021, Vol. 41, Issue (4): 845-856. doi: 10.13284/j.cnki.rddl.003372

• Methodological Research •

A Multi-Source Precipitation Data Fusion Method Based on XGBoost

Junmin Zhang1,2,3, Huihua Ruan4, Jianhui Xu2,3, Xiao'ai Dai1, Yanping Zheng4, Jinbiao Zhang4

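  1. School of Earth Sciences, Chengdu University of Technology, Chengdu 610059
  2. Guangzhou Institute of Geography, Guangdong Academy of Sciences//Key Laboratory of Guangdong for Utilization of Remote Sensing and Geographical Information System//Guangdong Engineering Laboratory for Geographic Spatio-temporal Big Data, Guangzhou 510070
  3. Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou 511458
  4. Guangdong Meteorological Observation Data Center, Guangzhou 510080
  • Received: 2020-10-08 Revised: 2021-03-05 Online: 2021-08-16 Published: 2021-08-16
  • Corresponding author: Huihua Ruan E-mail: zhangjunmin130@163.com; ruanhuihua@163.com
  • About the author: Junmin Zhang (born 1995), male, from Chengdu, master's student, mainly engaged in research on machine-learning-based multi-source data fusion. E-mail: zhangjunmin130@163.com
  • Funding:
    Guangdong Provincial Science and Technology Program (2018B020207012); National Natural Science Foundation of China (41901371); Guangdong Program for Introducing Innovative and Entrepreneurial Teams (2016ZT06D336)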

An XGBoost-Merging Method for High-Resolution Daily Precipitation Estimation for a Regional Rainstorm Event

Junmin Zhang1,2,3, Huihua Ruan4, Jianhui Xu2,3, Xiao'ai Dai1, Yanping Zheng4, Jinbiao Zhang4

  1. School of Earth Sciences, Chengdu University of Technology, Chengdu 610059, China
  2. Key Laboratory of Guangdong for Utilization of Remote Sensing and Geographical Information System//Guangdong Engineering Laboratory for Geographic Spatio-temporal Big Data//Guangzhou Institute of Geography, Guangdong Academy of Sciences, Guangzhou 510070, China
  3. Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou 511458, China
  4. Guangdong Meteorological Observation Data Center, Guangzhou 510080, China
  • Received: 2020-10-08 Revised: 2021-03-05 Online: 2021-08-16 Published: 2021-08-16
  • Contact: Huihua Ruan E-mail: zhangjunmin130@163.com; ruanhuihua@163.com

Abstract (translated from the Chinese):

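Rain-gauge observations alone cannot accurately capture the spatiotemporal distribution and variation of precipitation, while radar precipitation estimates have low accuracy over complex terrain. To make the most of the strengths of both, this study takes the mountainous area of northern Guangdong Province as the study area and a rainstorm event from 26 to 30 August 2018 as the study case. Drawing on auxiliary land-surface parameters such as topography, distance from the coastline, vegetation index, and longitude and latitude, it analyzes how gauge precipitation correlates with these auxiliary parameters and with radar precipitation, and builds a gauge-radar daily precipitation fusion model using the XGBoost algorithm and kriging interpolation, yielding a fused daily precipitation dataset at 1 km spatial resolution. In addition, multiple linear regression (LM) combined with kriging interpolation was used to fuse the gauge and radar daily precipitation data, and gauge precipitation data were used to validate the accuracy of the XGBoost and LM fusion results. The results show that: 1) gauge precipitation is significantly positively correlated with radar precipitation, and the correlation between gauge precipitation and the auxiliary land-surface parameters varies over time; 2) the prediction accuracy of XGBoost is higher overall than that of LM; after model residual correction, the XGBoost fusion model is more accurate overall than the LM fusion model, because XGBoost captures the relationships between gauge precipitation and the auxiliary parameters and radar precipitation better than LM does.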
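
The combination of a regression model with kriging and a subsequent residual correction is commonly written as a regression-kriging decomposition. The form below is a generic sketch under that assumption, not necessarily the exact formulation used in the paper; the symbols (gauge locations s_i, target location s_0, covariate vector x, trained regressor f-hat, gauge precipitation P_g, kriging weights lambda_i) are introduced here for illustration only.

```latex
\hat{P}(s_0) = \hat{f}\bigl(\mathbf{x}(s_0)\bigr) + \hat{e}(s_0),
\qquad
e(s_i) = P_{\mathrm{g}}(s_i) - \hat{f}\bigl(\mathbf{x}(s_i)\bigr),
\qquad
\hat{e}(s_0) = \sum_{i=1}^{n} \lambda_i \, e(s_i)
```

Here the regressor (XGBoost or LM) supplies the trend from radar precipitation and the auxiliary parameters, and the kriged gauge residuals supply the correction added to it.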

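Keywords (translated from the Chinese): multi-source precipitation, data fusion, radar precipitation, XGBoost algorithm, northern Guangdong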

Abstract:

Precipitation is a vital physical parameter of the earth surface system, and accurate estimation of its spatiotemporal patterns is essential for flood disaster monitoring, drought monitoring, and water management. However, regional precipitation derived solely from rain gauges, remote sensing, or weather radar is subject to large uncertainties, especially in topographically complex mountain areas. Multi-source precipitation data fusion is a practical way to obtain high-accuracy, high-resolution precipitation information. This study proposes an XGBoost-based geostatistical fusion method (XGBoost) that combines ground-based measurements, radar precipitation, and other auxiliary parameters to improve the accuracy of the spatiotemporal distribution of precipitation in geographically complex mountain areas. In the XGBoost-based geostatistical fusion model, radar precipitation and terrestrial parameters (longitude, latitude, digital elevation model data, aspect, slope, enhanced vegetation index, and distance from the coastline) are the independent variables. The model was applied to a regional rainstorm event that lasted from August 26th to 30th, 2018, in northern Guangdong, using daily measurements from 206 rain gauges and 51 stations for model training and validation. The fused results were compared with those obtained from the multiple linear regression kriging method (LM). The data fusion methods were validated against ground-based precipitation measurements using the coefficient of determination (R²), Mean Absolute Error (MAE), and Root-Mean-Square Error (RMSE).
The experimental results indicated that: (1) The ground-based precipitation data were positively associated with radar precipitation, and the correlation coefficients between the ground-based precipitation data and the terrestrial parameters varied significantly with measurement time over the rainstorm event. (2) Before residual correction, XGBoost produced 1-km precipitation predictions with higher accuracy than the LM. (3) After residual correction, the accuracy of the fused precipitation from the XGBoost-based geostatistical method decreased, whereas that of the LM increased. The XGBoost-based geostatistical method produced 1-km precipitation with lower accuracy than the TsHARP utility on August 27th and 29th; in general, however, it outperformed the LM because XGBoost accounts for the nonlinear relationships between the ground-based precipitation data and the independent variables. (4) The XGBoost-based geostatistical method captured the differences in precipitation among land cover patterns and reproduced the spatial details of fused precipitation over the complex mountain areas.
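
The three validation scores named in the abstract can be illustrated with a short, dependency-free Python sketch. The function name `validation_metrics` and the sample values are hypothetical. R² is computed here as 1 − SS_res/SS_tot, which is an assumption: the paper may instead define it as the squared Pearson correlation.

```python
import math


def validation_metrics(observed, predicted):
    """Return (r2, mae, rmse) for paired gauge observations and predictions.

    Assumption: R2 is the coefficient of determination 1 - SS_res / SS_tot,
    not the squared correlation coefficient.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Mean Absolute Error: average magnitude of the prediction errors.
    mae = sum(abs(e) for e in errors) / n

    # Root-Mean-Square Error: penalizes large errors more strongly than MAE.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination relative to the mean of the observations.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot

    return r2, mae, rmse


# Hypothetical daily precipitation values (mm) at four validation stations.
gauge_mm = [10.0, 20.0, 30.0, 40.0]
fused_mm = [12.0, 18.0, 33.0, 39.0]
r2, mae, rmse = validation_metrics(gauge_mm, fused_mm)
print(f"R2={r2:.3f}  MAE={mae:.3f} mm  RMSE={rmse:.3f} mm")
```

Applying the same function to the XGBoost and LM outputs at the validation stations, day by day, would give a like-for-like comparison of the kind the abstract reports.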

Keywords: multi-source precipitation, data fusion, radar precipitation, XGBoost, northern Guangdong Province

CLC number (Chinese Library Classification):

  • P407