Assessment of Direct Economic Loss Levels Caused by Tropical Cyclone Disasters at the County Level Using Machine Learning: A Case Study of Fujian Province
Shao Jingyan (邵婧妍, b. 1999), female, from Cangzhou, Hebei; master's student whose research focuses on tropical cyclone disaster assessment. E-mail: 202121051221@mail.bnu.edu.cn
Received date: 2023-12-07
Revised date: 2024-02-26
Online published: 2024-06-13
Funding
National Key Research and Development Program of China (2022YFC3006404-02)
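To explore the role of machine learning models in tropical cyclone disaster loss assessment, this study used county-level tropical cyclone disaster loss data for Fujian Province from 2009 to 2020 and applied five algorithms, namely the Light Gradient Boosting Machine (LightGBM), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Back-Propagation Neural Network (BP), to optimize the parameters of direct economic loss level assessment models, which were then validated against different tropical cyclone events. The results show that: the LightGBM-based model performed best, with accuracy, precision, recall, and F1 score (the harmonic mean of precision and recall) all above 79%, indicating good generalization ability; maximum hourly rainfall and 3-s extreme gust wind speed were the two most important hazard indicators, and fixed capital stock was a more important indicator than GDP; and a comparison across tropical cyclone events with four landfall locations/tracks and two wind-rain intensities showed that the assessed loss levels agreed well with the actual ones, so the model has good applicability.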
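Shao Jingyan, Fang Weihua. Assessment of Direct Economic Loss Levels Caused by Tropical Cyclone Disasters at the County Level Using Machine Learning: A Case Study of Fujian Province [J]. Tropical Geography, 2024, 44(6): 1064-1078. DOI: 10.13284/j.cnki.rddl.20230962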
China is frequently affected by tropical cyclones, which can cause severe economic losses. Rapid disaster loss assessment is crucial for effective emergency response. The factors that affect tropical cyclone disaster losses can be broadly categorized into hazard, exposure, and vulnerability. Traditional statistical methods have been the main tools for disaster loss assessment. To explore the potential of machine learning models, we evaluated five algorithms: the Light Gradient Boosting Machine (LightGBM), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and Back-Propagation Neural Network (BP). The maximum gust wind speed and rainfall of tropical cyclones were selected to represent hazards, fixed capital stock data were used to value exposure, and the GDP of each county was collected to reflect capacity or vulnerability. In addition, river network density data were used as a simple proxy for the contribution of flooding induced by tropical cyclone rainfall. The relationship between these input variables and county-level disaster loss was developed using data from 81 tropical cyclone events that occurred from 2009 to 2020 in Fujian Province. Model performance was compared using accuracy, precision, recall, and F1 score. The accuracies of the LightGBM, RF, XGBoost, SVM, and BP models were 0.7946, 0.7726, 0.7628, 0.2518, and 0.2681, respectively. The main findings are as follows: (1) The ensemble learning algorithms (RF, XGBoost, and LightGBM) outperformed the individual classifiers (BP and SVM). The LightGBM model performed best, with accuracy, precision, recall, and F1 score all above 79%. (2) Maximum hourly rainfall and maximum wind gust were the two most important loss-inducing factors, and fixed capital stock was a better proxy for disaster exposure than GDP. (3) The modeled loss levels were consistent with the actual loss levels for different but typical tropical cyclone events, indicating that the model can be applied to future tropical cyclone events affecting Fujian Province. However, this study has some limitations. First, some natural hazards, such as floods, storm surges, and waves, were not fully considered, which introduces uncertainty into the model results. Second, emergency response capacity and the actions actually taken may have varied dramatically among counties, but were neglected because data were unavailable. Future work should obtain additional hazard and vulnerability variables to extend the model inputs. In addition, whether model parameters trained with data from Fujian Province can be applied to other provinces remains unaddressed. Developing an operational model for the whole of coastal China will require long-term county-level data for all typhoon-prone areas in China.
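The four evaluation metrics named in the abstract can be illustrated with a short sketch. The source does not say how per-class precision and recall were averaged across loss levels; the sketch below assumes support-weighted averaging, under which weighted recall is mathematically equal to accuracy (the same pattern as the identical accuracy and recall values in Table 2). The function name `weighted_metrics`, the four-level scale, and the ten county labels are hypothetical.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision and recall."""
    n = len(y_true)
    support = Counter(y_true)        # true samples per class
    pred_count = Counter(y_pred)     # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = 0.0
    recall = 0.0
    for cls, n_cls in support.items():
        weight = n_cls / n
        # Per-class precision is taken as 0 when the class is never predicted.
        p_cls = correct[cls] / pred_count[cls] if pred_count[cls] else 0.0
        r_cls = correct[cls] / n_cls
        precision += weight * p_cls
        recall += weight * r_cls
    return accuracy, precision, recall

# Hypothetical loss levels (1 = lowest, 4 = highest) for ten counties.
actual_levels = [1, 1, 2, 2, 2, 3, 3, 4, 4, 4]
predicted_levels = [1, 2, 2, 2, 3, 3, 3, 4, 4, 1]

acc, prec, rec = weighted_metrics(actual_levels, predicted_levels)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

Because each class's recall is weighted by its share of the samples, the weights cancel the per-class denominators and the sum reduces to the overall fraction of correct predictions.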
Fig.1 Typhoon "Soulik (201307)" maximum wind speed (3-second gust) distribution (a), hourly rainfall distribution (UTC: 2013-07-14T03:00) (b), fixed capital stock distribution (c) and GDP distribution (d) in Fujian Province in 2013, and river network density distribution (e)
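Table 1 Parameters of the five machine learning algorithms and their effects on model performance
Model | Parameter | Description | Effect on the model |
---|---|---|---|
LightGBM | n_estimators | Number of trees | Larger values improve performance but lengthen computation time |
LightGBM | max_depth | Maximum tree depth | Larger values improve performance, but overly large values cause overfitting |
LightGBM | num_leaves | Number of leaf nodes per decision tree | Larger values raise accuracy but increase the risk of overfitting |
LightGBM | learning_rate | Learning rate | Larger values speed up iteration but increase the risk of overfitting |
RF | n_estimators | Number of trees | Larger values improve performance but lengthen computation time |
RF | max_depth | Maximum tree depth | Larger values improve performance, but overly large values cause overfitting |
RF | min_samples_leaf | Minimum number of samples in a leaf node | Larger values simplify the model, but overly large values cause underfitting |
RF | min_samples_split | Minimum number of samples required to split a node | Larger values simplify the model, but overly large values cause underfitting |
XGBoost | n_estimators | Number of trees | Larger values improve performance but lengthen computation time |
XGBoost | max_depth | Maximum tree depth | Larger values improve performance, but overly large values cause overfitting |
XGBoost | min_child_weight | Minimum sum of sample weights | Larger values simplify the model, but overly large values cause underfitting |
XGBoost | gamma | Minimum loss reduction required to split a node | Larger values simplify the model, but overly large values cause underfitting |
SVM | C | Penalty coefficient | Larger values increase model complexity |
SVM | gamma | Range of the kernel function | Larger values increase the risk of overfitting |
BP | max_iter | Maximum number of iterations | Larger values improve results but lengthen computation time |
BP | hidden_layer_sizes | Number of hidden-layer neurons | Larger values improve results, but overly large values cause overfitting |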
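As a rough illustration of what tuning the Table 1 parameters involves, the sketch below enumerates a search grid over the four LightGBM parameters. The source does not state which search procedure or candidate values were used; exhaustive grid enumeration is an assumption here, and every candidate value, as well as the helper `expand_grid`, is hypothetical. The only filter applied is a structural one: a binary tree of depth d has at most 2^d leaves, so combinations whose `num_leaves` exceeds that bound add nothing new.

```python
from itertools import product

# Parameter names come from Table 1; the candidate values are hypothetical.
lightgbm_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [4, 6, 8],
    "num_leaves": [15, 31, 63],
    "learning_rate": [0.01, 0.05, 0.1],
}

def expand_grid(grid):
    """Yield one dict per combination of candidate values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

candidates = list(expand_grid(lightgbm_grid))

# Drop combinations where the leaf count cannot be reached at the given depth.
feasible = [c for c in candidates if c["num_leaves"] <= 2 ** c["max_depth"]]

print(len(candidates), "combinations in the full grid")
print(len(feasible), "combinations after the depth/leaf filter")
```

Each remaining combination would then be scored, for example by cross-validation, and the best-scoring one retained; that scoring step is omitted because it depends on data and libraries not described in this section.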
Table 2 Performance comparison of county-level direct economic loss level assessment models for tropical cyclone disasters built with five machine learning algorithms
Model | Accuracy | Precision | Recall | F1 score |
---|---|---|---|---|
LightGBM | 0.7946 | 0.7982 | 0.7946 | 0.7963 |
RF | 0.7726 | 0.7801 | 0.7726 | 0.7763 |
XGBoost | 0.7628 | 0.7650 | 0.7628 | 0.7639 |
SVM | 0.2518 | 0.5537 | 0.2518 | 0.3462 |
BP | 0.2681 | 0.3284 | 0.2681 | 0.2952 |
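The abstract defines the F1 score as the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes F1 from the precision and recall values in Table 2 and compares the result with the reported F1 column. The numbers are copied from the table; the variable and function names are illustrative.

```python
# (precision, recall, reported F1) for each model, as listed in Table 2.
table2 = {
    "LightGBM": (0.7982, 0.7946, 0.7963),
    "RF": (0.7801, 0.7726, 0.7763),
    "XGBoost": (0.7650, 0.7628, 0.7639),
    "SVM": (0.5537, 0.2518, 0.3462),
    "BP": (0.3284, 0.2681, 0.2952),
}

def harmonic_mean(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

for model, (p, r, f1_reported) in table2.items():
    f1 = harmonic_mean(p, r)
    print(f"{model:9s} computed={f1:.4f} reported={f1_reported:.4f}")
```

The recomputed values agree with the reported ones to within rounding of the four-decimal inputs. The SVM row shows how the harmonic mean penalizes imbalance: a precision of 0.5537 paired with a recall of 0.2518 gives an F1 of about 0.35, well below their arithmetic mean.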
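Table 3 Performance of the LightGBM-based assessment model for tropical cyclones with four landfall locations/tracks
No. | Name | Landfall location | Track | Intensity at landfall | Assessment accuracy |
---|---|---|---|---|---|
0903 | Linfa | Jinjiang, Fujian | Northward from the South China Sea | Tropical storm | 0.64 |
1205 | Talim | - | Northward offshore through the Taiwan Strait | - | 0.64 |
1702 | Merbok | Shenzhen, Guangdong | Northward from the South China Sea | Tropical storm | 0.65 |
1911 | Bailu | Pingtung County, Taiwan; Dongshan County, Fujian | Northwestward | Severe tropical storm | 0.86 |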
Fig.7 Actual loss level and predicted loss level of Typhoon "Linfa (0903)" (a, b), Typhoon "Talim (1205)" (c, d), Typhoon "Merbok (1702)" (e, f), and Typhoon "Bailu (1911)" (g, h)
Fig.8 Typhoon "Talim (1205)" (a, c, e, g) and Typhoon "Morakot (0908)" (b, d, f, h): cumulative rainfall distribution, maximum wind speed (3-second gust) distribution, actual loss level, and predicted loss level
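Shao Jingyan: data processing, implementation of the machine learning algorithms, figure preparation, and manuscript writing;
Fang Weihua: provided or coordinated the data used in this study, proposed the research topic and approach, and guided revision of the manuscript.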
Zhang Yunxia and colleagues at the National Disaster Reduction Center of China, Ministry of Emergency Management, provided the disaster loss data, and Zhang Haixia and colleagues at Beijing Normal University provided the fixed capital stock data.