Abstract:
Three-dimensional scene understanding of remote sensing images can be decoupled into two pixel-level tasks: semantic segmentation and height estimation. Existing methods typically address these tasks independently, neglecting their inherent correlations and thus failing to fully leverage complementary information to improve multi-task performance. This paper proposes a unified multi-task learning network for joint semantic segmentation and height estimation from monocular remote sensing images. The network consists of: a shared backbone that extracts the feature information required by both tasks; a context recombination module that adaptively recombines high-level semantic information according to object-scale patterns to enrich feature representations; a cross-task gating interaction module that alleviates semantic ambiguities caused by similar spectral signatures through feature interaction between the tasks; and a multi-task decoder that outputs the final segmentation map and height estimate. The proposed method achieves superior performance on the ISPRS Vaihingen test set: an overall accuracy of 91.16% and a mean intersection over union (mIoU) of 82.88% for semantic segmentation, and a relative error of 0.261 and a root mean square error of 1.283 for height estimation. Multi-task performance is significantly improved, yielding more complete object boundaries and smoother height regression results.
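The shared-backbone, two-head structure described above can be sketched minimally as follows. This is an illustrative NumPy sketch only, not the paper's implementation: the tensor shapes, the class count, and the 1x1 channel-mixing "heads" (`W_seg`, `W_height`) are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a B x C x H x W feature map from a shared backbone.
B, C, H, W = 1, 16, 8, 8
NUM_CLASSES = 6  # e.g. the six ISPRS Vaihingen categories (assumed)

features = rng.standard_normal((B, C, H, W))

# Task-specific 1x1 convolution heads modeled as channel-mixing matrices
# (placeholder random weights, for shape illustration only).
W_seg = 0.1 * rng.standard_normal((NUM_CLASSES, C))
W_height = 0.1 * rng.standard_normal((1, C))

# Segmentation head: per-pixel class logits -> label map via argmax.
seg_logits = np.einsum('kc,bchw->bkhw', W_seg, features)
seg_map = seg_logits.argmax(axis=1)  # B x H x W integer labels

# Height head: per-pixel scalar regression, ReLU keeps heights non-negative.
height = np.maximum(np.einsum('kc,bchw->bkhw', W_height, features), 0.0)[:, 0]

print(seg_map.shape, height.shape)  # (1, 8, 8) (1, 8, 8)
```

Both heads read the same shared features, which is what makes cross-task interaction (e.g. gated feature exchange between the segmentation and height branches) possible in the full network.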