Abstract
With the rapid development of remote sensing technology, high-precision feature extraction from remote sensing images has become increasingly important in fields such as geographic information science, urban planning, and environmental monitoring. However, traditional image-only extraction methods often have limited accuracy when faced with complex and variable surface features, making it difficult to meet diverse application needs. To address this issue, this paper proposes a novel multimodal semantic segmentation framework for remote sensing images, MMRSSEG, which combines visual and textual information using deep learning techniques to achieve high-precision interpretation of remote sensing images. A series of experiments on a remote sensing building dataset shows that MMRSSEG significantly improves the accuracy of pixel-level feature extraction compared with traditional image segmentation methods. In the building recognition task, the method outperforms traditional unimodal algorithms. These results demonstrate the effectiveness and application prospects of incorporating textual information in remote sensing image segmentation.
Authors
DONG Sijun (董思俊)
MENG Xiaoliang (孟小亮)
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430000, China
Source
Spacecraft Recovery & Remote Sensing (《航天返回与遥感》)
Indexed in: CSCD; Peking University Core Journals
2024, Issue 3, pp. 82-91 (10 pages)
Funding
National Natural Science Foundation of China (41971352).
Keywords
remote sensing imagery
building extraction
multimodal information fusion
deep learning
remote sensing foundation model