Segmentation of aerial image with multi-scale feature and attention model

NING Qian; HU Shi-yu; LEI Yin-jie; CHEN Bing-cai

Cite this article: NING Qian, HU Shi-yu, LEI Yin-jie, CHEN Bing-cai. Segmentation of aerial image with multi-scale feature and attention model[J]. Control Theory & Applications, 2020, 37(6): 1218-1224.

Received: 2019-03-12. Revised: 2019-12-05.

DOI: 10.7641/CTA.2019.90133

2020,37(6):1218-1224

Keywords: aerial image segmentation; building pixel labeling; fully convolutional neural networks; attention mechanism; multi-scale features

Funding: Supported by the National Natural Science Foundation of China (61771089).

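Author	Affiliation	E-mail
NING Qian	College of Electronics and Information Engineering, Sichuan University	ningq@scu.edu.cn
HU Shi-yu	College of Electronics and Information Engineering, Sichuan University
LEI Yin-jie^*	College of Electronics and Information Engineering, Sichuan University	yinjie@scu.edu.cn
CHEN Bing-cai	School of Computer Science and Technology, Dalian University of Technology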

Abstract

Neural networks can segment aerial images automatically by labeling building pixels, but the resulting segmentation boundaries are often blurred, which degrades the result. To address this, this paper adopts two fully convolutional networks (FCNs), U-Net and FCN-8s, as base models and trains them end to end. On this basis, a network model that combines a fully convolutional network with multi-scale features and an attention mechanism is built to sharpen the segmentation boundaries. The proposed model is compared with the base models: the correlation and similarity between the predictions and the ground truth (GT) are analyzed, and the predicted segmentations are compared. The experiments show that the model combining multi-scale features and the attention mechanism produces clearer boundaries and handles boundary detail better. Compared with a fully convolutional network of the same training scale, its Intersection over Union (IoU) is 2% higher and its Dice coefficient is 3% higher.
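
The two metrics reported in the abstract, IoU and the Dice coefficient, can be illustrated on a toy building mask. The sketch below is not the authors' evaluation code: it assumes the standard definitions IoU = TP / (TP + FP + FN) and Dice = 2TP / (2TP + FP + FN) for binary masks. The function name `iou_and_dice`, the example masks, and the handling of two empty masks are illustrative assumptions.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks given as equally sized 2-D lists of 0/1.

    Assumes 1 marks a building pixel and 0 marks background.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # building pixel predicted correctly
            elif p:
                fp += 1      # predicted building, actually background
            elif t:
                fn += 1      # missed building pixel
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty: treated as a perfect match here
        # (a convention chosen for this sketch, not taken from the paper).
        return 1.0, 1.0
    iou = tp / union
    dice = 2 * tp / (2 * tp + fp + fn)
    return iou, dice


# Hypothetical 3x4 masks: ground truth and a prediction with one
# spurious pixel and one missed pixel along the building boundary.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]

iou, dice = iou_and_dice(pred, truth)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")  # IoU = 0.600, Dice = 0.750
```

Both metrics are computed from the same pixel counts and are related by Dice = 2·IoU / (1 + IoU). In this example the two boundary errors alone pull IoU down to 0.6, which shows how strongly both scores respond to boundary quality on small objects.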