A novel multiplex rotational attention-based network for point cloud registration and place recognition

SHI Cheng-hao; CHEN Xie-yuan-li; GUO Rui-bin; XIAO Jun-hao; DAI Bin; LU Hui-min

Cite this article:	SHI Cheng-hao, CHEN Xie-yuan-li, GUO Rui-bin, XIAO Jun-hao, DAI Bin, LU Hui-min. A novel multiplex rotational attention-based network for point cloud registration and place recognition[J]. Control Theory & Applications, 2023, 40(12): 2187-2197.

Received: 2023-05-02 Revised: 2023-11-28

DOI: 10.7641/CTA.2023.30286

2023,40(12):2187-2197

Keywords: autonomous vehicles; 3D point cloud registration; deep learning; place recognition; loop closing

Funding: Supported by the National Natural Science Foundation of China (U1913202, U22A2059, U1813205) and the Natural Science Foundation of Hunan Province (2021JC0004, 2021JJ40677).

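Author	Affiliation	E-mail
SHI Cheng-hao	National University of Defense Technology	shichenghao17@nudt.edu.cn
CHEN Xie-yuan-li	National University of Defense Technology
GUO Rui-bin	National University of Defense Technology
XIAO Jun-hao	National University of Defense Technology
DAI Bin	Academy of Military Sciences
LU Hui-min*	National University of Defense Technology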

Abstract

Point cloud registration and place recognition are key technologies for autonomous localization of mobile robots and self-driving vehicles. Few existing methods achieve efficient place recognition while also outputting an accurate 6-degree-of-freedom (6-DoF) pose. This paper proposes a novel multi-head network that addresses both tasks. The network first extracts sparse but discriminative points in a shared backbone, then solves point cloud registration in a dense point matching head and place recognition in a global descriptor head. In the backbone, a 3D-RoFormer mechanism explicitly encodes the relative pose information between feature points at low computational and memory cost, yielding more discriminative and robust point features and improving the network's representational power. The dense point matching head first establishes reliable correspondences between sparse points, then uses them to determine dense point correspondences in a coarse-to-fine manner, which refines the pose estimate. The global descriptor head compresses the sparse feature points and their feature vectors into a global descriptor of the point cloud, enabling efficient place recognition. To validate the method and evaluate its performance, experiments were conducted on datasets collected by different sensors in different environments. The results show that the method generalizes well on all tested datasets and performs better than or comparably to current state-of-the-art methods, reducing the registration error by about 27% for consecutive point clouds and by about 37% for loop-closure point clouds.
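
The abstract does not spell out how 3D-RoFormer is built, so the following is only a minimal sketch of the general idea behind rotary position encoding, under stated assumptions: pairs of feature channels are rotated by angles that are a linear function of each point's 3D coordinates, so the attention score between two points depends only on their relative offset and no pairwise position tensor needs to be stored. The function names `rotary_encode` and `attention_scores` and the random frequency matrix `freqs` are hypothetical and are not taken from the paper.

```python
import numpy as np

def rotary_encode(features, positions, freqs):
    """Rotate consecutive feature pairs by angles derived from 3D positions.

    features:  (n, d) array with d even
    positions: (n, 3) array of point coordinates
    freqs:     (3, d // 2) array of per-axis frequencies (an assumption here)
    """
    angles = positions @ freqs              # (n, d/2): one angle per feature pair
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = features[:, 0::2], features[:, 1::2]
    rotated = np.empty_like(features)
    rotated[:, 0::2] = x_even * cos - x_odd * sin
    rotated[:, 1::2] = x_even * sin + x_odd * cos
    return rotated

def attention_scores(q, k, q_pos, k_pos, freqs):
    """Scaled dot-product scores between rotary-encoded queries and keys."""
    q_rot = rotary_encode(q, q_pos, freqs)
    k_rot = rotary_encode(k, k_pos, freqs)
    return q_rot @ k_rot.T / np.sqrt(q.shape[1])

# Toy data: 5 points with 8-dimensional query and key features.
rng = np.random.default_rng(0)
n, d = 5, 8
q = rng.normal(size=(n, d))
k = rng.normal(size=(n, d))
pos = rng.uniform(-10.0, 10.0, size=(n, 3))
freqs = rng.uniform(0.1, 1.0, size=(3, d // 2))

scores = attention_scores(q, k, pos, pos, freqs)

# Translating every point by the same offset leaves the scores unchanged,
# because only the difference between rotation angles enters the dot product.
shift = np.array([3.0, -2.0, 7.5])
scores_shifted = attention_scores(q, k, pos + shift, pos + shift, freqs)
scores_invariant = np.allclose(scores, scores_shifted)
print(scores_invariant)
```

In this sketch the encoding costs one rotation per point rather than one embedding per point pair, which is one way to read the abstract's claim of low computational and memory cost; the paper's actual mechanism may differ.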