Pose estimation of objects combining shuffle coordinate attention and improved spatial pyramid pooling

Affiliation:

1.School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China; 2.Key Laboratory of Guangxi College Intelligent-Comprehensive Automation, Guilin 541004, China

CLC Number:

TP391.4

Fund Project:

Supported by the National Natural Science Foundation of China (62263004, 61863008)

Abstract:

For complex scenes in which cluttered objects appear both unoccluded and occluded, an object pose estimation algorithm that combines shuffle coordinate attention with improved spatial pyramid pooling is proposed, with the aim of estimating pose accurately, stably, and in real time. A shuffle coordinate attention residual module composed of coordinate, channel, and spatial features is built, which effectively improves the accuracy of keypoint estimation. The spatial pyramid pooling network is improved, and a multi-scale feature refinement method applied at the neck of the network yields highly accurate estimates of edge pose and spatial position. A self-made occlusion dataset is used to further validate the performance and generalization ability of the proposed algorithm. On the public LineMod and Partial Occlusion datasets, compared with the shuffle attention (SA)-based algorithm, the proposed algorithm improves the ADD metric by 2.26% and 2.57%, respectively, and the 5cm5° metric by 5.16% and 4.1%, respectively, while reaching a real-time processing speed of 30 FPS. It thus provides an effective method for object pose estimation in complex scenes such as those with occlusion.
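The abstract reports results on the ADD and 5cm5° metrics without defining them. As a reading aid, the sketch below follows their commonly used definitions: ADD is the mean distance between the object's model points under the ground-truth pose and under the predicted pose, and 5cm5° counts a pose as correct when the translation error is below 5 cm and the rotation error is below 5°. The function names, the metre units, and the 10%-of-diameter ADD threshold are assumptions for illustration and are not taken from the paper.

```python
import math

def transform(R, t, p):
    """Apply the rigid transform x -> R x + t to one 3D point."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def add_metric(R_gt, t_gt, R_pred, t_pred, model_points):
    """Mean distance between model points under the two poses (ADD)."""
    total = 0.0
    for p in model_points:
        total += math.dist(transform(R_gt, t_gt, p), transform(R_pred, t_pred, p))
    return total / len(model_points)

def is_correct_add(add_value, diameter, ratio=0.1):
    """Assumed convention: correct if ADD is below 10% of the model diameter."""
    return add_value < ratio * diameter

def rotation_error_deg(R_gt, R_pred):
    """Angle of the relative rotation R_gt^T R_pred, in degrees."""
    # trace(R_gt^T R_pred) equals the sum of elementwise products.
    trace = sum(R_gt[i][j] * R_pred[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding noise
    return math.degrees(math.acos(cos_angle))

def is_correct_5cm5deg(R_gt, t_gt, R_pred, t_pred):
    """Translations are assumed to be in metres, so 5 cm is 0.05."""
    translation_ok = math.dist(t_gt, t_pred) < 0.05
    rotation_ok = rotation_error_deg(R_gt, R_pred) < 5.0
    return translation_ok and rotation_ok

def rot_z(deg):
    """Hypothetical helper: rotation about the z axis by `deg` degrees."""
    a = math.radians(deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # Toy model points and poses, made up purely for demonstration.
    points = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
    R_gt, t_gt = rot_z(0.0), [0.0, 0.0, 0.5]
    R_pred, t_pred = rot_z(3.0), [0.01, 0.0, 0.5]
    print("ADD:", add_metric(R_gt, t_gt, R_pred, t_pred, points))
    print("5cm5deg correct:", is_correct_5cm5deg(R_gt, t_gt, R_pred, t_pred))
```

Reported ADD and 5cm5° scores are typically the percentage of test frames for which the corresponding check passes, so the improvements quoted above would be differences in such percentages.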

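Cite this article

Dang Xuanju, Li Qihuang. Pose estimation of objects combining shuffle coordinate attention and improved spatial pyramid pooling[J]. Foreign Electronic Measurement Technology, 2023, 42(01): 178-186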

Published online: 2024-05-21