Target object grasp based on push-grasp cooperative operation in complex environment (复杂环境下基于推抓协同操作的目标物体抓取)

SUN Xian-tao; TANG Si-yu; CHEN Wen-jie; HE Chun-dong; ZHI Ya-li; CHEN Wei-hai

Cite this article:	SUN Xian-tao, TANG Si-yu, CHEN Wen-jie, HE Chun-dong, ZHI Ya-li, CHEN Wei-hai. Target object grasp based on push-grasp cooperative operation in complex environment[J]. Control Theory & Applications, 2023, 40(10): 1713-1720.

Received: 2022-08-01 Revised: 2023-09-04

DOI: 10.7641/CTA.2023.20682

2023,40(10):1713-1720

Keywords: deep reinforcement learning; neural network; manipulator grasping; Q-network

Funding: National Natural Science Foundation of China (52005001)

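Author	Affiliation	E-mail
SUN Xian-tao	School of Electrical Engineering and Automation, Anhui University	xtsun@ahu.edu.cn
TANG Si-yu	School of Electrical Engineering and Automation, Anhui University
CHEN Wen-jie^*	School of Electrical Engineering and Automation, Anhui University	wjchen@ahu.edu.cn
HE Chun-dong	School of Electrical Engineering and Automation, Anhui University
ZHI Ya-li	School of Electrical Engineering and Automation, Anhui University
CHEN Wei-hai	School of Automation Science and Electrical Engineering, Beihang University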

Abstract

Existing grasping techniques struggle to perform effective target-oriented grasping in complex environments. To address this problem, this paper proposes a cooperative pushing and grasping method based on deep reinforcement learning. Unlike previous grasping methods, it uses deep learning to process the RGB-D image data captured by the Intel-D435i camera and introduces an attention mechanism into the visual network to increase the system's sensitivity to the target object in the work area. A deep Q-network then learns the interaction between the UR5 manipulator and the environment, and a dense reward strategy is proposed to judge the quality of each pushing or grasping operation. As the number of training iterations increases, the UR5 manipulator continually refines the cooperative strategy between the two operations and therefore makes decisions more efficiently. Finally, simulation scenes are designed and tested on the V-rep simulation platform, where the average grasping success rate reaches 92.5%. Comparisons with several other methods show that the proposed method completes target-object grasping tasks well in complex environments.
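
The abstract names three ingredients: a choice between pushing and grasping, a dense reward that scores each action, and a deep Q-network that learns from those scores. The following is a minimal Python sketch of how such pieces typically fit together, not the authors' implementation. The reward constants, the discount factor `gamma`, the epsilon-greedy rule, and all function names are hypothetical; the abstract gives none of these details, and the real method estimates Q-values with a visual network rather than a small dictionary.

```python
import random

# Hypothetical reward values: the abstract does not state the actual numbers.
REWARD_GRASP_SUCCESS = 1.0   # target object was grasped
REWARD_PUSH_USEFUL = 0.5     # push changed the clutter around the target
REWARD_FAILURE = -0.1        # action had no useful effect


def dense_reward(action, grasp_succeeded, scene_changed):
    """Return an illustrative dense reward for one push or grasp."""
    if action == "grasp":
        return REWARD_GRASP_SUCCESS if grasp_succeeded else REWARD_FAILURE
    if action == "push":
        return REWARD_PUSH_USEFUL if scene_changed else REWARD_FAILURE
    raise ValueError(f"unknown action: {action!r}")


def select_action(q_values, epsilon=0.0, rng=random):
    """Epsilon-greedy choice over a dict {action: estimated Q-value}."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.choice(sorted(q_values))
    # Exploit: pick the action with the highest estimated Q-value.
    return max(q_values, key=q_values.get)


def td_target(reward, next_q_values, gamma=0.5, terminal=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values.values())


if __name__ == "__main__":
    # Toy Q-value estimates standing in for the network's output.
    q = {"push": 0.42, "grasp": 0.37}
    action = select_action(q)
    r = dense_reward(action, grasp_succeeded=False, scene_changed=True)
    print(action, r, td_target(r, {"push": 0.2, "grasp": 0.8}))
```

In a training loop, the value returned by `td_target` would serve as the regression target for the Q-value of the action just executed, so that pushes which make the target easier to grasp are reinforced alongside successful grasps.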