Multi-action Cooperative Grasping Strategy for Robotic Arms in Complex Scenes

    Abstract: In complex, uncertain real-world scenes, the disorder, occlusion, and self-occlusion of target objects hinder robots from perceiving the scene and executing precise grasps. To address this, researchers have proposed active vision frameworks that enhance scene perception by coordinating viewpoint adjustment with grasping, aiming to acquire richer scene information and separate objects in tightly stacked scenes. However, most current active vision and grasping actions are top-down, which limits the robot's perception of the scene. This paper proposes a coordination strategy for active visual perception and grasping in a 6-degree-of-freedom (6DoF) pose space. Based on deep reinforcement learning, the strategy builds a viewpoint adjustment network (VNet), a 4DoF grasping network (4GNet), and a 6DoF grasping network (6GNet) to learn the optimal collaborative policy, and selects the appropriate primitive action according to Q-functions and constraints. To fuse the information captured from multiple viewpoints, a post-adjustment scene fusion method is also proposed, which integrates scene information from multiple viewpoints into a fixed-size heightmap, improving the area and quality of the robot's perception. Experimental results show that, compared with top-down viewpoint adjustment, a single viewpoint adjustment with the proposed method enlarges the captured scene area by 8.93%, providing more complete scene information for grasping. In cluttered scenes containing ten target objects, the grasping success rate reaches 89.53%, a 12.02% improvement over the state-of-the-art VPG algorithm.
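The abstract states that the primitive action (viewpoint adjustment, 4DoF grasp, or 6DoF grasp) is selected from the Q-functions of the three networks under constraints. The sketch below is a minimal, hypothetical illustration of that selection step only: the network bodies are random placeholders, and the candidate discretizations and the margin-based constraint are illustrative assumptions, not the paper's actual rule.

```python
# Hypothetical sketch of Q-function-based primitive action selection.
# Each "network" here is a random placeholder standing in for VNet,
# 4GNet, and 6GNet; output shapes are assumed discretizations.
import numpy as np

rng = np.random.default_rng(0)

def q_vnet(heightmap):
    # Placeholder VNet: one Q-value per candidate viewpoint (assumed 16).
    return rng.random(16)

def q_4gnet(heightmap):
    # Placeholder 4GNet: Q-map over (x, y) cells x discretized yaw angles.
    return rng.random((64, 64, 16))

def q_6gnet(heightmap):
    # Placeholder 6GNet: one Q-value per sampled 6DoF grasp candidate.
    return rng.random(32)

def select_primitive(heightmap, view_margin=0.1):
    """Pick the primitive whose best Q-value is highest, with a simple
    constraint: only adjust the viewpoint when it beats the best grasp
    by a margin (the margin rule is an assumption, not the paper's)."""
    q_view = q_vnet(heightmap).max()
    q_g4 = q_4gnet(heightmap).max()
    q_g6 = q_6gnet(heightmap).max()
    if q_view > max(q_g4, q_g6) + view_margin:
        return "adjust_viewpoint", q_view
    return ("grasp_4dof", q_g4) if q_g4 >= q_g6 else ("grasp_6dof", q_g6)

heightmap = np.zeros((64, 64), dtype=np.float32)
print(select_primitive(heightmap))
```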

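The scene fusion method integrates observations from multiple viewpoints into a fixed-size heightmap. The following sketch shows one plausible way to do this, assuming the per-viewpoint point clouds are already registered into a common robot-base frame and fusing cell-wise by maximum height; the workspace bounds, resolution, and max-fusion rule are illustrative assumptions rather than the paper's exact procedure.

```python
# Minimal sketch of fusing point clouds from several viewpoints into one
# fixed-size heightmap; cell-wise max is an assumed fusion rule.
import numpy as np

def points_to_heightmap(points, workspace, resolution=64):
    """Rasterize (N, 3) points, given in a common base frame, into a heightmap."""
    (x_min, x_max), (y_min, y_max) = workspace
    hmap = np.zeros((resolution, resolution), dtype=np.float32)
    cx = ((points[:, 0] - x_min) / (x_max - x_min) * resolution).astype(int)
    cy = ((points[:, 1] - y_min) / (y_max - y_min) * resolution).astype(int)
    valid = (cx >= 0) & (cx < resolution) & (cy >= 0) & (cy < resolution)
    for i, j, z in zip(cx[valid], cy[valid], points[valid, 2]):
        hmap[j, i] = max(hmap[j, i], z)  # keep the tallest point per cell
    return hmap

def fuse_viewpoints(clouds, workspace, resolution=64):
    """Fuse per-viewpoint clouds by taking the cell-wise maximum across
    the individual heightmaps, enlarging the observed area."""
    maps = [points_to_heightmap(c, workspace, resolution) for c in clouds]
    return np.maximum.reduce(maps)

# Two toy "viewpoints" observing overlapping parts of the workspace.
workspace = ((-0.5, 0.5), (-0.5, 0.5))
cloud_a = np.array([[-0.2, 0.0, 0.05], [0.1, 0.1, 0.12]])
cloud_b = np.array([[0.1, 0.1, 0.08], [0.3, -0.2, 0.20]])
fused = fuse_viewpoints([cloud_a, cloud_b], workspace)
print(fused.max(), np.count_nonzero(fused))
```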