Abstract:
To address real-time grasp detection in unstructured, cluttered environments, we propose PAG-Net (position-aware grasping network), a lightweight grasp detection network. The method embeds a positional encoding generator (PEG) into the mobile inverted bottleneck convolution (MBConv) block to explicitly inject spatial position information, and further improves computational efficiency by introducing an improved LowFormer (low-resolution Transformer). The network simultaneously predicts grasp quality, angle, and gripper width in a pixel-wise manner, improving both the accuracy and speed of grasp detection. Experimental results show that PAG-Net achieves an accuracy of 98.8% on the Cornell dataset and 96.1% on the Jacquard dataset. In PyBullet simulation tests, PAG-Net achieves a grasping success rate of approximately 94% in cluttered environments, demonstrating the robustness of the proposed network in complex scenes.
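To make the pixel-wise output format concrete: a network of this kind emits per-pixel maps of grasp quality, angle, and gripper width, and a grasp is decoded by selecting the pixel with the highest quality score and reading the angle and width predicted there. The sketch below illustrates this decoding step only; it is not the paper's implementation, and the function name and toy map values are hypothetical.

```python
import math

def decode_grasp(quality, angle, width):
    """Decode one grasp from pixel-wise prediction maps (illustrative sketch).

    quality, angle, width: 2-D lists of equal shape, one value per pixel.
    Returns the location of the highest-quality pixel together with the
    angle and gripper width predicted at that pixel.
    """
    best = None  # (quality score, row, col)
    for r, row in enumerate(quality):
        for c, q in enumerate(row):
            if best is None or q > best[0]:
                best = (q, r, c)
    q, r, c = best
    return {"row": r, "col": c,
            "quality": q,
            "angle": angle[r][c],
            "width": width[r][c]}

# Toy 2x2 maps with hypothetical values: the best grasp is at pixel (0, 1).
quality = [[0.1, 0.9], [0.3, 0.2]]
angle   = [[0.0, math.pi / 4], [0.0, 0.0]]
width   = [[10.0, 25.0], [12.0, 8.0]]
print(decode_grasp(quality, angle, width))
```

In practice the maps would be dense tensors produced by the network for every image pixel, but the argmax-based decoding shown here is the same.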