Abstract:
The rapid development of and continuing breakthroughs in artificial intelligence have greatly improved the grasping capabilities of intelligent robots. Grasp estimation is critical to robotic grasping tasks because it directly affects the subsequent grasp planning and control systems. Unlike traditional approaches, which require step-by-step target localization and pose estimation, end-to-end grasping strategies learn and output grasping information directly from input data. We review vision-based end-to-end grasping estimation methods, covering planar-level and spatial-level grasping. Planar-level methods are categorized into those that estimate grasping contact points and those that estimate oriented rectangles. Spatial-level methods are likewise divided into two categories: object-oriented and scene-oriented approaches. In addition, we introduce relevant datasets and grasping evaluation metrics and highlight the challenges and future directions in vision-based end-to-end grasping estimation for robots.