Rapid and reliable robot grasping for a diverse set of objects has applications from warehouse automation to home decluttering. One promising approach is to learn deep policies from synthetic training datasets of point clouds, grasps, and rewards sampled using analytic models with stochastic noise models for domain randomization. In this paper, we explore how the distribution of synthetic training examples affects the rate and reliability of the learned robot policy. We propose a synthetic data sampling distribution that combines grasps sampled from the policy action set with guiding samples from a robust grasping supervisor that has full state knowledge. We use this to train a robot policy based on a fully convolutional network architecture that evaluates millions of grasp candidates in 4-DOF (3D position and planar orientation). Physical robot experiments suggest that a policy based on Fully Convolutional Grasp Quality CNNs (FC-GQ-CNNs) can plan grasps in 0.625s, considering 5000x more grasps than our prior policy based on iterative grasp sampling and evaluation. This computational efficiency improves rate and reliability, achieving 296 mean picks per hour (MPPH) compared to 250 MPPH for iterative policies. Sensitivity experiments explore the effect of supervisor guidance level and granularity of the policy action space.
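To make the dense-evaluation idea concrete, below is a minimal PyTorch sketch of a fully convolutional grasp quality network. This is not the published FC-GQ-CNN architecture: the layer sizes, the 16-bin angle discretization, and all class and function names are illustrative assumptions. It shows how a network with no fully connected layers can score every pixel location and every discretized planar grasp angle in a single forward pass, after which the highest-scoring (angle, row, column) candidate is selected by argmax.

```python
# Minimal sketch (assumed architecture, not the paper's) of a fully
# convolutional grasp quality network: depth image in, dense grasp
# quality map over 2D position and discretized planar angle out.
import torch
import torch.nn as nn


class FCGraspQualityNet(nn.Module):
    def __init__(self, num_angles: int = 16):
        super().__init__()
        # Fully convolutional trunk: no fully connected layers, so all
        # image locations are evaluated in one forward pass.
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=7, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # 1x1 convolution head: one quality logit per angle bin at every
        # pixel, giving an (N, num_angles, H, W) quality map.
        self.quality = nn.Conv2d(64, num_angles, kernel_size=1)

    def forward(self, depth: torch.Tensor) -> torch.Tensor:
        # depth: (N, 1, H, W) depth image; returns per-pixel,
        # per-angle grasp quality scores in [0, 1].
        return torch.sigmoid(self.quality(self.features(depth)))


def best_grasp(quality_map: torch.Tensor):
    # Pick the highest-quality (angle bin, row, col) grasp for one image.
    q = quality_map[0]                 # (num_angles, H, W)
    num_angles, h, w = q.shape
    flat_idx = torch.argmax(q).item()  # argmax over all candidates
    angle = flat_idx // (h * w)
    row = (flat_idx % (h * w)) // w
    col = flat_idx % w
    return angle, row, col, q[angle, row, col].item()


if __name__ == "__main__":
    net = FCGraspQualityNet(num_angles=16)
    depth = torch.rand(1, 1, 96, 96)   # synthetic stand-in depth image
    qmap = net(depth)                  # 96 * 96 * 16 candidates scored at once
    print(best_grasp(qmap))            # (angle bin, row, col, quality)
```

The key design point this sketch illustrates is why dense evaluation is so much faster than iterative sampling: a single convolutional forward pass scores every candidate in the discretized action space simultaneously, rather than proposing and re-evaluating grasps one batch at a time.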
FC-GQ-CNN is an extension of the GQ-CNN project.
This is an ongoing project at the UC Berkeley AUTOLAB with contributions from:
Vishal Satish, Jeffrey Mahler, and Ken Goldberg.
Please contact Vishal Satish at vsatish@berkeley.edu.
This website, created by the Berkeley AUTOLAB, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.