Deep Reinforcement Learning for Robotic Pushing and Picking in Cluttered Environment

IROS 2019

All authors are from Tsinghua University; Equal Contribution

Abstract

In this paper, a novel robotic grasping system is established to automatically pick up objects in cluttered scenes. A composite robotic hand, composed of a suction cup and a gripper, is designed to grasp objects stably: the suction cup first lifts the object out of the clutter, and the gripper then grasps it. We utilize an affordance map to provide pixel-wise lifting-point candidates for the suction cup. To obtain a good affordance map, an active exploration mechanism is introduced into the system. An effective metric is designed to calculate the reward for the current affordance map, and a deep Q-Network (DQN) is employed to guide the robotic hand to actively explore the environment until the generated affordance map is suitable for grasping. Experimental results demonstrate that the proposed robotic grasping system greatly increases the success rate of robotic grasping in cluttered scenes.


Our grasping system consists of a composite robotic hand for grasping, a UR5 manipulator for reaching the operation point, and a Kinect camera as the vision sensor. We introduce an active exploration strategy that disturbs the environment to make grasping more promising.

Pipeline

The pipeline of the proposed robotic grasping system is illustrated in the image above. The RGB and depth images of the scene are obtained first. The affordance ConvNet computes the affordance map from both images. A metric Phi is proposed to evaluate the credibility of the current affordance map. If Phi meets the requirement, the composite robotic hand performs the grasp. Otherwise, the RGB and depth images are fed into the DQN, which guides the composite robotic hand to apply an appropriate disturbance to the environment by pushing objects. This process is iterated until all the objects in the environment have been successfully picked.
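The decision loop above can be sketched in a few lines. The code below is a minimal, hypothetical illustration, not the paper's implementation: the credibility metric is stood in for by the peak affordance value, and `phi_threshold` is an assumed parameter.

```python
import numpy as np

def credibility_metric(affordance):
    """Stand-in for the metric Phi: here, simply the peak affordance score."""
    return float(affordance.max())

def select_action(affordance, phi_threshold=0.5):
    """Grasp at the best pixel if the map is credible; otherwise push."""
    phi = credibility_metric(affordance)
    if phi >= phi_threshold:
        # Pixel-wise lifting point candidate: the argmax of the affordance map.
        v, u = np.unravel_index(np.argmax(affordance), affordance.shape)
        return ("grasp", (u, v))
    # Low credibility: let the DQN-guided push disturb the clutter instead.
    return ("push", None)

# A confident map triggers a grasp at its peak pixel...
confident = np.zeros((4, 4))
confident[2, 1] = 0.9
print(select_action(confident))               # ('grasp', (1, 2))
# ...while a flat, low-credibility map triggers a push.
print(select_action(np.full((4, 4), 0.1)))    # ('push', None)
```

In the full system, each push changes the scene, a new affordance map is computed, and the loop repeats until Phi clears the threshold.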



Grasping Process

Compared with other suction-based grasping systems, the proposed composite robotic hand uses its two fingers to hold the object after the suction cup lifts it, which increases the stability of the grasp.
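The lift-then-grip ordering can be written down as a simple phase sequence. The phase names below are our own hypothetical labels for illustration, not identifiers from the paper; the key point is only that suction lifting precedes finger closure.

```python
from enum import Enum, auto

class Phase(Enum):
    APPROACH = auto()       # move the composite hand to the lifting point
    SUCTION_LIFT = auto()   # suction cup lifts the object out of the clutter
    FINGER_CLOSE = auto()   # the two fingers close around the lifted object
    RETRIEVE = auto()       # carry the stably held object away

def grasp_phases():
    """Ordering of the composite hand's grasp: lift first, then grip."""
    return [Phase.APPROACH, Phase.SUCTION_LIFT, Phase.FINGER_CLOSE, Phase.RETRIEVE]
```

Closing the fingers only after the object is lifted free of its neighbors is what distinguishes this hand from suction-only systems.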


Real experiments

We test our DQN model in a real environment, using a Kinect V2 camera to acquire the RGB and depth images of the scene and a UR5 manipulator to carry our composite robotic hand. We select 40 different objects to build various scenes for our robotic hand to grasp.


BibTeX


@inproceedings{deng2019grasp,
  title={Deep reinforcement learning for robotic pushing and picking in cluttered environment},
  author={Deng, Yuhong and Guo, Xiaofeng and Wei, Yixuan and Lu, Kai and Fang, Bin and Guo, Di and Liu, Huaping and Sun, Fuchun},
  booktitle={2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages={619--626},
  year={2019},
  organization={IEEE}
}