New AI system gives robots ability to visualize objects using touch

Teaching artificial intelligence to connect senses like vision and touch.


We human beings can easily tell what an object looks like simply by touching it, thanks to our sense of touch. We can also tell how an object will feel just by looking at it.

But doing the same can be a big challenge for machines. Even robots that are programmed to see or feel can’t use visual and tactile signals quite as interchangeably.

Now, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a robot with a predictive artificial intelligence (AI) that can learn to see by touching and learn to feel by seeing.

The team used a KUKA robot arm fitted with a special tactile sensor called GelSight, which was previously designed by another MIT group led by Edward Adelson. GelSight is a slab of transparent synthetic rubber: one side is coated with paint containing tiny flecks of metal, and cameras are mounted on the other side. Using a web camera, the team recorded nearly 12,000 videos of 200 objects, including tools, household products, fabrics, and more, being touched.

The researchers then broke these videos down into static frames and compiled “VisGel,” a dataset of more than 3 million visual/tactile-paired images. These reference images helped the robot encode details about the objects and the environment.
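The article does not say how the visual and tactile frames were matched up, so the following is only a minimal sketch of one plausible approach: pairing each camera frame with the touch-sensor frame closest to it in time. The names `Frame`, `pair_frames`, and `max_gap` are hypothetical and are not taken from the VisGel paper or its code.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One still frame from a video (hypothetical stand-in for pixel data)."""
    timestamp: float  # seconds from the start of the recording
    image_id: str


def pair_frames(visual, tactile, max_gap=0.05):
    """Pair each visual frame with the tactile frame closest in time.

    Both lists must be sorted by timestamp. A visual frame is dropped when
    its nearest tactile frame is more than `max_gap` seconds away.
    This is an assumed pairing rule, not the method used by the authors.
    """
    tactile_times = [f.timestamp for f in tactile]
    pairs = []
    for v in visual:
        i = bisect_left(tactile_times, v.timestamp)
        # The nearest tactile frame sits just before or just after index i.
        candidates = tactile[max(i - 1, 0):i + 1]
        if not candidates:
            continue  # no tactile frames at all
        nearest = min(candidates, key=lambda t: abs(t.timestamp - v.timestamp))
        if abs(nearest.timestamp - v.timestamp) <= max_gap:
            pairs.append((v, nearest))
    return pairs


# Made-up timestamps for illustration only.
visual = [Frame(0.00, "v0"), Frame(0.10, "v1"), Frame(0.20, "v2")]
tactile = [Frame(0.01, "t0"), Frame(0.11, "t1"), Frame(0.40, "t2")]
for v, t in pair_frames(visual, tactile):
    print(v.image_id, t.image_id)
```

With these made-up timestamps, the third camera frame has no touch frame close enough in time and is left out, which is the kind of filtering a paired dataset would need so that every image has a matching touch reading.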

“By looking at the scene, our model can imagine the feeling of touching a flat surface or a sharp edge,” says Yunzhu Li, CSAIL Ph.D. student and lead author on a new paper about the system. “By blindly touching around, our model can predict the interaction with the environment purely from tactile feelings. Bringing these two senses together could empower the robot and reduce the data we might need for tasks involving manipulating and grasping objects.”

Yunzhu Li is a Ph.D. student at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). Image: MIT

Right now, the robot can only identify objects in a controlled environment. Some details, such as the color and softness of objects, remain challenging for the new AI system to infer. Still, the researchers hope that this new approach will pave the way for more seamless human-robot integration in manufacturing settings, especially for tasks that lack visual data.

The team’s next step for the new AI system is to build a larger dataset by collecting data in more unstructured areas, or by using MIT’s newly designed sensor-packed glove, so that the robot can work in more diverse settings.

“This is the first method that can convincingly translate between visual and touch signals,” says Andrew Owens, a postdoc at the University of California at Berkeley. “Methods like this have the potential to be very useful for robotics, where you need to answer questions like ‘is this object hard or soft?’, or ‘if I lift this mug by its handle, how good will my grip be?’ This is a very challenging problem, since the signals are so different, and this model has demonstrated great capability.”

The paper will be presented next week at the Conference on Computer Vision and Pattern Recognition in Long Beach, California.