For the last four decades, eye-tracking systems have been used to measure the point of gaze and the motion of an eye relative to the head. The device, known as an eye tracker, measures eye positions and eye movement. Eye trackers are used in research on the visual system, in psychology, psycholinguistics, and marketing, as input devices for human-computer interaction, and in product design.
Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory and the University of Georgia recently developed software that can turn any smartphone into an eye-tracking device. The system could enable new computer interfaces, help detect early signs of neurological disease or mental illness, and make eye tracking far more accessible than it is today.
According to Aditya Khosla, an MIT graduate student in electrical engineering and computer science, the field is “stuck in this chicken-and-egg loop.”
“Since few people have the external devices, there’s no big incentive to develop applications for them,” he said. “Since there are no applications, there’s no incentive for people to buy the devices. We thought we should break this circle and try to make an eye tracker that works on a single mobile device, using just your front-facing camera.”
The researchers built the new eye tracker using machine learning, a field of study that allows computers to learn without being explicitly programmed. The software was developed by Khosla and his colleagues: Kyle Krafka of the University of Georgia; Wojciech Matusik, MIT professor of electrical engineering; Antonio Torralba, MIT professor of computer science; and three others.
Strength in numbers
The team’s training set includes examples of gaze patterns from 1,500 mobile-device users, Khosla says. Previously, the largest data sets used to train experimental eye-tracking systems had topped out at about 50 users.
“To assemble data sets, other groups tend to call people into the lab,” he continued. “It’s really hard to scale that up. Calling in 50 people is already a fairly tedious process. But we realized we could do this through crowdsourcing.”
Initially, the researchers trained the system on data drawn from 800 mobile-device users, which brought its margin of error down to 1.5 centimeters, a twofold improvement over previous experimental systems. Since submitting their paper, they have collected data from another 700 people, and the additional training data has reduced the margin of error to about a centimeter.
To get a sense of how larger training sets might improve performance, the researchers also trained and retrained their system using different-sized subsets of their data. The results suggest that about 10,000 training examples should be enough to reduce the margin of error to a half-centimeter, which the researchers estimate would be good enough to make the system commercially viable.
To collect their training examples, the researchers developed a simple application for devices running Apple’s iOS operating system. The application flashes a small dot somewhere on the device’s screen to attract the user’s attention, then briefly replaces it with either an “R” or an “L,” instructing the user to tap either the right or the left side of the screen. Correctly executing the tap confirms that the user has actually shifted his or her gaze to the intended location. While this process is going on, the device’s camera continuously captures images of the user’s face. The data set contains, on average, 1,600 images per user.
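The tap-verification logic described above can be sketched as follows. This is a minimal illustration, not the authors’ actual app (which is not public); all function and variable names here are hypothetical.

```python
import random

def make_trial():
    """Pick a random on-screen dot position and a random letter ('R' or 'L')."""
    dot = (random.random(), random.random())  # normalized screen coordinates
    letter = random.choice("RL")
    return dot, letter

def tap_confirms_gaze(letter, tap_side):
    """A tap on the side matching the displayed letter confirms the user
    was actually looking at the dot; only frames from confirmed trials
    would be kept as training examples."""
    return (letter == "R" and tap_side == "right") or \
           (letter == "L" and tap_side == "left")

# Simulate one trial in which the user reads the letter and taps correctly.
dot, letter = make_trial()
tap_side = "right" if letter == "R" else "left"
print(tap_confirms_gaze(letter, tap_side))  # True: this trial's frames are kept
```

The point of the R/L check is quality control for crowdsourced data: a frame is only trusted as a gaze example if the user demonstrably attended to the dot.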
Tightening the net
The researchers’ machine-learning system was a neural network, a software abstraction that can be thought of as a huge network of very simple information processors arranged into discrete layers. Training modifies the settings of the individual processors so that when a data item, here an image of a mobile-device user’s face, is fed to the bottom layer, it is processed by the subsequent layers, and the output of the top layer is the solution to a computational problem, in this case an estimate of where on the screen the user is looking.
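A minimal sketch of that layered structure is shown below. This toy network is purely illustrative and untrained, with made-up sizes; the paper’s actual model is a much larger convolutional network.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    """One layer of simple processors: a weighted sum followed by ReLU."""
    return np.maximum(0.0, x @ w + b)

# A flattened stand-in for a face image enters at the bottom layer;
# each subsequent layer processes the previous layer's output.
image = rng.random(64)
w1, b1 = rng.standard_normal((64, 32)), np.zeros(32)
w2, b2 = rng.standard_normal((32, 16)), np.zeros(16)
w_out, b_out = rng.standard_normal((16, 2)), np.zeros(2)

hidden = layer(layer(image, w1, b1), w2, b2)
gaze_xy = hidden @ w_out + b_out  # top-layer output: an (x, y) gaze estimate
print(gaze_xy.shape)              # (2,)
```

Training would adjust the weight matrices `w1`, `w2`, and `w_out` so that the top layer’s two numbers match the known on-screen dot positions from the collected data.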
Because the resulting neural network was so large, the researchers shrank it using a technique called “dark knowledge.” Dark knowledge uses the outputs of a fully trained network, which are generally approximate solutions, as if they were exact solutions when training a much smaller network. The technique cut the size of the network by about 80 percent, allowing it to run much more efficiently on a smartphone. With the reduced network, the eye tracker operates at about 15 frames per second, fast enough to record even brief glances.
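The core of the dark-knowledge idea can be sketched as follows. This example uses a classification setup with a temperature-softened softmax for clarity; the temperature value and logits are illustrative and not taken from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T yields a softer distribution."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

# Raw outputs (logits) of the large, fully trained "teacher" network.
teacher_logits = np.array([4.0, 1.0, 0.2])

hard = softmax(teacher_logits, T=1.0)  # near one-hot "approximate solution"
soft = softmax(teacher_logits, T=4.0)  # softened targets carry more signal

# The small "student" network is trained to match the soft targets,
# e.g. by minimizing cross-entropy against them.
student_logits = np.array([3.5, 1.2, 0.4])
loss = -np.sum(soft * np.log(softmax(student_logits, T=4.0)))
print(loss > 0)  # True: a positive loss the student is trained to reduce
```

The softened outputs encode how the teacher ranks the wrong answers, not just which answer is right, and it is this extra signal that lets a network one-fifth the size approach the teacher’s accuracy.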
“In lots of cases, if you want to do a user study, in computer vision, in marketing, in developing new user interfaces, eye tracking is something people have been very interested in, but it hasn’t really been accessible,” says Noah Snavely, an associate professor of computer science at Cornell University. “You need expensive equipment, or it has to be calibrated very well in order to work. So something that will work on a device everyone has seems very compelling. And from what I’ve seen, the accuracy they get seems like it’s in the ballpark where you can do something interesting.”
“Part of the excitement is that they’ve also created this way of collecting data, and the data set itself,” he explained. “They did all the legwork that will make other people interested in this problem. And the fact that the community will start working on this will lead to fast improvements.”