Faster Big Data Analysis

System for performing “tensor algebra” offers 100-fold speedups over previous software packages.

Faster Big-Data Analysis
A new MIT computer system speeds computations involving “sparse tensors,” multidimensional data arrays that consist mostly of zeroes. Image: Christine Daniloff, MIT

Most of the big data in today’s date are sparse and thus analytic algorithms end up doing a lot of addition and multiplication by zero. Programmers neglect this by writing custom complex code to avoid zero entries, but it generally applied to a narrow range of problems.

At the Association for Computing Machinery’s Conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH), researchers from MIT, the French Alternative Energies and Atomic Energy Commission, and Adobe Research recently presented a new system that automatically produces code optimized for sparse data.

Scientists called this system as Taco for a tensor algebra compiler. In a computer science parlance, a tensor is just a higher-dimensional analog of a matrix. For example, the Amazon table mapped all of Amazon’s customers against all of its products. If that Amazon table also mapped customers and products against the customers’ product ratings on the Amazon site and the words used in their product reviews, the result would be a four-dimensional tensor.

Saman Amarasinghe, an MIT professor said, “Sparse representations have been there for more than 60 years. But nobody knew how to generate code for them automatically. People figured out a few very specific operations — sparse matrix-vector multiply, sparse matrix-vector multiply plus a vector, sparse matrix-matrix multiply, sparse matrix-matrix-matrix multiply. The biggest contribution we make is the ability to generate code for any tensor-algebra expression when the matrices are sparse.”

Tensor algebra now has become complex for big data analysis and machine learning as well. For efficient operation on massive data sets, every sequence of tensor operations requires its own “kernel,” or computational template.

Fredrick Kjolstad explained, “If you do it in one kernel, you can do it all at once, and you can make it go faster, instead of having to put the output in memory and then read it back in so that you can add it to something else. You can just do it in the same loop.”

Kernels are for some of the tensor operations in machine learning and big-data analytics. But the number of possible kernels is infinite.

Moreover, many tensor operations require multiplying an entry from one tensor with one from another. If either entry is zero, so is their product, and programs for manipulating large, sparse matrices can waste a huge amount of time adding and multiplying zeroes.

Hand-optimized code for sparse tensors identifies zero entries and streamlines operations involving them. This makes tensor manipulations much faster.

The system Taco integrates all the code by itself. A programmer need to specify the size of a tensor, whether it’s full or sparse, and the location of the file from which it should import its values. For any given operation on two tensors, Taco builds a hierarchical map that indicates, first, which paired entries from both tensors are nonzero and, then, which entries from each tensor are paired with zeroes. All pairs of zeroes it simply discards.

It also uses an efficient indexing scheme to store only the nonzero values of sparse tensors.

Saday Sadayappan, a professor of computer science and engineering at Ohio State University said, “Many research groups over the last two decades have attempted to solve the compiler-optimization and code-generation problem for sparse-matrix computations but made little progress. The recent developments from Fred and Saman represent a fundamental breakthrough on this long-standing open problem.”

“Their compiler now enables application developers to specify very complex sparse matrix or tensor computations in a very easy and convenient high-level notation, from which the compiler automatically generates very efficient code. For several sparse computations, the generated code from the compiler has been shown to be comparable or better than painstakingly developed manual implementations. This has the potential to be a real game-changer. It is one of the most exciting advances in recent times in the area of compiler optimization.”