In a joint project, Computer scientists at Emory University, Carnegie Mellon University, and the Pelikan Foundation have invented a highly effective yet incredibly simple algorithm called SIEVE to speed cache sifting. The algorithm decide which items to toss from a web cache to make room for new ones.
Yazhuo Zhang, an Emory PhD student and co-first author of the paper said, “SIEVE is bigger and greater than just us. It is already performing well but we are getting a lot of good suggestions to make it even better. That’s the beauty of the open-source world.”
SIEVE is a straightforward enhancement of a well-established cache-eviction algorithm that has been utilized for many years. In addition to its speed and effectiveness, a key factor sparking interest is its Simplicity that lends it scalability.
Ymir Vigfusson, associate professor in Emory’s Department of Computer Science said, “Simplicity is the ultimate sophistication. The simpler the pieces are within a system designed to serve billions of people within a fraction of a second, the easier it is to efficiently implement and maintain that system.”
Like the Least Recently Used (LRU) and some other algorithms, SIEVE introduces a simple modification to the basic First-In-First-Out (FIFO) scheme.
In SIEVE, when an object is first requested, it’s labeled as “zero.” If the same object is requested again while it’s still in the cache, its label changes to “one.” As objects move through the cache, those labeled “one” eventually reach the end of the line, where they are reset to “zero” and then evicted.
Additionally, a pointer, known as the “moving hand,” scans the objects as they move along the cache line. Starting from the end and moving continuously in a circular manner to the beginning, whenever the pointer encounters an object labeled “zero,” that object is evicted from the cache.
Zhang said, “It’s important to evict unpopular objects as quickly as possible, and SIEVE is very fast at this task.”
SIEVE also manages to maintain popular objects in the cache with minimal computational effort, known as “lazy promotion” in computer terminology.
According to scientists, “SIEVE is the simplest cache-eviction algorithm to effectively achieve both quick demotion and lazy promotion.”
To test SIEVE’s effectiveness, researchers used real-world web-cache data from various sources, including Meta, Wikimedia, X, and four other large datasets. Their experiments revealed that SIEVE outperforms nine advanced algorithms on over 45% of the traces, achieving a lower miss ratio. In comparison, the next best algorithm only outperforms others on 15% of the traces.
The Simplicity and effectiveness of SIEVE raise questions about why this method wasn’t developed earlier. The researchers’ attention to evolving patterns in web traffic might explain the success of SIEVE, highlighting the importance of staying current with changing trends.
The team’s paper on SIEVE will be presented at the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI) in Santa Clara, California, in April.