Almost all websites store data in databases. But database queries can be slow, so many website owners use cache servers, which hold the results of common queries so they can be returned faster. Data centers for a major web service such as Google or Facebook might have as many as 1,000 servers dedicated just to caching.
Cache servers typically rely on RAM, which is fast but expensive and power-hungry. Flash memory consumes about 5 percent as much energy as RAM and costs about one-tenth as much. It can also store roughly 100 times as much data in the same space. A flash-based caching system could therefore dramatically reduce the number of cache servers that data centers require.
Arvind, the Charles and Jennifer Johnson Professor in Computer Science and Engineering, said, “That’s where the disbelief comes in. People say, ‘Really? You can do this with flash memory?’ Access time in flash is 10,000 times longer than in DRAM [dynamic RAM].”
The researchers named their system BlueCache. It relies on the common computer science technique of pipelining: before a flash-based cache server returns the result of the first query to reach it, it can begin executing the next 10,000 queries. The first query might take 200 microseconds to process, but subsequent responses will emerge at 0.02-microsecond intervals.
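The arithmetic behind those figures can be checked with a short sketch. It assumes an idealized pipeline in which 10,000 queries are in flight at once and nothing ever stalls. The function name and the model are illustrative and are not taken from BlueCache itself.

```python
def pipelined_times(latency_us, depth, n_queries):
    """Idealized pipeline model (an assumption, not BlueCache's design):
    with `depth` queries in flight, one result emerges every
    latency_us / depth microseconds once the first query completes."""
    interval_us = latency_us / depth
    total_us = latency_us + (n_queries - 1) * interval_us
    return interval_us, total_us


# Figures from the article: 200 microseconds per query, 10,000 in flight.
interval, pipelined_total = pipelined_times(200.0, 10_000, 10_000)

# Without pipelining, each query would wait for the previous one to finish.
serial_total = 200.0 * 10_000

print(f"interval between results: {interval:.2f} microseconds")
print(f"10,000 queries, pipelined: {pipelined_total:.2f} microseconds")
print(f"10,000 queries, one at a time: {serial_total:.0f} microseconds")
```

Under this model the pipelined batch finishes in roughly 400 microseconds, against 2 seconds if the same queries were handled strictly one after another.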
In addition to pipelining, the researchers deployed some clever engineering tricks to make flash caching competitive with DRAM caching. In tests, they found that BlueCache was 4.2 times as fast as the default implementation.
The researchers’ first trick is to add a little DRAM to every BlueCache flash cache: a few megabytes per million megabytes of flash. The DRAM stores a table that pairs each database query with the flash-memory address of the corresponding query result. That doesn’t make cache lookups any faster, but it makes the detection of cache misses (the identification of data not yet imported into the cache) much more efficient.
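A minimal sketch of that idea follows, with a Python dictionary standing in for the small DRAM table and a list standing in for flash. The class name, method names, and the read counter are hypothetical; the real system implements this in hardware.

```python
class FlashCacheSketch:
    """Toy model: a small in-memory index in front of slow storage."""

    def __init__(self):
        self.index = {}       # stands in for the DRAM table: query -> flash address
        self.flash = []       # stands in for flash; list position acts as the address
        self.flash_reads = 0  # counts the slow accesses

    def put(self, query, result):
        # Append the result to "flash" and record its address in the index.
        # Overwriting a query leaves the old entry orphaned in this toy model.
        self.index[query] = len(self.flash)
        self.flash.append(result)

    def get(self, query):
        address = self.index.get(query)
        if address is None:
            return None       # miss detected from the index alone, no flash access
        self.flash_reads += 1  # a hit still pays for one flash read
        return self.flash[address]


cache = FlashCacheSketch()
cache.put("SELECT name FROM users WHERE id = 1", "Ada")

hit = cache.get("SELECT name FROM users WHERE id = 1")
miss = cache.get("SELECT name FROM users WHERE id = 2")
print(hit, miss, cache.flash_reads)
```

The hit costs one flash read, exactly as it would without the table, while the miss is answered without touching flash at all.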
Due to all of its added efficiencies, BlueCache consumes only 4 percent as much power as the default implementation.
Inside a BlueCache server, the flash memory is connected to the central processor by a wire known as a bus, which has a limited capacity. BlueCache accumulates enough queries to fill that capacity before sending them to memory. It thus ensures that the system is always using communication bandwidth as efficiently as possible.
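The batching step can be illustrated in software, although BlueCache performs it in hardware. The sketch below assumes the bus capacity can be expressed as a number of bytes per transfer. The class name, the `send` callback, and the 16-byte capacity are all hypothetical.

```python
class BusBatcher:
    """Toy model: hold queries until one full bus transfer is ready."""

    def __init__(self, capacity_bytes, send):
        self.capacity_bytes = capacity_bytes  # assumed size of one full transfer
        self.send = send                      # callback that receives each batch
        self.pending = []
        self.pending_bytes = 0

    def submit(self, query):
        # Queue the query and send a batch once the transfer would be full.
        self.pending.append(query)
        self.pending_bytes += len(query)
        if self.pending_bytes >= self.capacity_bytes:
            self.flush()

    def flush(self):
        # Send whatever is queued, then start a fresh batch.
        if self.pending:
            self.send(self.pending)
            self.pending = []
            self.pending_bytes = 0


transfers = []
batcher = BusBatcher(capacity_bytes=16, send=transfers.append)
for query in [b"aaaa", b"bbbb", b"cccc", b"dddd", b"eeee"]:
    batcher.submit(query)

print(len(transfers), len(batcher.pending))
```

Five 4-byte queries produce a single full transfer of four queries, with the fifth left waiting. A practical design would probably also flush on a timer so that a lone query is not held indefinitely, but that is an assumption rather than something the article describes.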
BlueCache, like most data-center caching systems, is a so-called key-value store, or KV store. In this case, the key is the database query and the value is the response.
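In software terms, a cache of this kind behaves like a dictionary placed in front of the database. The sketch below is a generic illustration of that pattern, not BlueCache's code, and `fake_database` is a hypothetical stand-in for a real query engine.

```python
def make_cached(run_query):
    """Wrap a slow query function with a key-value cache."""
    cache = {}  # key: the query text, value: the query's result

    def cached_query(query):
        if query in cache:
            return cache[query]    # hit: answer straight from the cache
        result = run_query(query)  # miss: fall back to the database
        cache[query] = result      # remember the answer for next time
        return result

    return cached_query


database_calls = []


def fake_database(query):
    # Hypothetical stand-in for a real database; it only records each call.
    database_calls.append(query)
    return f"result of {query}"


lookup = make_cached(fake_database)
first = lookup("SELECT 1")
second = lookup("SELECT 1")
print(first == second, len(database_calls))
```

The second lookup is answered from the cache, so the database is consulted only once.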
Vijay Balakrishnan said, “The flash-based KV store architecture developed by Arvind and his MIT team resolves many of the issues that limit the ability of today’s enterprise systems to harness the full potential of flash.”
“The viability of this type of system extends beyond caching since many data-intensive applications use a KV-based software stack, which the MIT team has proven can now be eliminated. By integrating programmable chips with flash and rewriting the software stack, they have demonstrated that a fully scalable, performance-enhancing storage technology, like the one described in the paper, can greatly improve upon prevailing architectures.”