Modern Cache Replacement Algorithms

An inherent limitation to application performance is the discrepancy between the computing speed of CPUs and the access latencies of storage devices. Ideally, all data would be kept in fast memory such as CPU caches or DRAM, avoiding any waiting on filesystem operations. However, when data must be written to disk or exceeds DRAM capacity, the resulting waiting times can severely limit application performance. Filesystems therefore employ caching to improve read and write performance without requiring users to manually store and evict data from DRAM. The main goal of these caches is to minimize "cache misses", i.e. the number of requests the cache cannot fulfill without fetching data from disk, and conversely to maximize "cache hits".

Many caches used in practice today, such as ARC[1], CLOCK[2], or multi-generational LRU[3], derive from very early concepts in computer science and modify the basic principle to further reduce cache misses. Their performance differs considerably and depends heavily on the actual I/O access patterns observed. For this reason, research continues to search for better cache replacement policies, both for specific workloads and for general applications.
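To illustrate the kind of policy this thesis deals with, the following is a minimal sketch of the classic CLOCK (second-chance) algorithm in Rust. All names (`ClockCache`, `access`) are illustrative and not taken from any of the cited implementations; pages are modeled as plain `u64` identifiers.

```rust
// Minimal CLOCK (second-chance) sketch: each cached page carries a
// reference bit; on a miss the "hand" sweeps over the slots, clearing
// set bits and evicting the first page whose bit is already cleared.
struct ClockCache {
    capacity: usize,
    slots: Vec<(u64, bool)>, // (page id, reference bit)
    hand: usize,
}

impl ClockCache {
    fn new(capacity: usize) -> Self {
        ClockCache { capacity, slots: Vec::new(), hand: 0 }
    }

    /// Access a page; returns true on a cache hit.
    fn access(&mut self, key: u64) -> bool {
        // Hit: set the reference bit, granting the page a second chance.
        if let Some(slot) = self.slots.iter_mut().find(|s| s.0 == key) {
            slot.1 = true;
            return true;
        }
        // Miss with free space: insert without evicting.
        if self.slots.len() < self.capacity {
            self.slots.push((key, true));
            return false;
        }
        // Miss on a full cache: sweep until a victim with a cleared
        // reference bit is found, then replace it in place.
        loop {
            if self.slots[self.hand].1 {
                self.slots[self.hand].1 = false;
                self.hand = (self.hand + 1) % self.capacity;
            } else {
                self.slots[self.hand] = (key, true);
                self.hand = (self.hand + 1) % self.capacity;
                return false;
            }
        }
    }
}

fn main() {
    let mut cache = ClockCache::new(2);
    assert!(!cache.access(1)); // miss, inserted
    assert!(!cache.access(2)); // miss, inserted
    assert!(cache.access(1));  // hit, reference bit set
    assert!(!cache.access(3)); // miss, hand sweeps and evicts a page
    println!("clock cache sketch ok");
}
```

CLOCK approximates LRU at much lower bookkeeping cost, which is why several of the candidate algorithms above (ML-Clock, Clock-Pro) build on it.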

The topic of this thesis is the implementation of selected caching algorithms, for example: Shepherd[4], ML-Clock[5], Clock-Pro[6], Learning Page Replacement[7], or Fuzzy Page Replacement Algorithm[8]. The implemented algorithms should be compared to the existing CLOCK algorithm under different workloads. The implementation will be done in the Rust programming language and integrated into the Haura storage stack. Haura already allows caches to be exchanged, which lets the thesis focus on cache selection and comparable benchmarks.
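An exchangeable cache typically means that every policy implements a common interface. The sketch below is hypothetical and does not reproduce Haura's actual trait; it only illustrates how a pluggable policy abstraction might look, using a trivial FIFO policy as a stand-in that a CLOCK or ARC implementation could replace for benchmarking.

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

// Hypothetical pluggable-cache interface; Haura's real trait differs.
trait Cache<K, V> {
    fn get(&mut self, key: &K) -> Option<&V>;
    fn put(&mut self, key: K, value: V);
}

// A trivial FIFO policy implementing the interface. Any replacement
// policy implementing `Cache` can be swapped in without touching callers.
struct FifoCache<K, V> {
    capacity: usize,
    order: VecDeque<K>, // insertion order, oldest at the front
    map: HashMap<K, V>,
}

impl<K: Hash + Eq + Clone, V> FifoCache<K, V> {
    fn new(capacity: usize) -> Self {
        FifoCache { capacity, order: VecDeque::new(), map: HashMap::new() }
    }
}

impl<K: Hash + Eq + Clone, V> Cache<K, V> for FifoCache<K, V> {
    fn get(&mut self, key: &K) -> Option<&V> {
        self.map.get(key)
    }

    fn put(&mut self, key: K, value: V) {
        // Evict the oldest entry when inserting a new key into a full cache.
        if self.map.len() == self.capacity && !self.map.contains_key(&key) {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        if !self.map.contains_key(&key) {
            self.order.push_back(key.clone());
        }
        self.map.insert(key, value);
    }
}

fn main() {
    let mut cache = FifoCache::new(2);
    cache.put("a", 1);
    cache.put("b", 2);
    cache.put("c", 3); // evicts "a", the oldest entry
    assert!(cache.get(&"a").is_none());
    assert_eq!(cache.get(&"c"), Some(&3));
}
```

With such an abstraction, benchmarks can iterate over trait objects or generic parameters, keeping the measurement harness identical across policies.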


Contact: Johannes Wünsche

Last Modification: 16.02.2023 - Contact Person: Webmaster