OS Caching Reviews

Class Summary: Abhishek Chandra and Michael Bradshaw

Cooperative Caching

Cooperative caching is a technique used on LANs to take advantage of the fact that data can often be transferred faster over the network than fetched from a local disk. In these circumstances we can imagine the total cache of the LAN as one unified cache. There are two main techniques that distinguish algorithms for managing this global cache. One is centralized algorithms, in which the location of all cached data is tracked by a central server; on a cache miss, clients consult this server for the possible locations of a block. The other is decentralized algorithms. These are usually less accurate than their centralized cousins, but because they place no direct burden on a central server they tend to scale better. Questions concerning consistency have been largely ignored here; however, most forms of cooperative caching use a write-through policy and lock the file for writers.
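
As a rough illustration of the difference, consider the read path on a local miss under each approach. This is a minimal sketch in Python, not taken from any of the papers below; the manager, peer, and read_from_disk interfaces are invented for illustration.

    def read_from_disk(block_id):
        # Stub standing in for a slow server/disk read.
        return ("disk", block_id)

    class CentralizedClient:
        """On a local miss, ask a central manager where the block lives."""
        def __init__(self, manager, local_cache):
            self.manager = manager
            self.local_cache = local_cache

        def read(self, block_id):
            if block_id in self.local_cache:
                return self.local_cache[block_id]     # local hit
            peer = self.manager.locate(block_id)      # manager holds exact global state
            if peer is not None:
                return peer.fetch(block_id)           # remote memory beats local disk
            return read_from_disk(block_id)           # global miss

    class DecentralizedClient:
        """On a local miss, consult local (possibly stale) hints instead."""
        def __init__(self, hints, local_cache):
            self.hints = hints                        # block_id -> probable peer
            self.local_cache = local_cache

        def read(self, block_id):
            if block_id in self.local_cache:
                return self.local_cache[block_id]
            peer = self.hints.get(block_id)           # no manager round trip
            if peer is not None and peer.has(block_id):
                return peer.fetch(block_id)           # hint was right
            return read_from_disk(block_id)           # hint was wrong or absent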

While the algorithms here use very specialized techniques, we can find 'close cousins' in other areas of computer science. Cooperative caching is similar to victim caches, which store blocks after the local CPU thinks it is done with them. It is also comparable to the idea of shared memory found in multiprocessors, where each processor tries to use its memory without hurting the other processors. Finally, we can draw a parallel between load balancing and the problem of deciding where to place blocks in the global cache.

Application Controlled Prefetching and Caching

Application-controlled prefetching and caching allows applications to do local prefetching and caching based on the information they have about their future reference streams. A global cache management policy then makes intelligent use of the shared cache. In doing so, it uses a hierarchical algorithm which segregates global behavior from local behavior, thus ensuring fairness and isolation. Further, it incorporates batching in disk scheduling to amortize costs over a number of prefetch requests, enabling faster prefetches.

The algorithm has some underlying similarities to other areas. For instance, the cache replacement and prefetching issues are very similar to those seen in virtual memory management, in particular, demand paging and swapping. The algorithm also uses prediction for future access patterns, which is an underlying theme in fields like architecture, where branch prediction and cache hit prediction are important issues. Web caching also tries to predict future access patterns of various URLs or web objects. Further, the global algorithm is similar to many scheduling algorithms which try to ensure fairness and isolation while trying to improve the overall system performance.


Reviews

Paper: Efficient Cooperative Caching Using Hints


Reviewer: Vijay Sundaram

This paper presents the Hint-based algorithm, a low-overhead decentralized algorithm (no managers for lookup, only hint maintenance) for cooperative caching with performance comparable to existing centralized algorithms (N-chance, GMS). Its key features are that clients do local hint lookup on a local cache miss, which decreases manager overhead at the cost of maintaining local hints, and that there is no need to maintain global state as there is in existing cooperative caching algorithms.

N-chance and GMS (Global Memory Service) require that the manager be contacted for duplicate avoidance. The replacement policies used by existing cooperative caching algorithms either pick a random target client (N-chance) or require that the manager be contacted (GMS), whereas the Hint-based algorithm uses a best-guess LRU approach. This reduces manager overhead, is not completely random, and does not require maintaining global state. Any deviations from the globally LRU block are masked by implementing a discard cache on the server: in existing algorithms the server cache is a traditional cache, whereas in the Hint-based algorithm the server cache is used as the discard cache. Blocks in the discard cache are replaced in global LRU order. Thus the discard cache serves as a buffer to hold potential replacement mistakes.
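
To make the best-guess mechanism concrete, here is a hedged sketch (assumed structures, not the paper's code) of the two pieces described above: each client's oldest-block list, and the server-side discard cache that absorbs replacement mistakes.

    class BestGuessClient:
        """Each client tracks what it believes is the age of the oldest
        block on every peer, learned from piggybacked exchanges."""
        def __init__(self):
            self.oldest_block_list = {}   # peer -> believed age of its oldest block

        def choose_target(self):
            # Best-guess LRU: forward an evicted block to the peer believed
            # to hold the globally oldest block. The belief may be stale,
            # which is exactly what the discard cache masks.
            return min(self.oldest_block_list, key=self.oldest_block_list.get)

    class DiscardCache:
        """Server-side buffer holding blocks ejected from the cooperative
        cache; entries are replaced in global LRU order, so a mistakenly
        ejected block can still be recovered without a disk access."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.blocks = {}              # block_id -> age (lower = older)

        def insert(self, block_id, age):
            self.blocks[block_id] = age
            while len(self.blocks) > self.capacity:
                oldest = min(self.blocks, key=self.blocks.get)
                del self.blocks[oldest]   # evict in global LRU order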

One drawback of the file-based consistency used in the Hint-based algorithm is that it does not handle concurrent write-sharing of a file by multiple clients as efficiently as the block-based consistency used in N-chance. Also, as the paper mentions, if several clients share a working set of blocks larger than the cooperative cache, the algorithm will not be effective, since there would be excessive forwarding of requests.

The key contribution of the paper is removing the centralized control of the cooperative cache, thus reducing manager overhead, and relying on local hints rather than facts, thus eliminating the need to maintain global state. Overall I would rate the paper as well written and easy to understand. The key contributions and drawbacks of the scheme proposed in the paper have been outlined above.


Reviewer: M. S. Raunak

Cooperative caching is a technique in which clients can access some other client's cache. Thus, if a local cache miss occurs, a client can fetch the block from some other client's local cache. This reduces server load and improves performance. Finding out which client has a block in its cache requires some sort of coordination; usually file systems use a coordinator (manager) to maintain the global information. Although centralized control has the advantage of accurate global information, it becomes a bottleneck and a potential point of failure. This paper proposes a hint-based technique that distributes the manager's responsibility to all clients, reducing communication overhead to the manager at the expense of occasionally dealing with inaccurate data. It argues that with judicious use of hints, the potential performance loss due to inaccurate data is outweighed by the improvement gained from the reduced overhead of communicating with the manager.

In the hint-based algorithm, every client keeps hints that approximate the global state of the system. Since hints do not need to be consistent throughout the system, there is no need for centralized coordination. A predetermined master copy of a block gets forwarded when the block is ejected from a client's local cache. This is done to reduce the overhead of deciding which copy of a block should be forwarded; however, it may lead to unnecessary forwarding in some cases. Since there is no centralized manager, cache block replacements are done using best guesses by clients. The guesses are made using information that clients gather by exchanging data among themselves. The paper proposes a file-based consistency mechanism for this algorithm.
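
A small sketch of the master-copy rule described above may help; the Block and Node types and the hints dictionary are illustrative assumptions, not the paper's data structures.

    from dataclasses import dataclass, field

    @dataclass
    class Block:
        id: int
        is_master: bool   # the predetermined "master" (first cached) copy

    @dataclass
    class Node:
        name: str
        cache: dict = field(default_factory=dict)
        hints: dict = field(default_factory=dict)   # block id -> probable holder

    def on_eviction(evictor: Node, block: Block, target: Node):
        """Forward only the master copy; quietly drop every other copy."""
        if block.is_master:
            target.cache[block.id] = block
            # Both ends record the master copy's new location as a hint,
            # so later lookups can go straight to the right peer.
            evictor.hints[block.id] = target.name
            target.hints[block.id] = target.name
        # Non-master copies are simply discarded: since only master copies
        # enter the cooperative cache, no manager needs to be consulted
        # for duplicate avoidance.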

The simulation results show that the hint-based algorithm's block access times are as good as those with central coordination (N-chance and GMS), while the manager load, block lookup traffic, and replacement traffic are reduced by considerable factors.

Although the hint-based algorithm reduces manager load, it does not eliminate the role of the manager. Moreover, the algorithm introduces client-to-client communication for sharing hints. It is not clear from the simulations how this new communication overhead affects performance.


Reviewer: Akash Jain

The paper presents a low-overhead decentralized algorithm for cooperative caching. It shows that a cooperative caching system relying on local hints to manage the cooperative cache performs as well as a more tightly coordinated, fact-based system. The simulations described show that with the hint-based scheme, block access times are as good as those of the other algorithms, while manager load, block lookup traffic, and replacement traffic are all reduced.

The hint-based algorithm uses a file-based cache consistency mechanism in which clients acquire a token from the manager prior to accessing a file. Block lookup is done by the client itself using its own hints about the locations of blocks within the cooperative cache, thus avoiding the need to contact the manager on every local cache miss. The algorithm uses a forwarding mechanism in which only the master copy of a block is forwarded to the cooperative cache, while all other copies are discarded; thus it does not require communication between the clients and the manager. It uses the best-guess replacement policy, in which each client maintains an oldest block list containing what the client believes to be the oldest block on each client, along with its age. A block is forwarded to the client that has the oldest block in this list.

There are certain shortcomings in the novel methods described in the paper. First, file-based cache consistency does not handle concurrent write-sharing of a file by multiple clients as efficiently as block-based consistency. Second, if the working set of blocks is larger than the cooperative cache, the locations of the master copies will change rapidly as blocks move in and out of client caches, making the probable master copy locations inaccurate and leading to excessive forwarding of requests. Third, the master copy forwarding algorithm may lead to unnecessary forwarding: a block that is deleted before it is down to its last copy should not have been forwarded to the cooperative cache, but probably will be under this algorithm.


Reviewer: Zhenlin Wang

The main contribution of this paper is a decentralized algorithm for cooperative caching based on inexact information, called hints. The algorithm still needs the central manager to maintain the consistency of hints, but it distinguishes itself by much lower manager loads: the client itself performs lookup, replacement, and forwarding using its hints. The concept of a master copy is introduced for hint maintenance; a master copy of a block is its first cached copy. The manager obtains the set of hints for a file from the last client to acquire a token for the file, and then passes those hints to the next client obtaining the file token. When a client forwards a master copy of a block to another client, both clients update their hints to record the new location of the master copy. Lookup thus relies first on the hints, which track the location of the master copy. In the decentralized algorithm, forwarding applies only to the master copy of a block, so the client need not check with the manager whether a block is a singlet. The best-guess replacement algorithm approximates global LRU, with each client maintaining an oldest block list. Erroneous replacements are compensated for by the discard cache, which is similar to a victim buffer. The paper also compares the performance of the hint-based algorithm, N-chance, GMS, global LRU, and an optimal algorithm, and measures the manager loads of N-chance, GMS, and the hint-based algorithm. The hint-based algorithm has near-optimal performance and much lower manager load.
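
The token-based hint hand-off is the one place the manager still appears, so it is worth sketching. The interfaces below are my own invention, assuming hints map each block to its probable master-copy holder.

    class Manager:
        """Grants file tokens for consistency and, with each token, passes
        along the previous holder's hints for that file."""
        def __init__(self):
            self.last_hints = {}                 # file_id -> hints at last release

        def acquire_token(self, client, file_id):
            client.install_hints(file_id, self.last_hints.get(file_id, {}))

        def release_token(self, client, file_id):
            # Remember this client's hints for the next opener of the file.
            self.last_hints[file_id] = client.export_hints(file_id)

    class Client:
        def __init__(self):
            self.hints = {}                      # file_id -> {block: holder}

        def install_hints(self, file_id, hints):
            self.hints[file_id] = dict(hints)    # start from the old holder's view

        def export_hints(self, file_id):
            return dict(self.hints.get(file_id, {}))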

An interesting research motivation of this paper is that its basic ideas, such as using hints for block lookup and the discard cache, come from research in different areas. For example, the discard cache is a level of file cache here, but the idea comes from the victim buffer, which is used for the high-level memory cache. Thinking about applying caching techniques from different systems to the high-level memory cache is my main objective in this class.


Reviewer: Sivakumar Murugesan

This paper deals with cooperative caches that are distributed across the client sites and use inexact information, or heuristics called hints, to locate a missing cache block. The clients cooperate among themselves by exchanging hints so as to improve performance. Each inaccurate hint adds load on the server, so performance depends upon the accuracy of the hints; the server memory acts as a discard cache that offsets the impact of incorrect hints. The paper introduces similar algorithms like N-chance and GMS, whose performance is compared with the hint-based algorithm. The main advantage of the hint-based algorithm is the reduction in load on the manager. Hint maintenance and the lookup mechanism form the basis of the hint-based algorithm. The drawback is that if a working set of blocks is larger than the cooperative cache, the algorithm results in excessive forwarding of requests. Moreover, since it supports file-based consistency as opposed to block-based, concurrent writes on a file are prohibited.

The trade-off with this method is between forwarding costs among the clients and load on the manager. There is a chance of overloading a particular client if the other clients believe it has the oldest block. To compare the performance of this scheme with the other schemes, average block access time and the overhead required to implement cooperative caching are taken as metrics. Simulation results are presented for the N-chance, GMS, hint-based, global LRU, and optimal algorithms. As far as block read access is concerned, the hint-based algorithm performs almost the same as the other algorithms but places a much lower load on the manager. To measure sensitivity, the same experiment is repeated for different client cache sizes and different percentages of clients using cooperative caching; the performance of the hint-based algorithm remains fairly constant. My overall view is that this paper is well written and easy to understand. Background information is provided in addition to the hint-based algorithm, which is useful in the comparison section.


Paper: Implementation and Performance of Integrated Application-Controlled File Caching, Prefetching and Disk Scheduling

Reviewer: Vijay Sundaram

This paper presents the design, implementation, and performance of ACFS (Application-Controlled File System), a file system that integrates application-controlled caching, prefetching, and disk scheduling. For the single-process case, prefetching is integrated with caching using the controlled-aggressive policy for prefetching, given the sequence of future block references. The approach presented is based on the assumption that the list of predicted accesses is more accurate about the application's near-term behavior, while the file-caching policy is more accurate about the long-term behavior.

A key contribution of the paper is the idea of combining disk scheduling with batch prefetching. The heuristic used is limited batch scheduling: every time the disk becomes idle, the prefetcher tries to issue a batch of prefetch requests instead of just one request. There is a limit on the batch size, and the disk driver sorts the requests issued to it by increasing logical block number. By reordering requests, disk scheduling may sacrifice some of the overlap between I/O and CPU computation.
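
A minimal sketch of limited batch scheduling, under the assumption of a simple FIFO prefetch queue of logical block numbers; the batch limit of 16 is an arbitrary illustrative value, not the paper's.

    BATCH_LIMIT = 16   # illustrative cap on the batch size

    def on_disk_idle(prefetch_queue, disk):
        """When the disk goes idle, issue up to BATCH_LIMIT pending
        prefetches at once rather than one at a time."""
        batch = prefetch_queue[:BATCH_LIMIT]
        del prefetch_queue[:BATCH_LIMIT]
        # Sorting by logical block number lets the driver service the
        # batch with far less seek overhead than FIFO order would.
        for block_no in sorted(batch):
            disk.issue_read(block_no)

For instance, a pending batch of blocks [9, 2, 5, 7] is issued as [2, 5, 7, 9], turning four scattered seeks into one mostly sequential sweep.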

Another contribution of this paper, for the multi-process case, is two-level cache management. The kernel uses a "global allocation policy" to decide how to allocate cache blocks to processes, and each user-level process has a "local management policy" for deciding how to use its blocks. The kernel uses LRU-SP (LRU with Swapping and Placeholders) as its global allocation policy. A drawback of this strategy is that, although the future file references of each process may be known, the interaction between file caching, prefetching, and CPU scheduling is complex, so it cannot be predicted how the processes' reference streams will be interleaved.
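
The swapping-and-placeholders idea is compact enough to sketch. This is a loose reading of LRU-SP with invented interfaces (choose_victim in particular); it is not the kernel's actual bookkeeping.

    class LRUSP:
        """Global allocator: keeps a global LRU list of (process, block)
        pairs; a process may override the default victim chosen for it."""
        def __init__(self):
            self.global_lru = []      # (process, block), least recent first
            self.placeholders = {}    # overridden block -> owning process

        def evict_one(self):
            proc, default_victim = self.global_lru[0]     # global LRU choice
            chosen = proc.choose_victim(default_victim)   # app may override
            if chosen != default_victim:
                # Swap: the app-chosen block takes the default victim's LRU
                # slot, so the process is charged as if the default had gone.
                i = next(j for j, (p, b) in enumerate(self.global_lru)
                         if p is proc and b == chosen)
                self.global_lru[0], self.global_lru[i] = (
                    self.global_lru[i], self.global_lru[0])
                # Placeholder: if default_victim is re-referenced before
                # `chosen`, the application's choice was worse than LRU.
                self.placeholders[default_victim] = proc
            _, victim = self.global_lru.pop(0)
            return victim

    class Process:
        """Stub application that simply accepts the kernel's suggestion."""
        def choose_victim(self, suggested):
            return suggested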

The results of the implementation carried out in the paper show improved performance both for individual application running times and for multi-process workloads.

Overall the paper was well written and easy to follow. But, as the concluding section points out, there are a number of questions that still need to be looked into, for instance the case of non-uniform fetch times when fetching from a real disk. Also unaddressed was the effect of disk scheduling on process scheduling. The key contributions of the paper have been discussed above.


Reviewer: M. S. Raunak

This paper presents an augmented file system that incorporates application-controlled file caching, prefetching, and disk scheduling. It provides the design, implementation, and some performance measures of this new file system. The motivation is to improve file system performance by reducing applications' running times.

Since processor speed is much higher than disk read speed, file systems prefetch and cache file blocks to improve performance. However, as the cache size is limited, prefetching a block is tied to the decision of replacing an existing block. Usually file systems use a fixed replacement policy like LRU to handle this. The paper argues that, because both prefetching and caching rely on knowledge of future access patterns, control should be given to the application that actually uses the prefetched or replaced blocks, in order to reduce the application's running time. With this idea, the paper describes a new file system named the Application-Controlled File System (ACFS).

ACFS integrates file caching and prefetching with an algorithm called "controlled-aggressive". The paper mentions previous studies that have found this algorithm to be near optimal under a simplified theoretical model. In this algorithm, the application decides which blocks to bring into the cache. The application also predicts which block in the cache it will need furthest in the future; this becomes a potential block to replace. Controlled-aggressive is near optimal when complete future information is available to a program, which is not the case most of the time. So the algorithm does not depend only on the program's predictions; it also matches the predictions against the previous access pattern to improve their accuracy. Moreover, ACFS sends batches of prefetch requests to the disk so that the disk can schedule them in logical order, reducing disk overhead. Thus ACFS integrates prefetching, caching, and disk scheduling to get an overall performance boost.
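
As an illustration of the flavor of controlled-aggressive (simplified from the paper's actual rules, with an in-memory set standing in for the cache and no real disk I/O), one prefetch step might look like this:

    def next_use(block, future_refs):
        """Position of the block's next predicted reference
        (infinity if it is never referenced again)."""
        try:
            return future_refs.index(block)
        except ValueError:
            return float("inf")

    def maybe_prefetch(cache, future_refs):
        """One controlled-aggressive step: prefetch the next missing block
        only if the replacement it forces cannot hurt."""
        missing = next((b for b in future_refs if b not in cache), None)
        if missing is None:
            return None
        # Candidate victim: the cached block needed furthest in the future.
        victim = max(cache, key=lambda b: next_use(b, future_refs))
        # Aggressive, but controlled: prefetch only when the victim is
        # needed strictly later than the block being prefetched.
        if next_use(victim, future_refs) > next_use(missing, future_refs):
            cache.discard(victim)
            cache.add(missing)   # stands in for issuing the disk prefetch
            return missing
        return None

For example, with cache = {'a', 'b'} and predicted references ['a', 'c', 'b'], the step prefetches 'c' and evicts 'b', since 'b' is needed later than 'c'.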

In the case of multiple processes, ACFS uses LRU-SP to allocate cache space among multiple competing processes. The paper mentions previous studies in which LRU-SP has been found to be fair and robust in addition to performing well. With the two-level cache management, applications can use the integrated techniques without harming each other or degrading the performance of the whole system.

The strength of the paper lies in the fact that it identifies some basic rules for optimal prefetching and replacement and how to achieve them, and then integrates some simple techniques in an attempt to achieve optimal performance following these rules.

The results provided in the paper show that with ACFS, sequential execution time was reduced by an average of 26% and concurrent execution time was reduced by an average of 32%. The variance of these results, however, is too high. Moreover, as the performance study was done on specific synthetic workloads, it is unclear whether the techniques will perform equally well under real-life workloads.


Reviewer: Akash Jain

The paper presents the design, implementation and performance of "ApplicationControlled File System", a file system that integrates application controlled file caching with prefetching. It uses a two-level cache management strategy to allow applications to exert control over file cache replacement and to prefetch file data, and retains for the kernel the allocation of cache blocks to concurrent processes. It applies disk scheduling to further improve the performance of prefetching. It allocates file cache space among multiple processes properly so that applications can use these techniques without unduly harming each other or degrading the performance of the whole system.

The paper uses a two-level strategy in which the kernel allocates cache blocks to processes and each process manages its own blocks. The kernel uses the LRU-SP policy to allocate blocks to processes, and each process uses application-controlled file caching and prefetching, integrated by the controlled-aggressive policy. Each process improves its access efficiency by submitting its prefetches in batches, which are scheduled by the disk driver to reduce disk access latency. The paper also describes a prototype implementation of ACFS, built by modifying the Ultrix 4.3 file system code. Another interesting feature is limited batch scheduling, in which every time the disk becomes idle, the prefetcher tries to issue a batch of prefetch requests instead of just one request.

There is one noticeable shortcoming: the way ACFS handles inaccuracies in an application's predictions may not be optimal, as handling them optimally would require modeling the probability of occurrence of the various inaccuracies.


Reviewer: Zhenlin Wang

The paper presents a file system, ACFS (Application-Controlled File System), which integrates application-controlled caching, prefetching, and disk scheduling. The paper first discusses the technical concerns of the integration and then goes into the implementation details of the file system. The LRU-SP allocation policy is proposed to integrate the global and local allocation policies. LRU-SP maintains a global LRU list; the global replacement decision compromises with the application's decision by swapping, and placeholders are used to detect when the application's choice is not as good as the default policy. To integrate disk scheduling and prefetching, the prefetcher tries to issue a batch of prefetch requests, thus providing room for disk scheduling. Controlled-aggressive prefetching, which can perform close to optimal with perfect knowledge of future accesses, is discussed in the context where future knowledge is incomplete or inaccurate.

In the implementation of the file system, the cache manager consists of three modules: a buffer cache module (BUF), an application control module (ACM), and a prefetch control module (PCM). BUF and ACM support application-controlled file caching by implementing LRU-SP; PCM implements the controlled-aggressive prefetching policy. Performance is compared for different file system configurations, both for a single process and for multiple processes. The integrated ACFS system shows its promise through significant performance improvements.

I am in fact interested in the idea of application-controlled caching, which is not presented in much detail in this paper. I would like to know the techniques for analyzing the access patterns of applications; I expect to use similar techniques for high-level memory cache control.


Prashant Shenoy
Last modified: Fri Apr 28 11:24:56 EST 2000