Cache Conscious Algorithms for All-Pairs Shortest-Paths Problem.

Date of Submission

December 2000

Date of Award

Winter 12-12-2001

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science

Department

Advance Computing and Microelectronics Unit (ACMU-Kolkata)

Supervisor

Mukhopadhyaya, Krisnendu (ACMU-Kolkata; ISI)

Abstract (Summary of the Work)

Role of Cache on Optimizing Algorithms
Traditional algorithm design and analysis has, for the most part, ignored caches. Algorithms are developed, analyzed, and optimized for the RAM computer model, in which a computer has a single uniformly accessible memory. Contemporary computers, however, have multiple levels of memory, and the memory access time varies significantly from one level to the next. Ignoring cache behavior can therefore lead to misleading conclusions about an algorithm's performance.
A contemporary SUN workstation has an L1 cache, an L2 cache, and a main memory (Figure 1). Typically, it takes 1 cycle to access data in the L1 cache. When the desired data is not in the L1 cache, we experience an L1 miss and the data is brought from the L2 cache to the L1 cache, which takes 6 to 10 cycles. If the desired data is not in the L2 cache either, we experience an L2 miss and the data is fetched from main memory into the L2 cache at a cost of about 50 cycles, and from there into the L1 cache.
Since the introduction of caches, main memory has continued to grow slower relative to processor cycle time. Cache miss penalties have grown to the point where good overall performance cannot be achieved without good cache performance. As a consequence of this change in computer architectures, algorithms that have been designed to minimize instruction count may not achieve the performance of algorithms that take into account both instruction count and cache performance. We can reduce run time by organizing our computations so as to minimize the number of L1 and L2 cache misses.
In our work we propose a blocked formulation of Floyd's dynamic programming algorithm to find the lengths of the shortest paths between all pairs of vertices in a directed graph. In the process we develop some simple analytic techniques that enable us to predict the memory performance of these algorithms in terms of cache misses.
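The abstract does not reproduce the blocked formulation itself, so the following is only a minimal sketch of one common way to tile Floyd's algorithm, not necessarily the dissertation's own scheme. The function names, the tile size parameter `b`, and the three-phase ordering are assumptions made for illustration. Python is used to show the order of computation only; any cache benefit would have to be measured in a compiled implementation.

```python
import math

INF = math.inf  # marks "no edge" in the distance matrix


def floyd(dist):
    """Plain Floyd's algorithm on an n x n matrix (list of lists)."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def _relax_tile(d, i0, j0, k0, b, n):
    """Relax tile (i0, j0) using intermediate vertices k0 .. k0+b-1."""
    for k in range(k0, min(k0 + b, n)):
        for i in range(i0, min(i0 + b, n)):
            dik = d[i][k]
            for j in range(j0, min(j0 + b, n)):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]


def blocked_floyd(dist, b):
    """Tiled variant (assumed scheme): each round touches b x b tiles."""
    n = len(dist)
    d = [row[:] for row in dist]
    starts = range(0, n, b)
    for k0 in starts:
        # Phase 1: the diagonal tile depends only on itself.
        _relax_tile(d, k0, k0, k0, b, n)
        # Phase 2: tiles sharing a row or column with the diagonal tile.
        for t in starts:
            if t != k0:
                _relax_tile(d, k0, t, k0, b, n)
                _relax_tile(d, t, k0, k0, b, n)
        # Phase 3: all remaining tiles, using the phase 2 results.
        for i0 in starts:
            if i0 == k0:
                continue
            for j0 in starts:
                if j0 != k0:
                    _relax_tile(d, i0, j0, k0, b, n)
    return d
```

In this sketch each call to `_relax_tile` works on at most three b x b tiles, so `b` would be chosen so that those tiles fit in the cache level being targeted. The sketch assumes the graph has no negative-weight cycles.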
Cache misses cannot be analyzed precisely due to a number of factors, such as variations in process scheduling and the operating system's virtual-to-physical page-mapping policy. In addition, the memory behaviour of an algorithm may be too complex to analyze completely. For these reasons the analyses we present are only approximate and must be validated empirically. In this paper, our experimental cache performance data is gathered using the trace-driven cache simulation tool Shade. Cache simulations have the benefit that they are easy to run and the results are accurate.
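To illustrate what trace-driven cache simulation means, the sketch below counts misses for a direct-mapped cache over a list of byte addresses. It is a toy model, not Shade and not its interface. The function names, the direct-mapped organisation, and the cache parameters are all assumptions chosen to keep the example short.

```python
def count_misses(trace, cache_bytes, line_bytes):
    """Count misses of a direct-mapped cache over a trace of byte addresses."""
    num_lines = cache_bytes // line_bytes
    resident = [None] * num_lines  # memory block currently held by each line
    misses = 0
    for addr in trace:
        block = addr // line_bytes   # which memory block the address is in
        index = block % num_lines    # the single cache line it can occupy
        if resident[index] != block:
            resident[index] = block  # miss: load the block, evicting the old one
            misses += 1
    return misses


def row_major_trace(n, word_bytes=8):
    """Addresses touched when an n x n row-major matrix is read row by row."""
    return [(i * n + j) * word_bytes for i in range(n) for j in range(n)]


def column_major_trace(n, word_bytes=8):
    """Addresses touched when the same matrix is read column by column."""
    return [(i * n + j) * word_bytes for j in range(n) for i in range(n)]
```

With a 64 x 64 matrix of 8-byte words and an assumed 1 KB cache with 32-byte lines, `count_misses(row_major_trace(64), 1024, 32)` gives 1024 misses, while the column-by-column trace misses on all 4096 accesses. The same computation can therefore show very different memory behaviour depending on the order in which it touches its data.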

Comments

ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843280

Control Number

ISI-DISS-2000-71

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/6243

This document is currently not available here.
