Global ETD Search

1	Formal Approaches to Globally Asynchronous and Locally Synchronous Design Xue, Bin 30 September 2011 (has links) The research reported in this dissertation is motivated by two trends in the system-on-chip (SoC) design industry. First, due to the incessant technology scaling, the interconnect delays are getting larger compared to gate delays, leading to multi-cycle delays in communication between functional blocks on the chip, which makes implementing a synchronous global clock difficult, and power consuming. As a result, globally asynchronous and locally synchronous (GALS) designs have been proposed for future SoCs. Second, due to time-to-market pressure, and productivity gain, intellectual property (IP) block reuse is a rising trend in SoC design industry. Predesigned IPs may already be optimized and verified for timing for certain clock frequency, and hence when used in an SoC, GALS offers a good solution that avoids reoptimizing or redesigning the existing IPs. A special case of GALS, known as Latency-Insensitive Protocol (LIP) lets designers adopt the well-understood and developed design flow of synchronous design while solving the multi-cycle latency at the interconnects. The communication fabrics for LIP are synchronous pipelines with hand shaking. However, handshake based protocol has complex control logics and the unnecessary handshake brings down the system's throughput. That is why scheduling based LIP was proposed to avoid the hand-shakes by pre-calculated clock gating sequences for each block. It is shown to have better throughput and easier to implement. Unfortunately, static scheduling only exists for bounded systems. Therefore, this type of design in literatures restrict their discussions to systems whose graphic representation has a single strongly connected component (SCC), which by the theory is bounded. This dissertation provides an optimization design flow for LIP synthesis with respect to back pressure, throughput and buffer sizes. This is based on extending the scheduled LIP with minimum modifications to render it general enough to be applicable to most systems, especially those with multiple SCCs. In order to guarantee the design correctness, a formal framework that can analyze concurrency and prevent fallacious behaviors such as overflow, deadlock etc., is required. Among many formal models of concurrency used previously in asynchronous system design, marked graphs, periodic clock calculus and polychrony are chosen for the purpose of modeling, analyzing and verifying in this work. Polychrony, originally developed for embedded software modeling and synthesis, is able to specify multi-rate interfaces. Then a synchronous composition can be analyzed to avoid incompatibly and combinational loops which causes incorrect GALS distribution. The marked graph model is a good candidate to represent the interconnection network which is quite suitable for modeling the communication and synchronizations in LIP. The periodic clock calculus is useful in analyzing clock gating sequences because periodic clock calculus easily captures data dependencies, throughput constraints as well as buffer sizes required for synchronization. These formal methods help establish a formally based design flow for creating a synchronous design and then transforming it into a GALS implementation either using LIP or in a more general GALS mechanisms. / Ph. D. verification polychronous formalism throughput back pressure marked graph latency insensitive system scheduling
2	Efficient Minimum Cycle Mean Algorithms And Their Applications Supriyo Maji (9158723) 23 July 2020 (has links) <p>Minimum cycle mean (MCM) is an important concept in directed graphs. From clock period optimization, timing analysis to layout optimization, minimum cycle mean algorithms have found widespread use in VLSI system design optimization. With transistor size scaling to 10nm and below, complexities and size of the systems have grown rapidly over the last decade. Scalability of the algorithms both in terms of their runtime and memory usage is therefore important. </p> <p><br></p> <p>Among the few classical MCM algorithms, the algorithm by Young, Tarjan, and Orlin (YTO), has been particularly popular. When implemented with a binary heap, the YTO algorithm has the best runtime performance although it has higher asymptotic time complexity than Karp's algorithm. However, as an efficient implementation of YTO relies on data redundancy, its memory usage is higher and could be a prohibitive factor in large size problems. On the other hand, a typical implementation of Karp's algorithm can also be memory hungry. An early termination technique from Hartmann and Orlin (HO) can be directly applied to Karp's algorithm to improve its runtime performance and memory usage. Although not as efficient as YTO in runtime, HO algorithm has much less memory usage than YTO. We propose several improvements to HO algorithm. The proposed algorithm has comparable runtime performance to YTO for circuit graphs and dense random graphs while being better than HO algorithm in memory usage. </p> <p><br></p> <p>Minimum balancing of a directed graph is an application of the minimum cycle mean algorithm. Minimum balance algorithms have been used to optimally distribute slack for mitigating process variation induced timing violation issues in clock network. In a conventional minimum balance algorithm, the principal subroutine is that of finding MCM in a graph. In particular, the minimum balance algorithm iteratively finds the minimum cycle mean and the corresponding minimum-mean cycle, and uses the mean and cycle to update the graph by changing edge weights and reducing the graph size. The iterations terminate when the updated graph is a single node. Studies have shown that the bottleneck of the iterative process is the graph update operation as previous approaches involved updating the entire graph. We propose an improvement to the minimum balance algorithm by performing fewer changes to the edge weights in each iteration, resulting in better efficiency.</p> <p><br></p> <p>We also apply the minimum cycle mean algorithm in latency insensitive system design. Timing violations can occur in high performance communication links in system-on-chips (SoCs) in the late stages of the physical design process. To address the issues, latency insensitive systems (LISs) employ pipelining in the communication channels through insertion of the relay stations. Although the functionality of a LIS is robust with respect to the communication latencies, such insertion can degrade system throughput performance. Earlier studies have shown that the proper sizing of buffer queues after relay station insertion could eliminate such performance loss. However, solving the problem of maximum performance buffer queue sizing requires use of mixed integer linear programming (MILP) of which runtime is not scalable. We formulate the problem as a parameterized graph optimization problem where for every communication channel there is a parameterized edge with buffer counts as the edge weight. We then use minimum cycle mean algorithm to determine from which edges buffers can be removed safely without creating negative cycles. This is done iteratively in the similar style as the minimum balance algorithm. Experimental results suggest that the proposed approach is scalable. Moreover, quality of the solution is observed to be as good as that of the MILP based approach.</p><p><br></p> Computer Engineering Circuits and Systems Microelectronics and Integrated Circuits Minimum cycle mean Minimum balancing Graph algorithms Clock network design Slack distribution Latency insensitive system Buffer queue sizing Mixed integer linear programming Integer minimum balancing Hartmann-Orlin's MCM algorithm Young-Tarjan-Orlin's MCM algorithm Memory usage Runtime performance Graph re-weighting Cycle collapsing System-on-Chip (SoC) Timing violation

Search results

Formal Approaches to Globally Asynchronous and Locally Synchronous Design

Efficient Minimum Cycle Mean Algorithms And Their Applications