1 |
Developing Multi-Criteria Performance Estimation Tools for Systems-on-Chip
Vander Biest, Alexis GJE 23 March 2009
The work presented in this thesis targets the analysis and implementation of multi-criteria performance prediction methods for Systems-on-Chip (SoC).
These new SoC architectures offer the opportunity to integrate complete heterogeneous
systems into a single chip and can be used to design battery-powered handhelds, security-critical
systems, consumer electronics devices, etc. However, this variety of applications
usually comes with many different performance objectives, such as power consumption,
yield, design cost, production cost, silicon area and many others. These performance requirements
are often very difficult to meet simultaneously, so SoC design usually relies on
making the right design choices and finding the best performance compromises.
In parallel with this architectural paradigm shift, new Very Deep Submicron (VDSM)
silicon processes increasingly affect performance and deeply modify the
way a VLSI system is designed, even at the first stages of a design flow.
In such a context, where many new technological and system-related variables come
into play, early exploration of the impact of design choices becomes crucial to estimate
the performance of the system under design and reduce its time-to-market.
In this context, this thesis presents:
- A study of state-of-the-art tools and methods used to estimate the performance of
VLSI systems and an original classification based on several features and concepts
that they use. Based on this comparison, we highlight their weaknesses and gaps to
identify new opportunities in performance prediction.
- The definition of new concepts to enable the automatic exploration of large design
spaces based on flexible performance criteria and degrees of freedom representing
design choices.
- The implementation of two new tools of our own:
- Nessie, a tool that enables the hierarchical representation of an application along with
its platform and automatically performs the mapping and the estimation of
their performance.
- Yeti, a C++ library enabling the definition and value estimation of closed-form
expressions and table-based relations. It provides the user with input
and model sensitivity analysis capability, simulation scripting, run-time building
and automatic plotting of the results. Additionally, Yeti can work in standalone mode to provide the user with an independent framework for model estimation and analysis.
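As a rough illustration of what input sensitivity analysis on a closed-form expression involves, the sketch below estimates the relative sensitivity of a model output to each of its inputs by central finite differences. It is written in Python for brevity and does not reproduce Yeti's C++ interface; the model `dynamic_power`, the helper `relative_sensitivity` and all parameter values are assumptions chosen only for the example.

```python
def dynamic_power(params):
    """Hypothetical closed-form model: P = alpha * C * V^2 * f."""
    return params["alpha"] * params["C"] * params["V"] ** 2 * params["f"]


def relative_sensitivity(model, params, name, rel_step=1e-4):
    """Central-difference estimate of (x / y) * dy/dx for the input `name`."""
    x = params[name]
    h = abs(x) * rel_step
    up = dict(params, **{name: x + h})      # copy with the input nudged up
    down = dict(params, **{name: x - h})    # copy with the input nudged down
    dy_dx = (model(up) - model(down)) / (2 * h)
    return dy_dx * x / model(params)


# Illustrative operating point (assumed values, not taken from the thesis).
nominal = {"alpha": 0.2, "C": 1e-9, "V": 1.1, "f": 500e6}
sensitivities = {
    name: relative_sensitivity(dynamic_power, nominal, name) for name in nominal
}
```

For this particular expression the relative sensitivity to the supply voltage `V` comes out close to 2 and the one to the frequency `f` close to 1, which matches the exponents in the formula.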
To demonstrate the use and interest of these tools, we provide in this thesis several
case studies whose results are discussed and compared with the literature.
Using Yeti, we successfully reproduced the results of a model estimating multi-core
computation power and extended them thanks to the representation flexibility of our tool.
We also built several models from the ground up to help the dimensioning of interconnect
links and clock frequency optimization.
Thanks to Nessie, we were able to reproduce the NoC power consumption results of
an H.264/AVC decoding application running on a multicore platform. These results were
then extended to the case of a 3D die-stacked architecture, and the performance benefits
are discussed.
We conclude by highlighting the advantages of our technique and discussing future opportunities
for performance prediction tools to explore.
|
2 |
Automatic Communication Synthesis with Hardware Sharing for Multi-Processor SoC Design
TAKADA, Hiroaki, TOMIYAMA, Hiroyuki, HONDA, Shinya, SHIBATA, Seiya, ANDO, Yuki 01 December 2010
No description available.
|
3 |
An omni-directional design tool for series hybrid electric vehicle design
Shidore, Neeraj Shripad 17 February 2005
System level parametric design of hybrid electric vehicles involves estimation of the power ratings as well as the values of certain parameters of the components, given the values of the performance parameters. The design is based on certain mathematical equations or design rules, which relate the component parameters and the performance parameters. The flow of the design algorithm is uni-directional and fixed, and cannot be altered.
This thesis proposes a new method for such parametric design, called omni-directional design, which does not have a fixed sequence like the conventional design, but can start with any parameters of the designer's choice. The designer is also able to specify the input parameters over a range, instead of as a point (one fixed value) input. Scenarios with a point input, but in which an output value can vary over a range for that point input, can also be studied.
|
4 |
Evaluating the Design and Performance of a Single-Chip Parallel Computer Using System-Level Models and Methodology
La Fratta, Patrick Anthony 12 May 2005
As single-chip systems are predicted to soon contain over a billion transistors, design methodologies are evolving dramatically to account for the fast evolution of technologies and product properties. Novel methodologies feature the exploration of design alternatives early in development, the support for IPs, and early error detection — all with a decreasing time-to-market. In order to accommodate these product complexities and development needs, the modeling levels at which designers are working have quickly changed, as development at higher levels of abstraction allows for faster simulations of system models and earlier estimates of system performance while considering design trade-offs.
Recent design advancements to exploit instruction-level parallelism on single-processor computer systems have become exceedingly complex, and modern applications are presenting an increasing potential to be partitioned and parallelized at the thread level. The new Single-Chip, Message-Passing (SCMP) parallel computer is a tightly coupled mesh of processing nodes that is designed to exploit thread-level parallelism as efficiently as possible. By minimizing the latency of communication among processors, memory access time, and the time for context switching, the system designer will undoubtedly observe an overall performance increase. This study presents in-depth evaluations and quantitative analyses of various design and performance aspects of SCMP through the development of abstract hardware models by following a formalized, well-defined methodology. The performance evaluations are taken through benchmark simulation while taking into account system-level communication and synchronization among nodes as well as node-level timing and interaction amongst node components. Through the exploration of alternatives and optimization of the components within the SCMP models, maximum system performance in the hardware implementation can be achieved. / Master of Science
|
5 |
Network Processor specific Multithreading tradeoffs
Boivie, Victor January 2005
Multithreading is a processor technique that can effectively hide long latencies caused by memory accesses, coprocessor operations and similar events. While this looks promising, there is an additional hardware cost that varies with, for example, the number of contexts to switch between and the switching technique used, and this cost might limit the possible gain of multithreading.
Network processors are, traditionally, multiprocessor systems that share many common resources, such as memories and coprocessors, so the potential gain of multithreading could be high for these applications. On the other hand, the additional hardware required is relatively large, since the rest of the processor is fairly small. Higher performance gains might therefore be achieved by using more processors rather than a multithreaded processor.
As a solution, a simulator was built in which a system can be modelled effectively and whose simulation results can give hints about the optimal solution for a system in the early design phase of a network processor system. A theoretical background to multithreading, network processors and related topics is also provided in the thesis.
|
6 |
Design of High-performance DMA Controller for Multi-core Platform
Wang, Tongtong January 2006
The DMA (direct memory access) controller is a special component in a DSP processor, used to offload data transfers from the CPU and improve data access efficiency in the microprocessor.
This paper describes the design and implementation of a DMA device for a microprocessor, developed using the C++ language and the SystemC libraries. The main topics covered in this report are the structure of a microprocessor with embedded DMA and some points of interest in the SystemC and TLM libraries that are useful for system-level design and implementation.
The paper starts with an introduction to the theory of DMA, the structure of the microprocessor and the multicore microprocessor. It then discusses the DMA specification. The next chapter covers the implementation of the DMA and the microsystem, followed by an explanation of the SystemC methods I used in the system design.
Finally, the simulation results of the whole system are presented and analyzed. The utility of the DMA is discussed and calculated.
With all these aspects covered, the paper should make it easy for readers to understand DMA theory, the microarchitecture, and the fundamentals of SystemC.
|
7 |
Energy Efficient and Predictable Design of Real-Time Embedded Systems
Andrei, Alexandru January 2007
This thesis addresses several issues related to the design and optimization of embedded systems. In particular, in the context of time-constrained embedded systems, the thesis investigates two problems: the minimization of the energy consumption and the implementation of predictable applications on multiprocessor system-on-chip platforms. Power consumption is one of the most limiting factors in electronic systems today. Two techniques that have been shown to reduce the power consumption effectively are dynamic voltage selection and adaptive body biasing. The reduction is achieved by dynamically adjusting the voltage and performance settings according to the application needs. Energy minimization is addressed using both offline and online optimization approaches. Offline, we optimally solve the combined supply voltage and body bias selection problem for multiprocessor systems with imposed time constraints, explicitly taking into account the transition overheads implied by changing voltage levels. The voltage selection technique is applied not only to processors, but also to buses with repeaters and fat wires. We investigate continuous voltage selection as well as its discrete counterpart. While the above-mentioned methods minimize the active energy, we propose an approach that combines voltage selection and processor shutdown in order to optimize the total energy. In order to take full advantage of slack that arises from variations in the execution time, it is important to recalculate the voltage and performance settings during run-time, i.e., online. However, voltage scaling is computationally expensive and thus, when performed at run-time, significantly reduces the possible energy savings. To overcome the online complexity, we propose a quasi-static voltage scaling scheme with a constant online time complexity of O(1). This makes it possible to increase the exploitable slack as well as to avoid the energy dissipated by online recalculation of the voltage settings.
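The general idea of moving the expensive computation offline and leaving only a constant-time lookup for run-time can be sketched as follows. This is a deliberately simplified illustration and not the scheme developed in the thesis: it handles a single task, uses a discrete set of frequencies as a stand-in for voltage and performance settings, and every name and number in it (`build_table`, `lookup`, the cycle count, the deadline, the bucket width) is an assumption made for the example.

```python
def build_table(wc_cycles, deadline_us, bucket_us, freqs_mhz):
    """Offline step: for each start-time bucket, pick the slowest frequency
    (in MHz, i.e. cycles per microsecond) that still meets the deadline in the
    worst case. Falls back to the fastest frequency when nothing is feasible."""
    freqs = sorted(freqs_mhz)
    table = []
    for bucket_start in range(0, deadline_us, bucket_us):
        latest_start = bucket_start + bucket_us  # pessimistic start within the bucket
        budget = deadline_us - latest_start      # time left for execution
        feasible = [f for f in freqs if budget > 0 and wc_cycles / f <= budget]
        table.append(feasible[0] if feasible else freqs[-1])
    return table


def lookup(table, bucket_us, start_us):
    """Online step: one integer division and one array access, so O(1)."""
    index = min(start_us // bucket_us, len(table) - 1)
    return table[index]


# Assumed example: 60,000 worst-case cycles, 1 ms deadline, 100 us buckets.
table = build_table(wc_cycles=60_000, deadline_us=1_000, bucket_us=100,
                    freqs_mhz=[100, 200, 400])
early = lookup(table, 100, start_us=0)    # lots of slack: slowest setting
late = lookup(table, 100, start_us=750)   # little slack: fastest setting
```

The earlier the task starts, the more slack the table can exploit with a slower setting, while the run-time cost stays independent of how the table was computed.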
Worst-case execution time (WCET) analysis and, in general, the predictability of real-time applications implemented on multiprocessor systems have been addressed only in very restrictive and particular contexts. One important aspect that makes the analysis difficult is the estimation of the system’s communication behavior. The traffic on the bus does not solely originate from data transfers due to data dependencies between tasks, but is also affected by memory transfers as a result of cache misses. As opposed to the analysis performed for a single-processor system, where the cache miss penalty is constant, in a multiprocessor system each cache miss has a variable penalty, depending on the bus contention. This affects the tasks’ WCET which, however, is needed in order to perform system scheduling. At the same time, the WCET depends on the system schedule due to the bus interference. In this context, we propose an approach to worst-case execution time analysis and system scheduling for real-time applications implemented on multiprocessor SoC architectures.
|
10 |
System-Level Techniques for Temperature-Aware Energy Optimization
Bao, Min January 2010
Energy consumption has become one of the main design constraints in today’s integrated circuits. Techniques for energy optimization, from circuit-level up to system-level, have been intensively researched. The advent of large-scale integration with deep sub-micron technologies has led to both high power densities and high chip working temperatures. At the same time, leakage power is becoming the dominant power consumption source of circuits, due to continuously lowered threshold voltages, as technology scales. In this context, temperature is an important parameter. One aspect, of particular interest for this thesis, is the strong inter-dependency between leakage and temperature. Apart from leakage power, temperature also has an important impact on circuit delay and, implicitly, on the frequency, mainly through its influence on carrier mobility and threshold voltage. For power-aware design techniques, temperature has become a major factor to be considered. In this thesis, we address the issue of system-level energy optimization for real-time embedded systems taking temperature aspects into consideration. We have investigated two problems in this thesis: (1) Energy optimization via temperature-aware dynamic voltage/frequency scaling (DVFS). (2) Energy optimization through temperature-aware idle time (or slack) distribution (ITD). For the above two problems, we have proposed off-line techniques where only static slack is considered. To further improve energy efficiency, we have also proposed online techniques, which make use of both static and dynamic slack. Experimental results have demonstrated that considerable improvement of the energy efficiency can be achieved by applying our temperature-aware optimization techniques. Another contribution of this thesis is an analytical temperature analysis approach which is both accurate and sufficiently fast to be used inside an energy optimization loop.
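The inter-dependency between leakage and temperature described above means that neither quantity can be computed on its own: leakage raises the temperature, and a higher temperature raises leakage. A minimal sketch of how such a loop can be resolved is a fixed-point iteration over a lumped thermal model. The exponential leakage model, the single thermal resistance and every constant below are assumptions for illustration only, not the analytical approach proposed in the thesis.

```python
import math


def leakage_power(temp_c, p_ref=1.0, t_ref=40.0, k=0.02):
    """Assumed leakage model: grows exponentially with temperature (watts)."""
    return p_ref * math.exp(k * (temp_c - t_ref))


def steady_state(p_dyn, t_amb=40.0, r_th=2.0, tol=1e-9, max_iter=100):
    """Iterate T = T_amb + R_th * (P_dyn + P_leak(T)) until it stops changing."""
    temp = t_amb
    for _ in range(max_iter):
        new_temp = t_amb + r_th * (p_dyn + leakage_power(temp))
        if abs(new_temp - temp) < tol:
            return new_temp, leakage_power(new_temp)
        temp = new_temp
    # With a strong enough feedback the iteration never settles.
    raise RuntimeError("no convergence: possible thermal runaway")


temp, p_leak = steady_state(p_dyn=10.0)
```

With these assumed constants the feedback is weak and the iteration settles within a few steps; stronger temperature dependence or a larger thermal resistance can prevent convergence, which corresponds to thermal runaway.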
|