  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Multi-Functional Interfaces for Accelerators

Piccolboni, Luca January 2022
Heterogeneous System-on-Chip (SoC) architectures combine general-purpose processors with many accelerators, which are application-specific computing engines. By having their hardware optimized to perform specific tasks, accelerators deliver massive speedups and energy savings compared to corresponding software executions on a processor. Heterogeneity and hardware specialization complicate accelerator design and integration, reducing regularity and reusability across platforms. The many system-level architectural aspects to consider make it hard to explore the design space and arrive at optimal solutions. Furthermore, integrating accelerators affects the programmability of the applications and the security of the entire SoC. In this dissertation, I present design methodologies and architectural contributions that use multi-functional interfaces to simplify many of the tasks that designers perform when designing and integrating accelerators in heterogeneous SoCs. The accelerator interfaces exploit latency-insensitive design to effectively explore the design space when multiple accelerators are integrated and to speed up the verification of accelerators. This improves their reusability across SoC platforms, while ensuring correctness when the accelerators are integrated with the various components of the SoC. In addition, the accelerator interfaces improve the integration with software by making it transparent and by establishing a strong layer of protection between accelerators and applications. The interfaces aim at securing the accelerators and the applications without requiring modifications to the accelerator implementations and without degrading their performance and energy efficiency.
12

Scheduling on-chip networks

Wu, Xiang 23 October 2009
Networks-on-Chip (NoC) have been proposed to meet many challenges of modern Systems-on-Chip (SoC) design and manufacturing. At the architectural level, a clean separation of computation and communication helps integration and verification. Networking abstraction of the communication infrastructure also promotes reuse and fast development. But the benefit is most visible when it comes to circuit and physical design. Networks can be made sparse and regular and thus facilitate placement and routing. It is also much easier to reach timing and power closure, as NoCs keep communication details from complicating the analysis. Last but not least, networks are flexible at the design stage and adaptable post-silicon. Many techniques for tackling process variation and interconnect failure can be built upon NoCs. However, when interconnects are time-multiplexed in a NoC, the network's performance will deteriorate if it is not scheduled properly. For a wide range of applications, the traffic on the network can be determined before run-time, and offline scheduling offers guaranteed performance and enables simple design. We propose a synthesis flow that takes the data flow graph of the application and a network topology as inputs and outputs an offline schedule that can be deployed directly to the NoC. We analyze the complexity of the combinatorial problems that arise in this context and provide efficient heuristics when polynomial-time algorithms are not available, assuming P ≠ NP. Results on LDPC decoding and FFT designs are compared with previous ones. We further apply our findings to parallel shared memories (PSM) and formalize the PSM architecture and its scheduling problem. An efficient heuristic is derived from our algorithm for unbuffered networks. Another application exemplifies how the NoC can be reprogrammed after silicon is back from the fab in order to avoid failed interconnects due to process variation.
A simple statistical model is studied and the simulation result is rather interesting. We find that high performance and yield are not always in conflict if we are able to change the network schedule based on silicon diagnosis.
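As an illustration of what an offline schedule for a time-multiplexed, unbuffered network can look like, the following is a minimal sketch and not the synthesis flow from the dissertation. It assumes a simplified model in which each message has a fixed route and uses the k-th link of that route exactly k slots after it is injected. The function name `greedy_schedule`, the first-fit strategy, and the example routes are all hypothetical.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def greedy_schedule(routes: Dict[str, List[Link]]) -> Dict[str, int]:
    """Assign each message a start slot so that no link carries two messages
    in the same time slot.

    Assumed unbuffered model: a message injected at slot s occupies the k-th
    link of its route at slot s + k. First-fit greedy heuristic, chosen only
    for illustration.
    """
    busy = set()  # (link, slot) pairs that are already reserved
    start: Dict[str, int] = {}
    # Place longer routes first; ties are broken by message name.
    for msg in sorted(routes, key=lambda m: (-len(routes[m]), m)):
        s = 0
        # Delay injection until every hop of the route is free.
        while any((link, s + k) in busy for k, link in enumerate(routes[msg])):
            s += 1
        for k, link in enumerate(routes[msg]):
            busy.add((link, s + k))
        start[msg] = s
    return start


# Hypothetical traffic: three messages on a three-node line A -> B -> C.
routes = {
    "m1": [("A", "B"), ("B", "C")],
    "m2": [("A", "B")],
    "m3": [("B", "C")],
}
schedule = greedy_schedule(routes)
```

Because the schedule is computed before run-time, the network needs no arbitration logic in this model: each message is simply injected at its assigned slot.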
13

A New Look at Retargetable Compilers

Burke, Patrick William 12 1900
Consumers demand new and innovative personal computing devices every 2 years when their cellular phone service contracts are renewed. Yet, a 2-year development cycle for the concurrent development of both hardware and software is nearly impossible. As more components and features are added to the devices, maintaining this 2-year cycle with current tools will become commensurately harder. This dissertation delves into the feasibility of simplifying the development of such systems by employing heterogeneous systems on a chip in conjunction with a retargetable compiler such as the hybrid computer retargetable compiler (Hy-C). An example of a simple architecture description of sufficient detail for use with a retargetable compiler like Hy-C is provided. As a software engineer with 30 years of experience, I have witnessed numerous system failures. A plethora of software development paradigms and tools have been employed to prevent software errors, but none have been completely successful. Much discussion centers on software development in the military contracting market, as that is my background. The dissertation reviews those tools, as well as some existing retargetable compilers, in an attempt to determine how those errors occurred and how a system like Hy-C could assist in reducing future software errors. In the end, a retargetable solution like Hy-C is shown to be simple, yet powerful enough to provide a very capable product in a very fast-growing market.
14

Optimal Network Topologies and Resource Mappings for Heterogeneous Networks-on-Chip

Chung, Haera 01 January 2013
Communication has become a bottleneck for modern microprocessors and multi-core chips because metal wires don't scale. The problem becomes worse as the number of components increases and chips become bigger. Traditional System-on-Chip (SoC) interconnect architectures are based on shared-bus communication, which can carry only one communication transaction at a time. This limits the communication bandwidth and scalability. Networks-on-Chip (NoC) were proposed as a promising solution for designing large and complex SoCs. The NoC paradigm provides better scalability and reusability for future SoCs. However, long-distance multi-hop communication through traditional metal wires suffers from both high latency and high power consumption. A radical solution to this challenge is to add long-range, low-power, and high-bandwidth single-hop links between distant cores. The use of optical or on-chip RF wireless links has been explored in this context. However, all previous work has focused on regular mesh-based metal wire fabrics that were expanded with only one or two additional link types for long-distance communication. In this thesis we address the following main research questions: (1) What library of different link types would represent an optimum in the design space? (2) How would these links be used to design an application-specific NoC architecture? (3) How would applications use the resulting NoC architecture efficiently? We hypothesize that networks with a higher degree of heterogeneity, i.e., three or more link types, will improve the network throughput and consume less energy compared to traditional NoC architectures.
In order to verify our hypothesis and to address the research challenges, we design and analyze optimal heterogeneous networks under different realistic traffic models by considering different cost and performance trade-offs in a comprehensive technology-agnostic simulation framework that uses metaheuristic optimization techniques. As opposed to related work, our heterogeneous links can be placed anywhere in the network, which allows us to explore the entire search space. The resulting application-specific networks are then analyzed by using complex network techniques, such as community detection and small-worldness, to understand how heterogeneous link types are used to improve the NoCs' performance and cost. Next, we use the application-specific networks as a target architecture for other applications. The goal is to evaluate the performance of our new NoCs for applications they have not been designed for by finding optimal resource allocations. Our results show that there is an optimal number of heterogeneous link types for each set of constraints and that networks with three or more heterogeneous link types provide significantly higher throughput along with lower energy consumption compared to both homogeneous link type and regular 2D mesh networks under three different traffic scenarios. Our evolved networks with three different technology-driven link types, namely metal wires, wireless, and optical links, provide 15% more throughput and fourteen times less energy consumption compared to a homogeneous link type network. When ten different abstract link types are used in the design, 12% more throughput and 52% less energy consumption are obtained compared to networks with three different technology-driven link types. This shows that heterogeneous NoC designs based on traditional metal wires, wireless, and optical links occupy a non-optimal spot in the entire design space.
Our results further show that heterogeneous NoCs scale up significantly better in terms of performance and cost compared to mesh networks. We uncovered that network communities evolve robustly and that heterogeneous link types are efficiently establishing inter- and intra-subnet connections depending on their link type properties. We also show that mapping an application on our application-specific NoC architecture provides on average 45% more throughput at 70% less energy consumption compared to regular 2D mesh networks. The NoCs are therefore not only good for the application they were designed for, but for a broad range of other applications as well.
15

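Conception des systèmes logiciel/matériel : du partitionnement logiciel/matériel au prototypage sur plateformes reconfigurables / Design of software/hardware systems: from software/hardware partitioning to prototyping on reconfigurable platforms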

Rousseau, F. 08 July 2005
This document retraces my research activities since my thesis, defended in July 1997. Some of the work presented is complete; other work is ongoing or still at an exploratory stage. From 1993 to 1999, I studied the various aspects of software/hardware partitioning in the design of integrated digital telecommunication systems. Since 1999, my work has focused on the design of multiprocessor systems-on-chip, and more specifically on the relationship between software and hardware. These systems are generally dedicated to one application or one class of applications, which makes it possible to optimize the architecture and the programs. My research has therefore focused on memory architecture, communication interfaces between components, and prototyping. For these three research directions, design methods and design-aid tools have been defined and developed. Ongoing work concerns the generalization of a method for designing hardware interface components from a specification expressed as required and provided services. Such a specification is already used to represent protocols in communication networks and to develop communication software layers. Extending it to hardware interface design would unify the languages, methods, and tools of the design environment. My future work is oriented along two directions: software/hardware integration, and the match between architecture and operating system. In both cases, the close relationship between the physical resources of the architecture and the software layers that access them should make it possible to improve performance significantly.
16

Physical synthesis for nanometer VLSI and emerging technologies

Cho, Minsik, 1976- 07 September 2012
The unabated silicon technology scaling makes design and manufacturing increasingly harder in nanometer VLSI. Emerging technologies on the horizon require strong design automation to handle the large complexity of future systems. This dissertation studies eight related research topics in design and manufacturing closure in nanometer VLSI, as well as design optimization for emerging technologies, from a physical synthesis perspective. In physical synthesis for design closure, we study three research topics, which are key challenges in nanometer VLSI designs: (a) We propose a highly efficient floorplanning algorithm to minimize substrate noise for mixed-signal system-on-a-chip designs. (b) We propose a clock tree synthesis algorithm to reduce clock skew under thermal variation. (c) We develop a global router, BoxRouter, to enhance routability, which is one of the classic but still critical challenges in modern VLSI. In physical synthesis for manufacturing closure, we propose the first systematic manufacturability-aware routing framework to address three key manufacturing challenges: (a) We develop a predictive chemical-mechanical polishing model to guide global routing in order to reduce surface topography variation. (b) We formulate a random defect minimization problem in track routing, and develop a highly efficient algorithm. (c) We propose a lithography enhancement technique during detailed routing based on statistical and macro-level post-OPC printability prediction. Regarding design optimization of emerging technologies, we focus on two topics, one in double patterning technology for future VLSI fabrication and the other in microfluidics for biochips: (a) We claim double patterning should be considered during physical synthesis, and propose an effective double-patterning-technology-aware detailed routing algorithm. (b) We propose a droplet routing algorithm to improve routability in digital microfluidic biochip design.
17

Architecture and physical design for advanced networks-on-chip

Jang, Woo Young 01 June 2011
The aggressive scaling of semiconductor technology following Moore's Law has delivered true system-on-chip (SoC) integration. Network-on-chip (NoC) has recently been introduced as an effective solution for scalable on-chip communication, since dedicated point-to-point (P2P) interconnection and shared bus architectures have become performance and power bottlenecks in SoCs. This dissertation studies three critical NoC challenges, namely latency, power, and compatibility with emerging technologies, at the architecture and physical design levels. Latency is a key issue in NoC since the performance of applications considerably depends on resource sharing policies employed in an on-chip network. NoCs have been mainly developed to improve network-level performance that captures the inherent performance characteristics of a network itself, but the network-level optimizations are not directly related to application- or system-level performance. In addition, memory latency on NoC critically affects the performance of applications or systems. We propose a synchronous dynamic random access memory (SDRAM) aware NoC design to optimize memory throughput, latency, and design complexity. Furthermore, it is extended to an application-aware NoC design to provide memory quality-of-service (QoS) for various applications. NoC provides great on-chip communication. However, it brings no true relief to the power budget when the on-chip network scales in terms of complexity/size and signal bandwidth. The combination of NoC and other techniques has the potential to reduce power. We study two power-saving research topics for NoC: (a) We propose a voltage-frequency island (VFI) aware NoC optimization framework with a better tradeoff between power efficiency and design complexity to minimize both computation and on-chip communication power.
(b) We formulate an application mapping problem as a mixed integer quadratic programming (MIQP) problem with the purpose of reducing power consumption in various hard networks and develop highly efficient algorithms for the MIQP. Regarding NoC compatibility with new technologies, we focus on three-dimensional (3D) die integration based on through-silicon vias (TSVs). Since an on-chip network design is subject to not only application constraints but also design/manufacturing constraints, a 3D NoC design is required for innovation in interconnection networks. We propose a chemical-mechanical polishing (CMP) aware application-specific 3D NoC design that minimizes TSV height variation, thus reducing bonding failure, while optimizing conventional NoC design objectives such as hop count, wirelength, power, and area.
18

A Logic Test Chip for Optimal Test and Diagnosis

Niewenhuis, Benjamin T. 01 May 2018
The benefits of the continued progress in integrated circuit manufacturing have been numerous, most notably in the explosion of computing power in devices ranging from cell phones to cars. Key to this success have been strategies to identify, manage, and mitigate yield loss. One such strategy is the use of test structures to identify sources of yield loss early in the development of a new manufacturing process. However, the aggressive scaling of feature dimensions, the integration of new materials, and the increase in structural complexity in modern technologies have challenged the capabilities of conventional test structures. To help address these challenges, a new logic test chip, called the Carnegie Mellon Logic Characterization Vehicle (CM-LCV), has been developed. The CM-LCV utilizes a two-dimensional array of functional unit blocks (FUBs) that each implement an innovative functionality. Properties including fault coverage, logical and physical design features, and fault distinguishability are shown to be composable within the FUB array; that is, they exist regardless of the size and composition of the FUB array. A synthesis flow that leverages this composability to adapt the FUB array to a wide range of test chip design requirements is presented. The connection between the innovative FUB functionality and orthogonal Latin squares is identified and used to analyze the universe of possible FUB functions. Two additional variants of the FUB array are also developed: heterogeneous FUB arrays utilize multiple FUB functions to improve the synthesis flow performance, while pipelined FUB arrays incorporate sequential circuit elements (e.g., flip-flops and latches) that are absent from the original combinational FUB array. In addition to the design of the CM-LCV, methods for testing it are presented. Techniques to create minimal sets of test patterns that exhaustively exercise each FUB within the FUB array are developed.
Additional constraints are described for the heterogeneous and pipelined FUB arrays that allow these techniques to be applied to both variant FUB arrays. Furthermore, a simple built-in self test (BIST) scheme is described and applied to a reference design, resulting in an 88.0% reduction in the number of test cycles required without loss in fault coverage. A hierarchical FUB array diagnosis methodology (HFAD) is also presented for the CM-LCV that leverages its unique properties to improve performance for multiple defects. Experiments demonstrate that this HFAD methodology is capable of perfect accuracy in 93.1% of simulations with two injected faults, an improvement on the state-of-the-art commercial diagnosis. Additionally, silicon fail data was collected from a CM-LCV manufactured using a 14nm process by an industry partner. A comparison of the diagnosis results for the 1,375 fail logs examined shows that the HFAD methodology discovers additional defects during multiple defect diagnosis that the commercial tool misses for 40 of the diagnosed fail logs. Examination of these cases shows that the additional defects found by the HFAD methodology can result in improved diagnosis confidence and more precise descriptions of the defect behavior(s). The contributions of this dissertation can thus be summarized as the description of the design, test, and diagnosis of a new logic test chip for use in yield learning during process development. This CM-LCV can be adapted to meet a wide range of test chip requirements, can be efficiently and rigorously tested, and exhibits properties that can be used to improve diagnosis outcomes. All of these claims are validated through both simulated experiments and silicon data.
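For readers unfamiliar with the orthogonal Latin squares mentioned in this abstract, the following sketch shows only the standard mathematical definition; it does not reproduce the FUB function or any construction from the dissertation. The helper names and the linear construction `(a*i + j) mod n` are illustrative assumptions, and that construction yields a Latin square only when `a` is coprime to `n`.

```python
from typing import List

Square = List[List[int]]


def latin_square(n: int, a: int) -> Square:
    """Build the n x n square with entry (a*i + j) mod n at row i, column j."""
    return [[(a * i + j) % n for j in range(n)] for i in range(n)]


def is_latin(sq: Square) -> bool:
    """Check that every row and every column contains each symbol exactly once."""
    n = len(sq)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in sq)
    cols_ok = all({sq[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(s1: Square, s2: Square) -> bool:
    """Two squares are orthogonal if superimposing them yields every ordered
    pair of symbols exactly once, i.e. n*n distinct pairs."""
    n = len(s1)
    pairs = {(s1[i][j], s2[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


# Order-5 example: multipliers 1 and 2 give a pair of orthogonal Latin squares.
sq1 = latin_square(5, 1)
sq2 = latin_square(5, 2)
```

A square is never orthogonal to itself for n > 1, since superimposing it on itself produces only the n pairs of the form (x, x).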
19

Método otimizado de arquitetura de coerência de cache baseado em sistemas embarcados multinúcleos. / Optimized method for cache coherence architecture based on multicore embedded systems.

Kofuji, Jussara Marândola 01 December 2011
A tese apresenta um método de arquitetura de coerência de cache especializado por sistemas embarcados. Uma das contribuições principais deste método é apresentar uma proposição de arquitetura CMP de memória compartilhada orientada a padrões de acesso a memória e de um protocolo de coerência híbrido. A contribuição principal é a especificação do novo componente de hardware, chamado tabela de padrões, o qual é validado por representação formal e pela implementação da estrutura da tabela de padrões. A partir desta tabela foi desenvolvido um modelo de transação de mensagens do protocolo híbrido que diferencia as mensagens em clássicas e especulativas. A contribuição final apresenta um modelo analítico do custo efetivo de desempenho do protocolo híbrido. / This thesis presents a cache coherence architecture method specialized for embedded systems. One of the main contributions of this method is a proposed shared-memory CMP architecture oriented to memory access patterns, together with a hybrid coherence protocol. The main contribution is the specification of a new hardware component, called the pattern table, which is validated by a formal representation and by an implementation of the pattern table structure. From this table, a message transaction model was developed for the hybrid protocol that distinguishes classical messages from speculative ones. The final contribution is an analytical model of the effective performance cost of the hybrid protocol.
