Global ETD Search

81	Performance and Energy Efficient Network-on-Chip Architectures Vangal, Sriram January 2007 (has links) The scaling of MOS transistors into the nanometer regime opens the possibility of creating large Network-on-Chip (NoC) architectures containing hundreds of integrated processing elements with on-chip communication. NoC architectures, with structured on-chip networks, are emerging as a scalable and modular solution to global communications within large systems-on-chip. NoCs mitigate the emerging wire-delay problem and address the need for substantial interconnect bandwidth by replacing today’s shared buses with packet-switched router networks. With on-chip communication consuming a significant portion of the chip power and area budgets, there is a compelling need for compact, low-power routers. While applications dictate the choice of the compute core, the advent of multimedia applications, such as three-dimensional (3D) graphics and signal processing, places stronger demands on self-contained, low-latency floating-point processors with increased throughput. This work demonstrates that a computational fabric built using optimized building blocks can provide high levels of performance in an energy-efficient manner. The thesis details an integrated 80-tile NoC architecture implemented in a 65-nm process technology. The prototype is designed to deliver over 1.0 TFLOPS of performance while dissipating less than 100 W. This thesis first presents a six-port four-lane 57 GB/s non-blocking router core based on wormhole switching. The router features double-pumped crossbar channels and destination-aware channel drivers that dynamically configure based on the current packet destination. This enables a 45% reduction in crossbar channel area, a 23% reduction in overall router area, up to a 3.8X reduction in peak channel power, and a 7.2% improvement in average channel power. 
In a 150-nm six-metal CMOS process, the 12.2 mm² router contains 1.9 million transistors and operates at 1 GHz at a 1.2 V supply. We next describe a new pipelined single-precision floating-point multiply-accumulator core (FPMAC) featuring a single-cycle accumulation loop using base-32 and internal carry-save arithmetic, with delayed addition techniques. A combination of algorithmic, logic and circuit techniques enables multiply-accumulate operations at speeds exceeding 3 GHz, with single-cycle throughput. This approach reduces the latency of dependent FPMAC instructions and enables a sustained multiply-add result (2 FLOPS) every cycle. The optimizations allow the costly normalization step to be removed from the critical accumulation loop and conditionally powered down using dynamic sleep transistors on long accumulate operations, saving active and leakage power. In a 90-nm seven-metal dual-VT CMOS process, the 2 mm² custom design contains 230 K transistors. Silicon achieves 6.2 GFLOPS of performance while dissipating 1.2 W at 3.1 GHz and a 1.3 V supply. We finally present the industry's first single-chip programmable teraFLOPS processor. The NoC architecture contains 80 tiles arranged as an 8×10 2D array of floating-point cores and packet-switched routers, both designed to operate at 4 GHz. Each tile has two pipelined single-precision FPMAC units which feature a single-cycle accumulation loop for high throughput. The five-port router combines 100 GB/s of raw bandwidth with a low fall-through latency of under 1 ns. The on-chip 2D mesh network provides a bisection bandwidth of 2 terabits/s. The 15-FO4 design employs mesochronous clocking, fine-grained clock gating, dynamic sleep transistors, and body-bias techniques. In a 65-nm eight-metal CMOS process, the 275 mm² custom design contains 100 M transistors. The fully functional first silicon achieves over 1.0 TFLOPS of performance on a range of benchmarks while dissipating 97 W at 4.27 GHz and a 1.07 V supply. 
It is clear that the realization of successful NoC designs requires well-balanced decisions at all levels: architecture, logic, circuit and physical design. Our results demonstrate that the NoC architecture successfully delivers on its promise of greater integration, high performance, good scalability and high energy efficiency. Chips MOS transistors Network-on-Chip (NoC) process technology FPMAC Electrical engineering Elektroteknik
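The throughput figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: the helper name `peak_gflops` is hypothetical, and it assumes the ideal case in which every FPMAC completes one multiply-add (2 FLOPS) on every cycle, so it gives an upper bound rather than a measured result.

```python
def peak_gflops(units, flops_per_cycle, freq_ghz):
    """Ideal peak throughput in GFLOPS: units x FLOPS per unit per cycle x clock (GHz)."""
    return units * flops_per_cycle * freq_ghz


# Standalone FPMAC: one unit, one multiply-add (2 FLOPS) per cycle at 3.1 GHz.
fpmac_peak = peak_gflops(units=1, flops_per_cycle=2, freq_ghz=3.1)

# 80-tile chip: two FPMACs per tile, 2 FLOPS each per cycle, at 4.27 GHz.
chip_peak = peak_gflops(units=80 * 2, flops_per_cycle=2, freq_ghz=4.27)

print(f"FPMAC peak: {fpmac_peak:.1f} GFLOPS")
print(f"Chip peak:  {chip_peak / 1000:.2f} TFLOPS")
```

Under these assumptions the single FPMAC bound matches the 6.2 GFLOPS reported for silicon, and the chip bound of roughly 1.37 TFLOPS sits above the "over 1.0 TFLOPS" measured on benchmarks, as one would expect of an ideal peak.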
82	Router Architecture for Junction Based Source Routing: Design and FPGA Prototyping Aslam, Muhammad Awais January 2012 (has links) The increase in the number of cores that can be integrated on a single chip has forced designers to use computer network concepts in the design of Systems on Chip (SoC). This idea led to the development of the Network on Chip (NoC) to deal with more cores on a single chip. A NoC has three main parts, namely routers, links and network interfaces through which cores are connected to the NoC. The router is one of the most important parts because cores communicate with other cores through routers. One of the important tasks for a NoC designer is to design a router with low latency. Router design depends on the routing protocol and routing algorithm used. Two kinds of routing algorithms are source routing and distributed routing. In source routing, complete route information is available in the head flit, while in distributed routing, routing decisions are taken inside every router on the path. Source routing has a speed advantage over distributed routing because the packet itself contains the routing information. But source routing incurs the overhead of storing complete path information in the header of each packet. To overcome this flaw, junction based source routing has been introduced. If the destination is far away from the source, the packet first goes to a junction and gets the new path information from the junction to the destination. Thus we need to store the path information only for a few hops in the packet header. This idea has been taken from the daily experience of train journeys. In this thesis we have developed the design of a router for junction based source routing. The main functions of a simple router include buffering, header modification and making route decisions. The router includes a table called the Path Table, which stores information about paths from the junction to various destinations. 
The JB router also picks up the new path information from the Path Table and modifies the header by adding the new path information. We have developed VHDL designs of two versions of the router for Junction Based Routing. The delay performance of the routers has been analysed through simulation. A simple prototype of the router has also been implemented in an Altera FPGA to find out the resource requirements of the new router designs. JBR router router architecture NoC router design router architecture for JBR
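The junction idea described in this abstract can be sketched in a few lines. This is a minimal software illustration, not the thesis's VHDL design: the grid coordinates, the port names, the `PATH_TABLE` contents and the `deliver` function are all hypothetical, chosen only to show a header that carries a short path and is refilled from a Path Table at a junction.

```python
from collections import deque

# Hypothetical port names mapped to (dx, dy) steps on a 2D grid.
MOVES = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}

# Hypothetical Path Table: junction position -> {destination -> list of output ports}.
PATH_TABLE = {(2, 0): {(4, 1): ["E", "E", "N"]}}


def deliver(src, dest, first_leg):
    """Follow the path stored in the header, refilling it at a junction when it runs out."""
    pos, path, visited = src, deque(first_leg), [src]
    while pos != dest:
        if not path:
            # Header path exhausted before arrival: this node must be a junction.
            # A KeyError here means the packet stopped at a non-junction node.
            path = deque(PATH_TABLE[pos][dest])
        dx, dy = MOVES[path.popleft()]
        pos = (pos[0] + dx, pos[1] + dy)
        visited.append(pos)
    return visited


# The source only stores the two hops to the junction at (2, 0);
# the junction supplies the remaining three hops to (4, 1).
route = deliver(src=(0, 0), dest=(4, 1), first_leg=["E", "E"])
print(route)
```

In this toy run the header never holds more than three hops, although the full route is five hops long, which is the header saving the abstract attributes to junction based source routing.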
83	Mobile Home Node: Improving Directory Cache Coherence Performance in NoCs via Exploitation of Producer-Consumer Relationships Soni, Tarun August 2010 (has links) The implementation of multiple processors on a single chip has been made possible by advancements in process technology. The benefits of having multiple cores on a single chip bring with them a new set of constraints for maintaining fast and consistent memory accesses. Cache coherence protocols are needed to maintain the consistency of shared memory across individual caches. Current cache coherence protocols are either snoop based, which is not scalable but provides fast access for a small number of cores, or directory based, which involves a directory that acts as the ordering point, providing scalability with relatively slower access. Our focus is on improving the memory access time of the scalable directory protocol. We have observed that most memory requests follow a pattern wherein one of the processors, which we will dub the Producer, repeatedly writes to a particular memory location. A subset of the remaining cores, which we will dub the Consumers, repeatedly read the data from that same memory location. In our implementation we utilize this relationship to provide direct cache-to-cache transfers and minimize the access time by avoiding the indirection through the directory. We move the directory temporarily to the Producer node so that a Consumer can request the cache line directly from the Producer. Our technique improves the memory access time by 13 percent and reduces network traffic by 30 percent over the standard directory coherence protocol, with very little area overhead. cache coherence protocol directory based coherence snoop based coherence NoC CMP Mobile DCP
84	Network-on-chip architectures for scalability and service guarantees Grot, Boris 13 July 2012 (has links) Rapidly increasing transistor densities have led to the emergence of richly-integrated substrates in the form of chip multiprocessors and systems-on-a-chip. These devices integrate a variety of discrete resources, such as processing cores and cache memories, on a single die with the degree of integration growing in accordance with Moore's law. In this dissertation, we address challenges of scalability and quality-of-service (QOS) in network architectures of highly-integrated chips. The proposed techniques address the principal sources of inefficiency in networks-on-chip (NOCs) in the form of performance, area, and energy overheads. We also present a comprehensive network architecture capable of interconnecting over a thousand discrete resources with high efficiency and strong guarantees. We first show that mesh networks, commonly employed in existing chips, fall significantly short of achieving their performance potential due to transient congestion effects that diminish network performance. Adaptive routing has the potential to improve performance through better load distribution. However, we find that existing approaches are myopic in that they only consider local congestion indicators and fail to take global network state into account. Our approach, called Regional Congestion Awareness (RCA), improves network visibility in adaptive routers via a light-weight mechanism for propagating and integrating congestion information. By leveraging both local and non-local congestion indicators, RCA improves network load balance and boosts throughput. Under a set of parallel workloads running on a 49-node substrate, RCA reduces on-chip network latency by 16%, on average, compared to a locally-adaptive router. Next, we target NOC latency and energy efficiency through a novel point-to-multipoint topology. 
Ring and mesh networks, favored in existing on-chip interconnects, often require packets to go through a number of intermediate routers between source and destination nodes, resulting in significant latency and energy overheads. Topologies that improve connectivity, such as fat tree and flattened butterfly, eliminate much of the router overhead, but require non-minimal channel lengths or large channel count, reducing energy-efficiency and/or performance as a result. We propose a new topology, called Multidrop Express Channels (MECS), that augments minimally-routed express channels with multi-drop capability. The resulting richly-connected NOC enjoys a low hop count with favorable delay and energy characteristics, while improving wire utilization over prior proposals. Applications such as virtualized servers-on-a-chip and real-time systems require chip-level quality-of-service (QOS) support to provide fairness, service differentiation, and guarantees. Existing network QOS approaches suffer from considerable performance and area overheads that limit their usefulness in a resource-limited on-die network. In this dissertation, we propose a new QOS scheme called Preemptive Virtual Clock (PVC). PVC uses a preemptive approach to provide hard guarantees and strong performance isolation while dramatically reducing queuing requirements that burden prior proposals. Finally, we introduce a comprehensive network architecture that overcomes the bottlenecks of earlier designs with respect to area, energy, and QOS in future highly-integrated chips. The proposed NOC uses a topology-centric QOS approach that restricts the extent of hardware QOS support to a fraction of the network without compromising guarantees. In doing so, network area and energy efficiency are significantly improved. Further improvements are derived through a novel flow-control mechanism, along with switch- and link-level optimizations. 
In concert, these techniques yield a network capable of interconnecting over a thousand terminals on a die while consuming 47% less area and 26% less power than a state-of-the-art QOS-enabled NOC. The mechanisms proposed in this dissertation are synergistic and enable efficient, high-performance interconnects for future chips integrating hundreds or thousands of on-die resources. They address deficiencies in routing, topologies, and flow control of existing architectures with respect to area, energy, and performance scalability. They also serve as a building block for cost-effective advanced services, such as QOS guarantees at the die level. / text Network-on-chip NOC Interconnection network Quality-of-service QOS Topology Routing Flow control
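The hop-count argument behind Multidrop Express Channels can be illustrated with a small comparison. The sketch below assumes that a MECS channel can drop a packet at any node along its row or column, so a dimension-ordered route needs at most one hop per dimension; that reading of "multi-drop" is an assumption, as are the function names and the 7×7 grid size, which is used only as an example.

```python
from itertools import product


def mesh_hops(src, dst):
    """Router-to-router hops in a 2D mesh with minimal routing (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def mecs_hops(src, dst):
    """Hops under the assumed MECS model: one express hop per dimension that differs."""
    return int(src[0] != dst[0]) + int(src[1] != dst[1])


def average_hops(k, hops):
    """Average hop count over all ordered pairs of distinct nodes in a k x k grid."""
    nodes = list(product(range(k), repeat=2))
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    return sum(hops(s, d) for s, d in pairs) / len(pairs)


k = 7  # example grid size, not a figure taken from the MECS evaluation
print(f"mesh average hops: {average_hops(k, mesh_hops):.2f}")
print(f"MECS average hops: {average_hops(k, mecs_hops):.2f}")
```

The model counts hops only; it says nothing about channel length, wire delay or energy, which the abstract treats as separate trade-offs.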
85	Knowledge, perception and utilisation of chiropractic by National Olympic Committees Labuschagne, Kerry January 2009 (has links) A dissertation submitted in partial compliance with the requirements for a Masters Degree in Technology, in the Department of Chiropractic at the Durban University of Technology, 2009. / Introduction: National Olympic Committees (NOCs) select medical personnel to support their athletes at the Olympic Games. To best support athletes, the knowledge, perception and utilisation of all medical professions is assumed to be high; however, the literature seems to indicate that this is not so. Objective: To determine the knowledge, perception and utilisation of Chiropractic by NOCs in order to develop a better relationship so that more athletes can benefit from Chiropractic care. Methods: A questionnaire was emailed to the 205 NOCs worldwide. Respective executive committee and medical commission members were asked to complete the questionnaires. Results: 76 NOCs responded (37%), returning 27 questionnaires. 30% of the respondents were high-ranking members. 93% were highly educated with a bachelor’s degree or higher and 33% had represented their country as an athlete. Both committees agreed on the importance of a post-graduate sports qualification and perceived the profession to be one of spinal care specialists. Overall knowledge of Chiropractic was poor. A trend was observed among the medical commissions in their choice of Medical Doctors or Physiotherapists over Chiropractors and other professionals. The executive committees, in contrast, seemed more open-minded in their choice of professionals. No association was found between the knowledge and perception of Chiropractic and the use of Chiropractic. Conclusion: There is confusion among NOCs regarding the role and scope of practice of Chiropractic. 
In order to achieve a greater level of acceptance and utilisation of Chiropractic in international sports medical teams the profession needs to clarify their role, better educate NOC members on the benefits of Chiropractic, and obtain sports specific post-graduate programmes that are recognised internationally. Chiropractic Knowledge National Olympic Committees NOC Olympics Perceptions Chiropractic--Evaluation Chiropractic--Utilization review Olympics--Societies, etc
86	Adaptive NoC for reconfigurable SoC Pratomo, Istas 08 November 2013 (has links) (PDF) Chips will be designed with billions of transistors and heterogeneous components integrated to provide the full functionality of current embedded-system applications. These applications also require a highly parallel and flexible communication architecture built on a regular interconnection network. The emerging solution that can fulfill this requirement is the Network-on-Chip (NoC). Designing an ideal NoC with high throughput, low latency, minimal resource usage, minimal power consumption and small area is very time consuming. Each application requires different levels of QoS, such as minimum levels of throughput, delay and jitter. In this thesis, we first evaluate the impact of NoC design parameters on the performance of an adaptive NoC. The objective is to evaluate how strongly changing a parameter value affects performance. The results show that carefully choosing and adjusting the network parameters can avoid performance degradation. This can serve as the control mechanism in an adaptive NoC to avoid degrading the NoC's QoS. The use of deep sub-micron technology in embedded systems, together with its process variability, causes Single Event Upsets (SEUs) and circuit "aging". SEUs and circuit aging are the major problems that cause packet transmission failures in a NoC. Implementing fault-tolerant routing techniques in NoC switching, instead of adding virtual channels, is the best solution for avoiding faults in a NoC. The communication performance of a NoC depends heavily on the routing algorithm. Adaptive routing algorithms, such as fault-tolerant ones, have been proposed for deadlock avoidance and load balancing. This thesis proposes a novel adaptive fault-tolerant routing algorithm for 2D meshes called Gradient and one for 3D meshes called Diagonal. 
Both algorithms consider sequences of alternative paths for packets when the main path fails. The proposed algorithms tolerate faults under worst-case traffic conditions in NoCs. The number of hops, the number of alternative paths, latency and throughput in a faulty network are determined and compared with other 2D mesh routing algorithms. Finally, we implemented the Gradient routing algorithm on an FPGA. All of this work was validated and characterized through simulation and implemented on an FPGA. The results compare the performance of the proposed method with existing related methods under several scenarios. [SPI:OTHER] Engineering Sciences/Other NoC Network on chip Adaptive routing Fault tolerant
87	Exploração do paralelismo em arquiteturas para processamento de imagens e vídeo / Parallelism exploration in architectures for video and image processing Soares, Andre Borin January 2007 (has links) Video and image processing is currently a very important research area because of its widespread use in a broad class of applications such as entertainment, surveillance, supervision and control, medicine and many others. The algorithms used for recognition, compression, decompression, filtering, restoration and enhancement of images frequently demand more computational power than conventional processors can offer, often requiring the development of dedicated architectures. This document presents the work carried out on design space exploration of video and image processing architectures using parallel processing. Several characteristics particular to this kind of architecture are pointed out. A novel technique is presented in which specialized Processing Elements (PEs) work cooperatively over a network-on-chip communication structure. Microelectronics Image processing NOC Image processing architectures Image processing hardware
88	Segmentace v kontextu rozvoje publika / Segmentation in the context of audience development Kocianová, Barbora January 2016 (has links) This diploma thesis, Segmentation in the context of audience development, examines approaches to deepening the relationship between audiences and cultural organizations through the use of segmentation models. These models provide a better understanding of audiences' needs and their reasons for participating in cultural activities. The theoretical background covers the historical perspective of audience development and its tools, the connection between marketing and audience development, the criteria and types of segmentation, and the cultural participation factors that influence audiences' behaviour and decision making. To explore the topic thoroughly, the British audience segmentation models created by the organisations Morris Hargreaves McIntyre, Arts Council England and The Audience Agency are analysed. The author's own segmentation model is presented along with recommendations for audience development tools suitable for each segment. This model is based mainly on interviews with representatives of the aforementioned British organizations and on data collected via questionnaires from audiences participating in Theatre Night 2014 in the Czech Republic.
89	Estudo sobre o impacto da hierarquia de memória em MPSoCs baseados em NoC / Study of the impact of the memory hierarchy in NoC-based MPSoCs Silva, Gustavo Girão Barreto da January 2009 (has links) In the past few years, embedded systems have become ever more complex in terms of both hardware and software. Lately, MPSoCs (Multi-Processor Systems-on-Chip) have been adopted in these systems for better energy and computational efficiency. With several processing elements in use, Networks-on-Chip (NoCs) emerge as better-performing solutions than buses. In these environments, where performance depends on the efficiency of the communication model, the memory hierarchy becomes a key element. Considering this scenario, this work investigates the impact of the memory hierarchy in NoC-based MPSoCs. In this context, a new physically centralized and shared memory organization with different address spaces, named nDMA, was developed. This work also compares the new memory organization with three well-known memory organizations: distributed memory, shared memory and distributed shared memory. The latter two adopt a directory-based cache coherence model implemented entirely in hardware. The memory models were implemented in the SIMPLE (SIMPLE Multiprocessor Platform Environment) virtual platform. Experimental results show a strong dependency on the communication workload generated by the applications. The distributed memory model performs better when the application communication workload is low. On the other hand, the new memory model (physically shared with different address spaces) performs better when the application communication workload is high. Experiments were also carried out to analyse the performance of the memory models when the communication latency on the network is high. Here the distributed memory model performs better when the application communication workload is high, and the nDMA model performs better otherwise. Finally, the performance of the memory models during task migration was evaluated. In this case, the shared memory and distributed shared memory models performed better because the application data does not need to be transferred from one point to another and because their code size is smaller than that of the other models. Microelectronics MPSoC NoC Embedded systems Multiprocessor system-on-chip Network-on-chip Cache coherence Task migration
90	Management kojení nedonošených novorozenců s aplikací klasifikačních systémů NANDA International, NIC, NOC / Management of breast-feeding premature infants with the application of classification systems NANDA International, NIC, NOC MELNIČÁKOVÁ, Bernadetta January 2013 (has links) The theoretical part of the paper focuses on the characteristics and terminology of premature babies, breastfeeding, the classification systems NANDA International, NIC and NOC, and management, which is an integral part of the nursing process. Both quantitative and qualitative research were used to obtain and process the data. The quantitative part used document content analysis and a quasi-experiment. The results of the quantitative research were processed in SPSS (Statistical Package for the Social Sciences), version 16.0, using nonparametric correlation. The qualitative part used a pen-and-paper interview with open questions. Before the research itself, the deputy for nursing care at the hospital Nemocnice České Budějovice, a.s. was approached. The research was carried out between February 2013 and April 2013. The first research sample consisted of nine mothers and eleven premature babies; two of the mothers had twins. The basic criterion for entering the research was the necessary hospitalization of both mother and child, which Nemocnice České Budějovice, a.s. offers. The major criterion was initiated lactation. The second research sample consisted of nurses working at Stanice intermediární péče II (IMP II - rooming) of Nemocnice České Budějovice, a.s. The first goal was to create files from the classification systems NANDA International, NIC and NOC for the development and support of breastfeeding of premature babies. The main goal of this work was to create comprehensive documentation, based on the classification systems NANDA International, NIC and NOC, for the growth and support of premature babies. 
The theoretical content of the paper was used to create the documentation and apply it in practice. The intention was not to create something new, but to apply well-known, verified and proven classification systems, which can make not only breastfeeding but also a nurse's daily routine more efficient. The proposed documentation addresses the breastfeeding issues of premature babies in order to help mothers and also to increase the professional prestige of nurses. The documentation was compiled to include all the information needed for a correct nursing process. A significant part of the documentation is a preliminary case history of mother and child, which provided valuable information. Further components were nursing diagnoses dealing with breastfeeding issues of premature babies and the interventions carried out during hospitalization. Each entry concerning premature babies had to be addressed individually, based on a complex evaluation of the overall condition. Knowledge from specialized literature on premature babies and the personal experience of doc. PhDr. Mária Bolendovičová, PhD. were used to create the documentation. The second aim of the study was to verify the selected specimens in clinical practice. The third aim was to monitor nurses at Stanice intermediární péče II (IMP II - rooming) and their attitude to the use of the classification systems NANDA International, NIC and NOC. Interviews with nurses showed that they know the NANDA International classification system better than NIC and NOC. The final documentation was tested from the perspective of the medical staff and for its usability in clinical practice. Nurses were found to regard the classification systems NANDA International, NIC and NOC as an appropriate tool, but only if more staff are available on the ward. The nurses' statements made clear that establishing a diagnosis is the responsibility of each nurse taking part in the nursing process.
