• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 9
  • 6
  • 1
  • 1
  • Tagged with
  • 17
  • 9
  • 9
  • 9
  • 8
  • 7
  • 7
  • 6
  • 5
  • 5
  • 5
  • 5
  • 4
  • 4
  • 4
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Hardware-software design methods for security and reliability of MPSoCs

Patel, Krutartha , Computer Science & Engineering, Faculty of Engineering, UNSW January 2009 (has links)
Security of a Multi-Processor System on Chip (MPSoC) is an emerging area of concern in embedded systems. MPSoC security is jeopardized by Code Injection attacks. Code Injection attacks, which are the most common types of software attacks, have plagued single processor systems. Design of MPSoCs must therefore incorporate security as one of the primary objectives. Code Injection attacks exploit vulnerabilities in \trusted" and legacy code. An architecture with a dedicated monitoring processor (MONITOR) is employed to simultaneously supervise the application processors on an MPSoC. The program code in the application processors is divided into basic blocks. The basic blocks in the application processors are statically instrumented with special instructions that allow communication with the MONITOR at runtime. The MONITOR verifies the execution of all the processors at runtime using control flow checks and either a timing or instruction count check. This thesis proposes a monitoring system called SOFTMON, a design methodology called SHIELD, a design flow called LOCS and an architectural framework called CUFFS for detecting Code Injection attacks. SOFTMON, a software monitoring system, uses a software algorithm in the MONITOR. SOFTMON incurs limited area overheads. However, the runtime performance overhead is quite high. SHIELD, an extension to the work in SOFTMON overcomes the limitation of high runtime overhead using a MONITOR that is predominantly hardware based. LOCS uses only one special instruction per basic block compared to two, as was the case in SOFTMON and SHIELD. Additionally, profile information is generated for all the basic blocks in all the application processors for the MPSoC designer to tune the design by increasing or decreasing the frequency of loop basic blocks. CUFFS detects attacks even without application processors communicating to the MONITOR. The SOFTMON, SHIELD and LOCS approaches can only detect attacks if the application processors communicate to the MONITOR. CUFFS relies on the exact number of instructions in basic blocks to determine an attack, rather than time-frame based measures used in SOFTMON, SHIELD and LOCS. The lowest runtime performance overhead was achieved by LOCS (worst case of 37.5%), while the SOFTMON monitoring system had the least amount of area overheads of about 25%. The CUFFS approach employed an active MONITOR and hence detected a greater range of attacks. The CUFFS framework also detects bit flip errors (reliability errors) in the control flow instructions of the application processors on an MPSoC. CUFFS can detect nearly 70% of all bit flip errors in the control flow instructions. Additionally, a modified CUFFS approach is proposed to ensure reliable inter-processor communication on an MPSoC. The modified CUFFS approach uses a hardware based checksum approach for reliable inter-processor communication and incurred a runtime performance overhead of up to 25% and negligible area overheads compared to CUFFS. Thus, the approaches proposed in this thesis equip an MPSoC designer with tools to embed security features during an MPSoC's design phase. Incorporating security measures at the processor design level provides security against software attacks in MPSoCs and incurs manageable runtime, area and code-size overheads.
2

Design methodologies for pipelined MPSoCs targeting multimedia applications

Javaid, Haris , Computer Science & Engineering, Faculty of Engineering, UNSW January 2009 (has links)
The semiconductor industry has seen a paradigm shift from Application Specific Integrated Circuits to Multiprocessor System on Chip systems over the last decade, primarily due to the miniaturization of the transistor. However, billion of transistors available on a single chip need to be used efficiently to provide more functionalities in portable devices, yet minimize power and chip area, which increases the design complexity of multiprocessor systems. Tighter time to market deadlines further pressurizes the designer, requiring a comprehensive automation of the design process of such complex multiprocessor systems. This thesis presents a design automation methodology for the design of Multiprocessor System on Chip (MPSoC) systems for multimedia applications. This thesis introduces a heterogeneous multiprocessor system where processing elements are connected in a pipelined fashion. A multimedia application is executed very efficiently on a pipelined system due to the stream oriented data flow nature of such applications. Application Specific Instruction set Processors (ASIPs) are used as the elementary processing elements in the multiprocessor system as they can be customized according to the application tasks assigned to them. The problem of selecting a processor configuration for each of the ASIPs in the pipelined system is formalized. We present three different techniques to select processor configurations by exploring the design space of an ASIP based pipelined system, and integrating them into a flexible and designer driven design flow for efficient exploration of large design spaces in order of 10^16 design points. The first two techniques are based on Integer Linear Programming (ILP), named Exact ILP formulation (EIF) and Reduced ILP formulation (RIF), while the third technique is based on a novel heuristic. We also developed a design space pruning algorithm that can enable the use of EIF and RIF to obtain optimal or near optimal design points from large design spaces. For four multimedia applications, we show that RIF and the heuristic can explore the design space and reveal the Pareto front in several hours, while EIF took several days to obtain the Pareto front. The quick availability of the Pareto front of a design space will help the designer to make early changes in the design. Furthermore, it is shown that, on average, the error incurred by RIF and the heuristic is within 1.25% and 2.25% of the optimal design points obtained via EIF for all the four multimedia applications. In the worst case, RIF introduced an error of 17.08% while the heuristic had an error of 11.39%.
3

HW-SW components for parallel embedded computing on Noc-based MPSoCs

Joven Murillo, Jaume 15 March 2010 (has links)
Recentment, en el camp del sistemes encastats, estem assistint al creixement de sistemes Multi-Processor System-on-Chip (MPSoC). El paradigma de Network-on-chip (NoC) s'ha proposat una solució viable, eficient, escalable, predictible i flexible per connectar components dins un xip, o inclús sistemes complets basats en busos dins al xip amb la finalitat de crear sistemes altament complexos. Així, el paradigma de computació encastada d'altres prestacions està arribant a través d'integrar hardware altament paral·lel amb llibreries software per obtenir una màxima integració a nivell de plataforma utilitzant de components prèviament dissenyats (IP cores), en la forma de arquitectures NoC-based MPSoCs. No obstant, quan el nombre de components augmenta hi ha diversos desafiaments i problemes a resoldre. El primer repte és el disseny d'una xarxa d'interconnexió que proporcioni qualitat de servei assegurant un cert ample de banda i latència entre cada bloc del sistema, amb el mínim area i consum possible. Ja que l'espai de disseny en arquitectures NoCs és enorme, s'han de desenvolupar entorns de simulació, i verificació per explorar validar i optimitzar múltiples NoC arquitectures. El segon objectiu, que és actualment un forat de recerca, és proveir models de programació paral·lela flexibles i eficients sobre les arquitectures NoC-based MPSoCs. Així, és obligatori l'ús de llibreries software lleugeres capaces d'explotar la capacitats del hardware present a la plataforma d'execució. Fent servir aquestes llibreries software permetrà els programadors reutilitzar i programar de manera fàcil aplicacions paral·leles dins un xip. Finalment, per obtenir un sistema eficient, un punt clau és el disseny de les interfícies HW-SW apropiades. Aquest fet és crucial in multi processadors heterogenis on els paradigmes de programació paral·lela and middleware han d'abstreure els recursos de comunicació durant l'especificació d'aplicacions software. El principal objectiu d'aquesta tesis és enriquir les emergents arquitectures NoC-based MPSoC explorant i fent contribucions de caire científic afrontant els nous reptes apareguts aquest últims anys. Aquesta tesis es focalitza en els següents temes: Descripció of un entorn experimental anomenat NoCMaker per realitzar exploració arquitectural de sistemes NoC-based MPSoC, permetent alhora una validació i prototipatge ràpid. Extensió de les interfícies de xarxa per controlar tràfic heterogeni de diferents estàndards (AMBA AHB, OCP-IP) amb la finalitat de reutilitzar i comunicar de manera transparent múltiple IP cores des del punt de vista de l'usuari. Proporcionar qualitat de servei en temps d'execució a traves de components hardware a la NoC, i de rutines middleware en software. Exploració de les interfícies HW-SW i la compartició de recursos quan una unitat de punt flotant es connecta com a coprocessador a un sistema NoC-based MPSoC. Migració de paradigmes de programació paral·lela, com memòria compartida i pas de missatges en arquitectures NoC-based MPSoCs. En aquesta tesis presentem el desenvolupament d'un model de programació paral·lela basat en pas de missatges (MPI), anomenat on-chip MPI. Això permet el disseny de programes paral·leles distribuïts a nivell de tasca o funció fent servir la programació paral·lela explicita amb els mètodes de sincronia entre els elements integrats en el xip. Proporcionant qualitat de servei en temps d'execució a sobre d'una llibreria OpenMP dissenyada per sistemes de memòria compartida amb la finalitat d'accelerar o balancejar aplicacions critiques i fils d'execució durant la seva execució. Tots els reptes explorats durant aquesta tesi doctoral estan formalitzats en una metodologia hardware-software centrada en la infraestructura de comunicació de la plataforma. Així, el resultat d'aquest treball d'investigació serà una plataforma cluster-on-chip per una computació paral·lela encastada d'altes prestacions, on els components hardware and software poden ser reutilitzats a diverses nivells d'abstracció. / Recently, on the on-chip and embedded domain, we are witnessing the growing of the Multi-Processor System-on-Chip (MPSoC) era. Network-on-chip (NoCs) have been proposed to be a viable, efficient, scalable, predictable and flexible solution to interconnect IP blocks on a chip, or full-featured bus-based systems in order to create highly complex systems. Thus, the paradigm to high-performance embedded computing is arriving through high hardware parallelism and concurrent software stacks to achieve maximum system platform composability and flexibility using pre-designed IP cores. These are the emerging NoC-based MPSoCs architectures. However, as the number of IP cores on a single chip increases exponentially, many new challenges arise. The first challenge is the design of a suitable hardware interconnection to provide adequate Quality of Service (QoS) ensuring certain bandwidth and latency bounds for inter-block communication, but at a minimal power and area costs. Due to the huge NoC design space, simulation and verification environments must be put in place to explore, validate and optimize many different NoC architectures. The second target, nowadays a hot topic, is to provide efficient and flexible parallel programming models upon new generation of highly parallel NoC-based MPSoCs. Thus, it is mandatory the use of lightweight SW libraries which are able to exploit hardware features present on the execution platform. Using these software stacks and their associated APIs according to a specific parallel programming model will let software application designers to reuse and program parallel applications effortlessly at higher levels of abstraction. Finally, to get an efficient overall system behaviour, a key research challenge is the design of suitable HW/SW interfaces. Specially, it is crucial in heterogeneous multiprocessor systems where parallel programming models and middleware functions must abstract the communication resources during high level specification of software applications. Thus, the main goal of this dissertation is to enrich the emerging NoC-based MPSoCs by exploring and adding engineering and scientific contribution to new challenges appeared in the last years. This dissertation focuses on all of the above points: by describing an experimental environment to design NoC-based systems, xENoC, and a NoC design space exploration tool named NoCMaker. This framework leads to a rapid prototyping and validation of NoC-based MPSoCs. by extending Network Interfaces (NIs) to handle heterogeneous traffic from different bus¬based standards (e.g. AMBA, OCP-IP) in order to reuse and communicate a great variety off-the-shelf IP cores and software stacks in a transparent way from the user point of view. by providing runtime QoS features (best effort and guaranteed services) through NoC-level hardware components and software middleware routines. by exploring HW/SW interfaces and resource sharing when a Floating Point Unit (FPU) co¬processor is interfaced on a NoC-based MPSoC. by porting parallel programming models, such as shared memory or message passing models on NoC-based MPSoCs. We present the implementation of an efficient lightweight parallel programming model based on Message Passing Interface (MPI), called on-chip Message Passing Interface (ocMPI). It enables the design of parallel distributed computing at task-level or function-level using explicit parallelism and synchronization methods between the cores integrated on the chip. by provide runtime application to packets QoS support on top of the OpenMP runtime library targeted for shared memory MPSoCs in order to boost or balance critical applications or threads during its execution. The key challenges explored in this dissertation are formalized on HW-SW communication centric platform-based design methodology. Thus, the outcome of this work will be a robust cluster-on-chip platform for high-performance embedded computing, whereby hardware and software components can be reused at multiple levels of design abstraction.
4

EXPLORATION OF RUNTIME DISTRIBUTED MAPPING TECHNIQUES FOR EMERGING LARGE SCALE MPSOCS / EXPLORATION DE TECHNIQUES D’ALLOCATION DE TÂCHES DYNAMIQUES ET DISTRIBUÉES POUR MPSOCS DE LARGE ÉCHELLE

Grandi Mandelli, Marcelo 13 July 2015 (has links)
MPSoCs (systèmes multiprocesseurs sur puces) avec des centaines de cœurs sont déjà disponibles sur le marché. Selon le ITRS, ces systèmes intégreront des milliers de cœurs à la fin de la décennie. La définition du cœur, où chaque tâche sera exécutée dans le système, est une question majeure dans la conception de MPSoCs. Dans la littérature, cette question est définie comme allocation de tâches. La croissance du nombre de cœurs augmente la complexité de l'allocation de tâches. Les principales préoccupations en matière d'allocation de tâches dans des grands MPSoCs incluent: (i) l'évolutivité; (ii) la charge de travail dynamique; et (iii) la fiabilité. Il est nécessaire de distribuer la décision d'allocation de tâches à travers le système afin d'assurer l'évolutivité. La charge de travail de grands MPSoCs peut être dynamique, à savoir, de nouvelles applications peuvent commencer à tout moment, conduisant à différents scénarios d'allocation. Par conséquent, il est nécessaire d'exécuter le processus d'allocation à l'exécution pour soutenir une charge de travail dynamique. La fiabilité est étroitement liée à la distribution de la charge de travail du système. Un déséquilibre de charge peut générer des hotspots et autres implications thermiques, ce qui peut entraîner un fonctionnement peu fiable du système. Dans de grands MPSoCs, les problèmes de fiabilité empirent puisque l'augmentation du nombre de cœurs sur la même puce augmente la densité de puissance et, par conséquent, la température du système. La littérature présente différentes techniques d'allocation de tâches pour améliorer la fiabilité du système. Cependant, ces techniques utilisent des approches d'allocation centralisées, qui ne sont pas évolutives. Pour répondre à ces trois défis, l'objectif principal de cette Thèse est de proposer et évaluer des heuristiques d'allocation de tâches distribuées et dynamiques en assurant l'évolutivité et une distribution équitable de la charge de travail. Une distribution équitable de la charge de travail et du trafic du NoC (réseau sur puce) augmente la fiabilité du système dans le long terme, en raison de la minimisation des régions de hotspot. Pour permettre l'exploration de l'espace de conception de grands MPSoCs, la première contribution de cette Thèse se situe dans le cadre d'une modélisation multi-niveaux, qui prend en compte différents modèles et de capacités de débogage qui enrichissent et facilitent la conception des MPSoCs. La simulation de modèles de niveau inférieur (par exemple RTL) génère des paramètres de performance utilisés pour calibrer des modèles abstraits (sans précision d'horloge). Les modèles abstraits permettent d'explorer des heuristiques d'allocation de tâches dans de grands systèmes. La plupart des techniques d'allocation de tâches se focalisent sur l'optimisation du volume de communication, ce qui peut compromettre la fiabilité du système, en raison d'une surcharge des processeurs. D'autre part, une heuristique qui optimise seulement la distribution de la charge de travail peut surcharger le NoC et compromettre sa fiabilité. La deuxième contribution importante de cette Thèse est la proposition d'heuristiques d'allocation de tâches dynamiques et distribuées, qui réalisent un compromis entre le volume de communication (liens du NoC) et la distribution de la charge de travail (de l'utilisation des processeurs). Des résultats liés au temps d'exécution, au volume de la communication, à la consommation d'énergie, aux traces de puissance et à la distribution de la température dans les grands MPSoCs (144 processeurs) confirment l'hypothèse de compromis. Faire un compromis entre la réduction du volume de communication et une distribution équitable de la charge de travail améliore le système de manière fiable grâce à la réduction des régions de hotspots, sans compromettre la performance du système. / MPSoCs with hundreds of cores are already available in the market. According to the ITRS roadmap, such systems will integrate thousands of cores by the end of the decade. The definition of where each task will execute in the system is a major issue in the MPSoC design. In the literature, this issue is defined as task mapping. The growth in the number of cores increases the complexity of the task mapping. The main concerns in task mapping in large systems include: (i) scalability; (ii) dynamic workload; and (iii) reliability. It is necessary to distribute the mapping decision across the system to ensure scalability. The workload of emerging large MPSoCs may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios. Therefore, it is necessary to execute the mapping process at runtime to support a dynamic workload. Reliability is tightly connected to the system workload distribution. Load imbalance may generate hotspots zones and consequently thermal implications, which may result in unreliable system operation. In large scale MPSoCs, reliability issues get worse since the growing number of cores on the same die increases power densities and, consequently, the system temperature. The literature presents different task mapping techniques to improve system reliability. However, such approaches use a centralized mapping approach, which are not scalable. To address these three challenges, the main goal of this Thesis is to propose and evaluate distributed mapping heuristics, executed at runtime, ensuring scalability and a fair workload distribution. Distributing the workload and the traffic inside the NoC increases the system reliability in long-term, due to the minimization of hotspot regions. To enable the design space exploration of large MPSoCs the first contribution of the Thesis lies in a multi-level modeling framework, which supports different models and debugging capabilities that enrich and facilitate the design of MPSoCs. The simulation of lower level models (e.g. RTL) generates performance parameters used to calibrate abstract models (e.g. untimed models). The abstract models pave the way to explore mapping heuristics in large systems. Most mapping techniques focus on optimizing communication volume in the NoC, which may compromise reliability due to overload processors. On the other hand, a heuristic optimizing only the workload distribution may overload NoC links, compromising its reliability. The second significant contribution of the Thesis is the proposition of dynamic and distributed mapping heuristics, making a tradeoff between communication volume (NoC links) and workload distribution (CPU usage). Results related to execution time, communication volume, energy consumption, power traces and temperature distribution in large MPSoCs (144 processors) confirm the tradeoff hypothesis. Trading off workload and communication volume improves system reliably through the reduction of hotspots regions, without compromising system performance.
5

Reduzindo o consumo de energia em MPSoCs heterogêneos via clock gating / Reducing energy consumption in heterogeneous MPSoCs through clock gating

Motta, Rodrigo Bittencourt January 2008 (has links)
Nesse trabalho é apresentada uma arquitetura que habilita a geração de MPSoCs (Multiprocessors Systems-on-Chip) heterogêneos escaláveis, baseados em barramento, suportando ainda o uso de diferentes organizações de memória. A comunicação entre as tarefas é especificada por meio de uma estrutura de memória compartilhada, que evita colisões e promove ganhos energéticos através do disparo dinâmico de clock gating. Também é introduzida a técnica DCF (Dynamic Core Freezing), que incrementa a eficiência energética do MPSoC tirando proveito dos ciclos ociosos dos processadores durante os acessos à memória. Mais, a combinação das organizações de memória propostas habilita a exploração de migração de tarefas na arquitetura proposta, por meio da troca de contexto das tarefas na memória compartilhada. Além disso, é mostrado o simulador de alto-nível, baseado na arquitetura proposta, criado com o propósito de extrair os ganhos energéticos propiciados com o uso do clock gating e da técnica DCF. O simulador aceita como entrada arquivos de trace de execução de aplicações Java, com os quais ele gera um novo arquivo contendo o mapeamento das instruções encontradas nos arquivos de trace para diferentes classes de instrução. Dessa forma, podem ser modeladas diferentes arquiteturas de processadores, usando o arquivo com o mapeamento para simular o MPSoC. Mais, o simulador habilita ainda a exploração das diferentes organizações de memória da arquitetura proposta, de maneira que se pode estimar o seu impacto no número de instruções executadas, contenções no barramento, e consumo energético. Experimentos baseados em uma aplicação sintética, executando em um MPSoC composto por diferentes versões de um processador Java mostram um grande aumento na eficiência energética com um custo mínimo em área. Além disso, também são apresentados experimentos baseados em aplicações do benchmark SPECjvm98, que mostram o impacto causado na eficiência energética quando o tipo de aplicação é alterado. Mais, os experimentos mostram drásticos ganhos energéticos obtidos com a aplicação da técnica DCF sobre as memórias do MPSoC. / In this work we present an architecture that enables the generation of bus-based, scalable heterogeneous Multiprocessor Systems-on-Chip (MPSoCs), supporting different memory organizations. Intertask communication is specified by means of a shared memory structure that assures collision avoidance and promotes energy savings through a dynamic clock gating triggering. We also introduce a Dynamic Core Freezing (DCF) technique, which boosts energy savings taking advantage of processor idle cycles during memory accesses. Moreover, the combination of the memory organizations enables the architecture to exploit easy task migration by means of the task context saving in the shared data memory. Moreover, we show the high-level simulator, based on the proposed architecture, created in order to extract the energy savings enabled with the clock gating and the DCF techniques. The simulator accepts as input execution trace files of Java applications, from which it generates a new file that contains the mapping of the instructions found in the trace file for different instruction classes. This way, we can model different processor architectures, using the mapping file to simulate the MPSoC. Also, the simulator enables us to experiment with different memory organizations to estimate their impact on the executed instructions, bus contention, and energy consumption. As case study we have modeled different versions of a Java processor in order to experiment with different execution patterns over different memory organizations. Experiments based on a synthetic application running on an MPSoC containing different versions of a Java processor show a large improvement in energy efficiency with a minimal area cost. Besides that, we also present experiments based on applications of the SPECjvm98 benchmark, which show the impact on the energy efficiency when we change the application type. Moreover, the experiments show a huge improvement in the energy efficiency when applying the DCF technique to the MPSoC memories.
6

Reduzindo o consumo de energia em MPSoCs heterogêneos via clock gating / Reducing energy consumption in heterogeneous MPSoCs through clock gating

Motta, Rodrigo Bittencourt January 2008 (has links)
Nesse trabalho é apresentada uma arquitetura que habilita a geração de MPSoCs (Multiprocessors Systems-on-Chip) heterogêneos escaláveis, baseados em barramento, suportando ainda o uso de diferentes organizações de memória. A comunicação entre as tarefas é especificada por meio de uma estrutura de memória compartilhada, que evita colisões e promove ganhos energéticos através do disparo dinâmico de clock gating. Também é introduzida a técnica DCF (Dynamic Core Freezing), que incrementa a eficiência energética do MPSoC tirando proveito dos ciclos ociosos dos processadores durante os acessos à memória. Mais, a combinação das organizações de memória propostas habilita a exploração de migração de tarefas na arquitetura proposta, por meio da troca de contexto das tarefas na memória compartilhada. Além disso, é mostrado o simulador de alto-nível, baseado na arquitetura proposta, criado com o propósito de extrair os ganhos energéticos propiciados com o uso do clock gating e da técnica DCF. O simulador aceita como entrada arquivos de trace de execução de aplicações Java, com os quais ele gera um novo arquivo contendo o mapeamento das instruções encontradas nos arquivos de trace para diferentes classes de instrução. Dessa forma, podem ser modeladas diferentes arquiteturas de processadores, usando o arquivo com o mapeamento para simular o MPSoC. Mais, o simulador habilita ainda a exploração das diferentes organizações de memória da arquitetura proposta, de maneira que se pode estimar o seu impacto no número de instruções executadas, contenções no barramento, e consumo energético. Experimentos baseados em uma aplicação sintética, executando em um MPSoC composto por diferentes versões de um processador Java mostram um grande aumento na eficiência energética com um custo mínimo em área. Além disso, também são apresentados experimentos baseados em aplicações do benchmark SPECjvm98, que mostram o impacto causado na eficiência energética quando o tipo de aplicação é alterado. Mais, os experimentos mostram drásticos ganhos energéticos obtidos com a aplicação da técnica DCF sobre as memórias do MPSoC. / In this work we present an architecture that enables the generation of bus-based, scalable heterogeneous Multiprocessor Systems-on-Chip (MPSoCs), supporting different memory organizations. Intertask communication is specified by means of a shared memory structure that assures collision avoidance and promotes energy savings through a dynamic clock gating triggering. We also introduce a Dynamic Core Freezing (DCF) technique, which boosts energy savings taking advantage of processor idle cycles during memory accesses. Moreover, the combination of the memory organizations enables the architecture to exploit easy task migration by means of the task context saving in the shared data memory. Moreover, we show the high-level simulator, based on the proposed architecture, created in order to extract the energy savings enabled with the clock gating and the DCF techniques. The simulator accepts as input execution trace files of Java applications, from which it generates a new file that contains the mapping of the instructions found in the trace file for different instruction classes. This way, we can model different processor architectures, using the mapping file to simulate the MPSoC. Also, the simulator enables us to experiment with different memory organizations to estimate their impact on the executed instructions, bus contention, and energy consumption. As case study we have modeled different versions of a Java processor in order to experiment with different execution patterns over different memory organizations. Experiments based on a synthetic application running on an MPSoC containing different versions of a Java processor show a large improvement in energy efficiency with a minimal area cost. Besides that, we also present experiments based on applications of the SPECjvm98 benchmark, which show the impact on the energy efficiency when we change the application type. Moreover, the experiments show a huge improvement in the energy efficiency when applying the DCF technique to the MPSoC memories.
7

Desenvolvimento e avaliação de redes-em-chip hierárquicas e reconfiguráveis para MPSoCs / Development and evaluation of hierarchical and reconfigurable networks-on-chip for MPSoCs

Reinbrecht, Cezar Rodolfo Wedig January 2012 (has links)
Com o advento dos processos submicrônicos, a capacidade de integração de transistores numa mesma pastilha de silício atingiu níveis que possibilitaram a construção dos sistemas com múltiplos processadores num chip (MPSoCs, do inglês MultiProcessor System-on-Chip). Essa possibilidade de integração permite inserir dezenas de Elementos de Processamento (EPs) nos circuitos integrados atuais, e já se projeta centenas de EPs para os sistemas da próxima década (ITRS, 2011). Nesse cenário, um dos principais desafios se refere ao serviço de interconexão dos EPs, que deve apresentar um desempenho de comunicação necessário para as aplicações em execução sem comprometer as limitações de consumo de área e energia do circuito. Nos primeiros sistemas multiprocessados, com poucos nodos, arquiteturas baseadas em barramento foram suficientes para cumprir esses requisitos. Porém, o número de elementos nos sistemas recentes aumentou rapidamente, tornando as redes-em-chip a solução mais apropriada, por aliar escalabilidade e reuso na mesma estrutura. Contudo, diante da previsão de que essa tendência de aumento se manterá retorna a discussão se as redes-em-chip atuais continuarão adequadas para os futuros sistemas. De fato, o custo das redes-em-chip convencionais pode se tornar proibitivo para as escalas dos circuitos em um futuro próximo. Novas propostas têm sido apresentadas na literatura científica onde se podem destacar duas principais estratégias de projeto às redes de interconexão: reconfiguração arquitetural e organização hierárquica da topologia. A reconfiguração arquitetural permite obter uma grande eficiência, independente do tipo de aplicação em execução, pois uma das alternativas é projetar o circuito para que ele se auto adapte conforme os requisitos de desempenho para cada aplicação. Por outro lado, arquiteturas organizadas em topologias hierárquicas são desenvolvidas para uma estrutura computacional definida em tempo de projeto, sendo mais eficazes para uma classe de aplicações. O presente trabalho explora a sinergia da combinação das potencialidades das duas soluções e propõe uma nova estrutura que oferece melhor desempenho para uma classe maior de aplicações apropriada para os futuros sistemas. Como resultado foi implementada uma arquitetura adaptativa chamada MINoC (Multiple Interconnections Networks-on-Chip), uma arquitetura organizada em hierarquia, chamada HiCIT (Hierarchical Crossbar-based Interconnection Topology) e uma simbiose de ambas culminando na arquitetura hierárquica adaptativa HASIN (Hierarchical Adaptive Switching Interconnection Network). São apresentados resultados que mostram a eficiência desses conceitos validando a proposta hierárquica adaptativa. / With the advent of submicron processes, the number of transistors integrated on a single chip has reached levels that allowed the design of Multiprocessor Systems-on-Chip (MPSoCs). This capability allows the integration of several processing elements (PEs) in integrated circuits designed nowadays. In the next decade it is expected that hundreds of PEs will be integrated on a single chip. In this scenario, a key challenge is the interconnection network between PEs, which must provide the communication service required to run applications without compromising the limitations of area and energy consumption. In the first multiprocessor systems, with few nodes, bus-based approaches have been sufficient to meet these requirements. However, current systems increased quickly the number of elements, making the Networks-on-Chip (NoCs) the most appropriate solution, because it handles scalability and reusability in the same structure. Nevertheless, ITRS roadmap predicts that this increase will continue (ITRS, 2011), which resumes the discussion if present NoC architectures will be the most adequate for future systems, since its costs could be prohibitive. Therefore, new proposals have been presented in the literature with two main design strategies: architectural reconfiguration and hierarchical organization of the topology. With the architectural reconfiguration it is possible to obtain an application independent high efficiency structure, because the circuit is designed to adapt itself to satisfy performance requirements. On the other hand, architectural organizations in hierarchical topologies are defined at design time to have the most appropriate features for a class of applications, being very effective. The current work identified the synergy of both approaches and proposes a new symbiotic structure suitable for a broader class of applications. As a result, it was implemented an adaptive architecture called MINoC (Multiple Interconexions Networks-on-chip), an architecture organized in hierarchy called HiCIT (Hierarchical Crossbar-based Interconnection Topology) and a mix of both ending up with the hierarchical adaptive architecture HASIN (Hierarchical Interconnection Network Adaptive Switching). Results show the efficiency of these concepts validating the proposed hierarchical adaptive architecture.
8

Middleware adaptativo para sistemas embarcados e de tempo-real / Adaptive middleware for real-time embedded systems

Silva Júnior, Elias Teodoro da January 2008 (has links)
Um dos principais desafios no desenvolvimento de ferramentas e metodologias para sistemas multiprocessados, embarcados e de tempo-real é o reuso de software já desenvolvido, mantendo baixa utilização de recursos como memória, energia e desempenho de CPU, e ainda atendendo às restrições temporais. O presente trabalho procura atacar este problema no nível do middleware, comumente utilizado como forma de integrar componentes de software reusáveis, diminuindo o tempo e o esforço desprendido no desenvolvimento de aplicações e serviços com alta qualidade. Este trabalho especifica e implementa um middleware para uma plataforma MPSoC voltada para sistemas embarcados e de tempo-real, permitindo adaptações durante o projeto e/ou execução da aplicação, a fim de otimizar o uso dos recursos e atender às restrições de projeto. Ao projetista da aplicação é permitido reusar os serviços do middleware e da plataforma em diferentes aplicações. Igualmente, aplicações escritas sobre o middleware podem ser portadas para outras plataformas onde o middleware possa ser executado. O middleware proposto oferece serviços implementados em hardware e encapsulamento da comunicação hardware-software na própria aplicação. Além disso, são oferecidos meios para gerenciamento de requisitos não funcionais de energia e tempo-real, como deadline e tempo de execução. / One of the main challenges in the development of tools and methodologies for a multiprocessor real-time embedded system is to reuse already developed software, but at the same time obtaining low memory footprint, low energy consumption, and minimal area, obviously addressing the real-time constraints. This work aims at facing these problems at the middleware level, frequently used to integrate components of reusable software, accelerating development cycle and reducing the effort to develop applications and services with high quality. The present work specifies and implements a middleware for an MPSoC platform oriented to real-time and embedded systems, providing adaptations at development and execution time, in order to optimize resources usage and fulfill design restrictions. The designer can reuse middleware services and the platform as well, when developing different applications. Likewise, applications developed under the middleware can be ported to run in other platforms where the middleware was ported to. The proposed middleware offers hardware implemented services and encapsulates hardware-software communication in the application. Moreover, it permits to specify non-functional requirements of energy and real-time, as deadline and execution time.
9

Reduzindo o consumo de energia em MPSoCs heterogêneos via clock gating / Reducing energy consumption in heterogeneous MPSoCs through clock gating

Motta, Rodrigo Bittencourt January 2008 (has links)
Nesse trabalho é apresentada uma arquitetura que habilita a geração de MPSoCs (Multiprocessors Systems-on-Chip) heterogêneos escaláveis, baseados em barramento, suportando ainda o uso de diferentes organizações de memória. A comunicação entre as tarefas é especificada por meio de uma estrutura de memória compartilhada, que evita colisões e promove ganhos energéticos através do disparo dinâmico de clock gating. Também é introduzida a técnica DCF (Dynamic Core Freezing), que incrementa a eficiência energética do MPSoC tirando proveito dos ciclos ociosos dos processadores durante os acessos à memória. Mais, a combinação das organizações de memória propostas habilita a exploração de migração de tarefas na arquitetura proposta, por meio da troca de contexto das tarefas na memória compartilhada. Além disso, é mostrado o simulador de alto-nível, baseado na arquitetura proposta, criado com o propósito de extrair os ganhos energéticos propiciados com o uso do clock gating e da técnica DCF. O simulador aceita como entrada arquivos de trace de execução de aplicações Java, com os quais ele gera um novo arquivo contendo o mapeamento das instruções encontradas nos arquivos de trace para diferentes classes de instrução. Dessa forma, podem ser modeladas diferentes arquiteturas de processadores, usando o arquivo com o mapeamento para simular o MPSoC. Mais, o simulador habilita ainda a exploração das diferentes organizações de memória da arquitetura proposta, de maneira que se pode estimar o seu impacto no número de instruções executadas, contenções no barramento, e consumo energético. Experimentos baseados em uma aplicação sintética, executando em um MPSoC composto por diferentes versões de um processador Java mostram um grande aumento na eficiência energética com um custo mínimo em área. Além disso, também são apresentados experimentos baseados em aplicações do benchmark SPECjvm98, que mostram o impacto causado na eficiência energética quando o tipo de aplicação é alterado. Mais, os experimentos mostram drásticos ganhos energéticos obtidos com a aplicação da técnica DCF sobre as memórias do MPSoC. / In this work we present an architecture that enables the generation of bus-based, scalable heterogeneous Multiprocessor Systems-on-Chip (MPSoCs), supporting different memory organizations. Intertask communication is specified by means of a shared memory structure that assures collision avoidance and promotes energy savings through a dynamic clock gating triggering. We also introduce a Dynamic Core Freezing (DCF) technique, which boosts energy savings taking advantage of processor idle cycles during memory accesses. Moreover, the combination of the memory organizations enables the architecture to exploit easy task migration by means of the task context saving in the shared data memory. Moreover, we show the high-level simulator, based on the proposed architecture, created in order to extract the energy savings enabled with the clock gating and the DCF techniques. The simulator accepts as input execution trace files of Java applications, from which it generates a new file that contains the mapping of the instructions found in the trace file for different instruction classes. This way, we can model different processor architectures, using the mapping file to simulate the MPSoC. Also, the simulator enables us to experiment with different memory organizations to estimate their impact on the executed instructions, bus contention, and energy consumption. As case study we have modeled different versions of a Java processor in order to experiment with different execution patterns over different memory organizations. Experiments based on a synthetic application running on an MPSoC containing different versions of a Java processor show a large improvement in energy efficiency with a minimal area cost. Besides that, we also present experiments based on applications of the SPECjvm98 benchmark, which show the impact on the energy efficiency when we change the application type. Moreover, the experiments show a huge improvement in the energy efficiency when applying the DCF technique to the MPSoC memories.
10

Middleware adaptativo para sistemas embarcados e de tempo-real / Adaptive middleware for real-time embedded systems

Silva Júnior, Elias Teodoro da January 2008 (has links)
Um dos principais desafios no desenvolvimento de ferramentas e metodologias para sistemas multiprocessados, embarcados e de tempo-real é o reuso de software já desenvolvido, mantendo baixa utilização de recursos como memória, energia e desempenho de CPU, e ainda atendendo às restrições temporais. O presente trabalho procura atacar este problema no nível do middleware, comumente utilizado como forma de integrar componentes de software reusáveis, diminuindo o tempo e o esforço desprendido no desenvolvimento de aplicações e serviços com alta qualidade. Este trabalho especifica e implementa um middleware para uma plataforma MPSoC voltada para sistemas embarcados e de tempo-real, permitindo adaptações durante o projeto e/ou execução da aplicação, a fim de otimizar o uso dos recursos e atender às restrições de projeto. Ao projetista da aplicação é permitido reusar os serviços do middleware e da plataforma em diferentes aplicações. Igualmente, aplicações escritas sobre o middleware podem ser portadas para outras plataformas onde o middleware possa ser executado. O middleware proposto oferece serviços implementados em hardware e encapsulamento da comunicação hardware-software na própria aplicação. Além disso, são oferecidos meios para gerenciamento de requisitos não funcionais de energia e tempo-real, como deadline e tempo de execução. / One of the main challenges in the development of tools and methodologies for a multiprocessor real-time embedded system is to reuse already developed software, but at the same time obtaining low memory footprint, low energy consumption, and minimal area, obviously addressing the real-time constraints. This work aims at facing these problems at the middleware level, frequently used to integrate components of reusable software, accelerating development cycle and reducing the effort to develop applications and services with high quality. The present work specifies and implements a middleware for an MPSoC platform oriented to real-time and embedded systems, providing adaptations at development and execution time, in order to optimize resources usage and fulfill design restrictions. The designer can reuse middleware services and the platform as well, when developing different applications. Likewise, applications developed under the middleware can be ported to run in other platforms where the middleware was ported to. The proposed middleware offers hardware implemented services and encapsulates hardware-software communication in the application. Moreover, it permits to specify non-functional requirements of energy and real-time, as deadline and execution time.

Page generated in 0.4121 seconds