• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 117
  • 42
  • 26
  • 19
  • 11
  • 5
  • 4
  • 2
  • 2
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • Tagged with
  • 256
  • 256
  • 64
  • 51
  • 46
  • 42
  • 33
  • 31
  • 27
  • 27
  • 25
  • 25
  • 25
  • 24
  • 23
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Estimativa de desempenho de uma NoC a partir de seu modelo em SYSTEMC-TLM. / A NoC performance evaluation from a SYSTEMC - TLM model.

Sepúlveda Flórez, Martha Johanna 16 October 2006 (has links)
The wide variety of interconnection structures presently nowadays for SoC (Systemon- Chip), bus and networks-on-Chip NoCs, each of them with a wide set of setup parameters, provides a huge amount of design alternatives. Although the interconnection structure is a key SoC component, there are few design tools in order to set the appropriate configuration parameters for a given application. An efficient SoC project may comply an exploration stage among the possible solutions for the communication structure, during the first steps of the design process. The absence of appropriate tools for that exploration makes critical the designer?s judgment. The present study aims to enhance the communication SoC structure design area, when a NoC is used. This work proposes a methodology that allows the establishment of the NoC communication parameters using a high level model (SystemC TLM timed). Our approach analyzes and evaluates the NoC performance under a wide variety of traffic conditions. The experimental stage was conducted employing a model of a net represented by a SystemC TLM timed (Hermes_Temp). Parametric and pseudo-random generators control the network traffic. The analysis was carried on with a tool designed for these purpose, which generates a group of performance metrics. The results allow to elucidate the global and inner network behavior. The performance values are useful for the heterogeneous and homogeneous NoC design projects, improving the performance evaluation studies scope. / The wide variety of interconnection structures presently nowadays for SoC (Systemon- Chip), bus and networks-on-Chip NoCs, each of them with a wide set of setup parameters, provides a huge amount of design alternatives. Although the interconnection structure is a key SoC component, there are few design tools in order to set the appropriate configuration parameters for a given application. An efficient SoC project may comply an exploration stage among the possible solutions for the communication structure, during the first steps of the design process. The absence of appropriate tools for that exploration makes critical the designer?s judgment. The present study aims to enhance the communication SoC structure design area, when a NoC is used. This work proposes a methodology that allows the establishment of the NoC communication parameters using a high level model (SystemC TLM timed). Our approach analyzes and evaluates the NoC performance under a wide variety of traffic conditions. The experimental stage was conducted employing a model of a net represented by a SystemC TLM timed (Hermes_Temp). Parametric and pseudo-random generators control the network traffic. The analysis was carried on with a tool designed for these purpose, which generates a group of performance metrics. The results allow to elucidate the global and inner network behavior. The performance values are useful for the heterogeneous and homogeneous NoC design projects, improving the performance evaluation studies scope.
52

Análise da influência do uso de domínios de parâmetros sobre a eficiência da verificação funcional baseada em estimulação aleatória. / Analysis of the influence of using parameter domains on ramdom-stimulation-based functional verification efficiency.

Carlos Ivan Castro Marquez 10 February 2009 (has links)
Uma das maiores restrições que existe atualmente no fluxo de projeto de CIs é a necessidade de um ciclo menor de desenvolvimento. Devido às grandes dimensões dos sistemas atuais, é muito provável encontrar no projeto de blocos IP, erros ou bugs originados na passagem de uma dada especificação inicial para seus correspondentes modelos de descrição de hardware. Isto faz com que seja necessário verificar tais modelos para garantir aplicações cem por cento funcionais. Uma das técnicas de verificação que tem adquirido bastante popularidade recentemente é a verificação funcional, uma vez que é uma alternativa que ajuda a manter baixos custos de validação dos modelos HDL ao longo do projeto completo do circuito. Na verificação funcional, que está baseada em ambientes de simulação, a funcionalidade completa (ou relevante) do modelo é explorada, aplicando-se casos de teste, um após o outro. Isto permite examinar o modelo em todas as seqüências e combinações de entradas desejadas. Na verificação funcional, existe a possibilidade de simular o modelo estimulando-o com casos de teste aleatórios, o qual ajuda a cobrir um amplo número de estados. Para facilitar a aplicação de estímulos em simulação de circuitos, é comum que espaços definidos por parâmetros de entrada sejam limitados em sua abrangência e agrupados de tal forma que subespaços sejam formados. No desenvolvimento de testbenches, os geradores de estímulos aleatórios podem ser criados de forma a conter subespaços que se sobrepõem (resultando em estímulos redundantes) ou subespaços que contenham condições que não sejam de interesse (resultando em estímulos inválidos). É possível eliminar ou diminuir, os casos de teste redundantes e inválidos através da aplicação de metodologias de modificação do espaço de estímulos de entrada, e assim, diminuir o tempo requerido para completar a simulação de modelos HDL. No presente trabalho, é realizada uma análise da aplicação da técnica de organização do espaço de entrada através de domínios de parâmetros do IP, e uma metodologia é desenvolvida para tal, incluindo-se, aí, uma ferramenta de codificação automática de geradores de estímulos aleatórios em linguagem SyatemC: o GET_PRG. Resultados com a aplicação da metodologia é comparada a casos de aplicação de estímulos aleatórios gerados a partir de um espaço de estímulos de entrada sem modificações.Como esperado, o número de casos de teste redundantes e inválidos aplicados aos testbenches foi sempre maior para o caso de estimulação aleatória a partir do espaço de estímulos de entrada completo com um tempo de execução mais longo. / One of the strongest restrictions that exist throughout ICs design flow is the need for shorter development cycles. This, along with the constant demand for more functionalities, has been the main cause for the appearance of the so-called System-on-Chip (SOC) architectures, consisting of systems that contain dozens of reusable hardware blocks (Intellectual Properties, or IPs). The increasing complexity makes it necessary to thoroughly verify such models in order to guarantee 100% functional applications. Among the current verification techniques, functional verification has received important attention, since it represents an alternative that keeps HDL validation costs low throughout the circuits design cycle. Functional verification is based in testbenches, and it works by exploring the whole (or relevant) models functionality, applying test cases in a sequential fashion. This allows the testing of the model in all desired input sequences and combinations. There are different techniques concerning testbench design, being the random stimulation an important approach, by which a huge number of test cases can be automatically created. In order to ease the stimuli application in circuit simulation, it is common to limit the range of the space defined by input parameters and to group such restricted parameters in sub-spaces. In testbench development, it may occur the creation of random stimuli generators containing overlapping sub-spaces (resulting in redundant stimuli) or sub-spaces containing conditions of no interest (resulting in invalid stimuli). It is possible to eliminate, or at least reduce redundant and invalid test cases by modifying the input stimuli space, thus, diminishing the time required to complete the HDL models simulation. In this work, the application of a technique aimed to organize the input stimuli space, by means of IP parameter domains, is analyzed. A verification methodology based on that is developed, including a tool for automatic coding of random stimuli generators using SystemC: GET_PRG. Results on applying such a methodology are compared to cases where test vectors from the complete verification space are generated. As expected, the number of redundant test cases applied to the testbenches was always greater for the case of random stimulation on the whole (unreduced, unorganized) input stimuli space, with a larger testbench execution time.
53

Estimativa de desempenho de uma NoC a partir de seu modelo em SYSTEMC-TLM. / A NoC performance evaluation from a SYSTEMC - TLM model.

Martha Johanna Sepúlveda Flórez 16 October 2006 (has links)
The wide variety of interconnection structures presently nowadays for SoC (Systemon- Chip), bus and networks-on-Chip NoCs, each of them with a wide set of setup parameters, provides a huge amount of design alternatives. Although the interconnection structure is a key SoC component, there are few design tools in order to set the appropriate configuration parameters for a given application. An efficient SoC project may comply an exploration stage among the possible solutions for the communication structure, during the first steps of the design process. The absence of appropriate tools for that exploration makes critical the designer?s judgment. The present study aims to enhance the communication SoC structure design area, when a NoC is used. This work proposes a methodology that allows the establishment of the NoC communication parameters using a high level model (SystemC TLM timed). Our approach analyzes and evaluates the NoC performance under a wide variety of traffic conditions. The experimental stage was conducted employing a model of a net represented by a SystemC TLM timed (Hermes_Temp). Parametric and pseudo-random generators control the network traffic. The analysis was carried on with a tool designed for these purpose, which generates a group of performance metrics. The results allow to elucidate the global and inner network behavior. The performance values are useful for the heterogeneous and homogeneous NoC design projects, improving the performance evaluation studies scope. / The wide variety of interconnection structures presently nowadays for SoC (Systemon- Chip), bus and networks-on-Chip NoCs, each of them with a wide set of setup parameters, provides a huge amount of design alternatives. Although the interconnection structure is a key SoC component, there are few design tools in order to set the appropriate configuration parameters for a given application. An efficient SoC project may comply an exploration stage among the possible solutions for the communication structure, during the first steps of the design process. The absence of appropriate tools for that exploration makes critical the designer?s judgment. The present study aims to enhance the communication SoC structure design area, when a NoC is used. This work proposes a methodology that allows the establishment of the NoC communication parameters using a high level model (SystemC TLM timed). Our approach analyzes and evaluates the NoC performance under a wide variety of traffic conditions. The experimental stage was conducted employing a model of a net represented by a SystemC TLM timed (Hermes_Temp). Parametric and pseudo-random generators control the network traffic. The analysis was carried on with a tool designed for these purpose, which generates a group of performance metrics. The results allow to elucidate the global and inner network behavior. The performance values are useful for the heterogeneous and homogeneous NoC design projects, improving the performance evaluation studies scope.
54

Effiziente externe Beobachtung von CPU-Aktivitäten auf SoCs

Weiss, Alexander 02 October 2015 (has links)
Die umfassende Beobachtbarkeit von System‐on‐Chips (SoCs) ist eine wichtige Voraussetzung für das effiziente Testen und Debuggen eingebetteter Systeme. Ausgehend von einer Analyse verschiedener Anwendungsfälle ergibt sich ein Katalog von Anforderungen an die Beobachtbarkeit von SoCs. Ein wichtiges Kriterium ist hier die Vollständigkeit der Beobachtung und umfasst die Aktivitäten der CPU (ausgeführte Instruktionen, gelesene und geschriebene Daten, Verhalten des Caches, Ausführungszeiten), des Bussystems und von Umgebungsbedingungen. Weitere Kriterien sind die Echtzeitfähigkeit und die Kontinuität der Beobachtung sowie die gleichzeitige Durchführung verschiedener Beobachtungsaufgaben. Dabei soll es zu einer möglichst geringen Beeinflussung des SoCs kommen. Weitere wichtige Aspekt sind die Kosten der Lösung, die Universalität, die Skalierbarkeit sowie die Latenz der Verfügbarkeit der Beobachtungsergebnisse. Für viele Anwendungen, besonders in sicherheitskritischen Bereichen, muss zudem nachgewiesen werden, dass das Beobachtungsverfahren kein Fehlverhalten des SoCs bewirkt bzw. ein solches maskiert. Eine besondere Herausforderung stellen Multiprozessor‐SoCs (MPSoCs) dar, da hier die Kommunikation zwischen den einzelnen CPUs im Inneren des SoC stattfindet und entsprechend schwierig für einen externen Bobachter sichtbar zu machen ist. Der Stand der Technik zur Beobachtung von SoCs wird im Wesentlichen durch zwei Verfahren dargestellt. Bei der Software‐Instrumentierung wird zum funktionalen Programmcode zusätzlicher Code hinzugefügt, welcher zur Beobachtung des Programms dient. Diese Methode ist einfach und universell anwendbar, erfüllt aber die genannten Kriterien nur sehr eingeschränkt. Nachteilig ist hier der Ressourcenverbrauch im Falle des Verbleibs der Instrumentierung im fertigen Produkt. Wird die Instrumentierung nur temporär dem Code hinzugefügt, muss sichergestellt werden, dass das Beobachtungsergebnis auch für den finalen Code anwendbar ist – was besonders bei ressourcen‐abhängigen Integrationstests nur schwierig erfüllbar ist. Eine alternative Lösung stellt eine spezielle Hardware‐Unterstützung in SoCs („embedded Trace“) dar. Hier werden im SoC Zustandsinformationen (z.B. Taskwechsel, ausgeführte Instruktionen, Datentransfers) gesammelt und mittels Trace‐Nachrichten an den Beobachter übermittelt. Dabei stellt die Bandbreite, die zur Ausgabe der Trace‐Nachrichten vom SoC verfügbar ist, ein entscheidendes Nadelöhr dar ‐ im SoC sind viel mehr den Beobachter interessierende Informationen verfügbar als nach außen transferiert werden können. Damit haben beide dem gegenwärtige Stand der Technik entsprechende Beobachtungsverfahren eine Reihe von Einschränkungen, die sich besonders bei der Vollständigkeit der Beobachtung, der Flexibilität, der Kontinuität und der Unterstützung von MPSoCs zeigen. In dieser Arbeit wird nun ein neuer Ansatz vorgestellt, welcher gegenüber dem Stand der Technik in einigen Bereichen deutliche Verbesserungen bietet. Dabei werden die Trace‐Daten nicht vom zu beobachtenden SoC direkt, sondern aus einer parallel mitlaufenden Emulation gewonnen. Die Bandbreite der für die Synchronisation der Emulation erforderlichen Daten ist in vielen Fällen deutlich geringer als bei der Ausgabe von umfassenden Trace‐Nachrichten mittels „embedded Trace“‐Lösungen. Gleichzeitig ist eine vollständige, äußerst detaillierte Beobachtung der Vorgänge innerhalb des SoC möglich. Das neue Beobachtungsverfahren wurde mittels verschiedener FPGA-basierter Implementierungen evaluiert, hier konnte auch die Anwendbarkeit für MPSoCs gezeigt werden.
55

Etude d'attaques matérielles et combinées sur les "System-on-chip" / Hardware and combined attacks on the "System-on-Chip"

Majéric, Fabien 30 November 2018 (has links)
L'intérêt de la communauté de la sécurité numérique dans le domaine des Systems on Chip (SoC) s'est essentiellement focalisé sur les menaces logicielles, améliorant sans cesse le niveau de protection. Cependant, l'exploitation de ce vecteur d'attaque devenant de plus en plus difficile, il est fort probable que les attaques matérielles se multiplient. Par conséquent, il est primordial d'étudier ces dernières afin d'anticiper la menace qu'elles représentent. La sophistication de l'architecture et la rapidité d'évolution des technologies embarquées dans les SoC, justifient la mise en place d'une méthodologie adaptée pour évaluer efficacement leur niveau de sécurité. C'est dans ce contexte que cette thèse propose l'étude de cette catégorie d'attaques ainsi qu'un aperçu de leur impact sur la sécurité de ce type de systèmes. Alors que les architectures élaborées accroissent la difficulté de mise en place d'attaques physiques, elles augmentent également la surface d'attaque. Une première étude analyse les chemins d'attaques afin de déterminer les grandeurs physiques exploitables les plus pertinentes. Cette étape conduit, dans un deuxième temps, à l'élaboration de règles génériques pour l'évaluation sécuritaire des SoC présents sur le marché. Celles-ci combinent diverses techniques déjà utilisées dans le domaine de la carte à puce. L'ensemble de ce travail s'appuie sur plusieurs divers modules caractéristiques de la sécurité des SoC actuels. Tous les résultats soulignent que la complexité inhérente aux SoC n'est pas suffisante pour les protéger contre les attaques matérielles et l'implémentation des sécurités dans ces systèmes doit se faire sans se reposer sur cette propriété. / In the field of System on Chip (SoC), the digital security community has mainly focused on software threats; constantly working to improve the level of protection. Since the exploitation of this attack vector is becoming more and more difficult, it is most likely that the number of hardware attacks will increase. Therefore, it is essential to study these attacks in order to anticipate the threat they represent. The sophisticated architecture and the rapidly changing technologies embedded in the SoC justify the implementation of an adapted methodology, to effectively evaluate their level of security.In this context, this thesis examines the feasibility of this type of attacks and their impact on the security of these systems. While rich architectures increase the difficulty of setting up hardware attacks, they also increase the attack surface. Our study starts by analyzing the attack paths in order to determine the most relevant exploitable physical quantities. This has led to the development of a generic procedure for the security evaluation of SoCs on the market. This method combines various techniques that are already applied to smart cards. This entire work is based on several case studies related to various embedded modules characteristic of the security in current systems-on-chips. All the observed results lead to the same observations: the inherent complexity of SoCs is not sufficient to protect them against hardware attacks. The implementation of security in these systems must be done without relying on this property.
56

Optimisation de convertisseurs DC-DC SoC (System on Chip) pour l'automobile / Optimization of SoC (System on Chip) DC-DC converters for automotive application

Aulagnier, Guillaume 16 April 2015 (has links)
L’équipe de conception de Freescale à Toulouse développe des circuits intégrés dédiés au marché de l’automobile pour des applications châssis, sécurité ou loisir. Les contraintes associées à l’embarquement des circuits sont nombreuses : niveau d’intégration, fiabilité, températures élevées, et compatibilité électromagnétique. Les produits conçus par Freescale intègrent des convertisseurs à découpage pour l’alimentation en énergie des microcontrôleurs. Cette thèse a pour objet l’étude de nouvelles topologies de convertisseur d’énergie pour la baisse de l’encombrement et des perturbations électromagnétiques. La structure multiphase répond à la problématique dans son ensemble. Un prototype est réalisé dans une technologie silicium Freescale haute tension 0.25µm. Le volume des composants externes de filtrage est optimisé et réduit. Les mesures sur le prototype montrent des performances en accord avec les objectifs, et des émissions électromagnétiques particulièrement faibles. / The Freescale design team in Toulouse develops integrated circuits for automotive application such as chassis, safety or infotainment. Constraints associated with the embodiment of such circuits are many: die-size, safety, EMC (Electromagnetic Compliance). Switching Mode Power Supplies are integrated in these products to supply power to microcontrollers. This PhD thesis is to study new topologies of power supply to reduce the volume and electromagnetic disturbances. The multiphase structure responds to the raised issue. A prototype is produced in a Freescale 0.25µm high voltage silicon technology. Volume of the external components for filtering is optimized and reduced. Measures show upgrades in performance and reduced electromagnetic emissions.
57

Predictable Multiprocessor Platform for Safety- Critical Real- Time Systems

Sigurðsson, Páll Axel January 2021 (has links)
Multicore systems excel at providing concurrent execution of applications, giving true parallelism where all cores can execute sequences of machine instructions at the same time. However, multicore systems come with their own sets of problems, most notably when cores in a system (or core tiles) share hardware components such as memory modules or Input/Output (IO) peripherals. This increased level of complexity makes it especially difficult to design and verify safety- critical systems that require real- time operation, such as flight controllers in airplanes and airbag controllers in the automotive industry. Verifying that that systems are predictable is therefore essential, requiring methods for measuring and finding out the Worst- Case Execution Times (WCETs) and Best- Case Execution Times (BCETs). Additionally, the designer must ensure isolation between running applications (indicating that the platform is composable). This thesis work consists of designing a predictable Multiprocessor System On- Chip (MPSoC) using Qsys and Quartus II, as well as providing methods and test benches that can support all claims made about the platform’s reported behavior. A shared- memory loosely coupled multicore design was implemented, which can be horizontally scaled from 2 to 8 core tiles. A high- level Hardware Abstraction Layer (HAL) is written for the platform to simplify its use. Using Nios II/e processors as the logical cores in the platform’s core tiles gives predictable (mostly static) latencies when the platform is tested, showing no erratic or unexplained timing variations. However, due to the Round Robin (RR) nature of the arbitration logic in the Avalon Switch Fabric (ASF), composability was not fully achieved in the platform. Groundwork for implementing Time- Division Multiplexing (TDM) arbitration logic is proposed and will ideally be fully implemented in future work. / Mångkärniga processorsystem utmärker sig när det kommer till samkörning mellan applikationer. De ger en sann parallellism, där alla kärnor kan köra processorinstruktioner samtidigt. Mångkärniga system kommer med sina egna problem, framför allt när kärnorna ska dela komponenter så som minnesmoduler och Input/Output tillbehör. Den ökade komplexiteten gör att det är extra svårt att designa och verifiera säkerhetskritiska system som kräver körning i realtid, så som flygkontrollers på flygplan och styrenheter för krockkudden i bilar. Verifiering av att systemen är förutsägbara är essentiellt, detta behöver metoder för att mäta och hitta den värsta möjliga exekveringstiden (WCET) och den bästa möjliga exekveringstiden (BCET). Utöver detta måste designern säkerställa att processerna som körs på kärnorna är isolerade ifrån varandra (komponerbara). Detta arbetet består av att designa ett förutsägbart mångkärnigt system på chip (MPSoC) med Qsys och Quartus II, samt att ge metoder och testbänkar som kan bevisa systemets hävdade beteende. Ett löst kopplat mångkärnigt system med delat minne implementerades, där systemets kärnor kan ökas horisontellt från 2 till 8 stycken. Ett Hardware Abstraction Layer (HAL) skapades för systemet för att simplifiera användningen. Användningen av Nios II/e som processorkärna gav förutsägbara exekveringstider när systemet testades och visade inga oförklarliga tids variationer. Däremot, på grund av att Avalon Switch Fabric (ASF) tilldelar access med Round Robin (RR), är systemet inte komponerbart. Basen för att implementera Time- Division Multiplexing (TDM) istället är föreslaget och kommer idealt implementeras som fortsatt arbete.
58

Estimativas de desempenho da estrutura de comunicação de SoC a partir de modelos de transações. / Performance estimation of on-chip communication structures using transaction level modeling.

Eslava Garzon, Johan Sebastian 17 April 2009 (has links)
A complexidade crescente (tanto da funcionalidade como da arquitetura) dos sistemas eletrônicos digitais sobre silício (conhecidos na literatura como System-on-Chip, SoC) exige novas metodologias que permitam diminuir seu tempo de desenvolvimento. O projeto no nível de sistemas (SLD) é proposto para aumentar a eficiência do projeto de SoC. SLD exige novas linguagens (como SystemC) e níveis de abstração (como TLM). A estrutura de comunicação (EC) de um SoC tem apresentado uma crescente importância devido à presença de uma maior quantidade (e funcionalidade) dos módulos a serem comunicados. Portanto, a EC apresenta um grande impacto no desempenho global do SoC. Nesta tese é proposta uma metodologia de projeto da EC chamada de MaLOC (Multi- Abstraction Level On-Chip communications structure design) que é baseada num enfoque top-down que percorre três níveis de transações (TLM). A tomada de decisões é feita utilizando-se duas importantes características que dão a originalidade a nossa proposta: 1) baseadas num conjunto de diversas métricas de desempenho que permite obter resultados mais confiáveis. 2) decisões ASAP (o mais rápido possível), antecipando a tomada de decisões utilizando níveis mais abstratos do que o RTL, permitindo diminuir o tempo de projeto da EC. Para validar a proposta uma série de análises de fidelidade foram realizadas, os resultados indicaram fidelidades maiores do que 96% e em cenários extremos maiores do que 72%. Adicionalmente os tempos de simulação no nível TLM atemporal foi até 2,6 vezes mais rápido do que o nível TLM de precisão de transferências, que foi até 1,6 vezes mais rápido do que o nível TLM de ciclos de relógio (menos abstrato). Estes resultados indicam a validade da metodologia para realizar a tomada de decisões, permitindo uma melhor exploração do espaço de projeto Os estudos de caso permitiram observar que além de configurar a EC procurando o melhor desempenho, MaLOC identificou soluções com menor consumo de energia, através do uso de um conjunto diverso de métricas, e configuração de parâmetros do sistema (tamanho da memória). Estas duas situações indicam o potencial que a metodologia apresenta para o projeto de diferentes tipos de EC, assim como de diferentes componentes de um SoC. / Modern and future System on Chip design requires several methodologies in order to handle their growing complexity (of both functional and architectural issues). System Level Design has emerged as a solution to handle the complex of nowadays and future SoC designs, increasing their efficiency and reducing the time to market. SLD requires new modeling languages (such as SystemC) and abstraction levels (such as Transaction Level Modeling - TLM). The integration of very different and composite IP cores into a SoC makes their physical and logical integration a very difficult task. Hence, the communication structure (CS) presents a significant impact on the SOC global performance. This thesis proposes a novel methodology named MaLOC (Multi-Abstraction Level On- Chip communications structure design) that uses a top-down approach. The parameters configuration is driven by two important considerations: 1) performance metrics based, this enables to obtain a most reliable solution; 2) an ASAP configuration schedule, this enables to reduce the CS design time through the use of higher abstraction levels. A fidelity test was performed. The results showed that in extreme conditions (such as burst size higher than time between transactions) the fidelity obtained was higher than 72%. In normal cases (burst size similar to the time between transactions) the fidelity was higher than 96%. The simulations execution times were compared among the three TLM levels and the results showed that TLM untimed simulations were 2.6 times faster than the TLM transfer accurate, also these were 1.6 times faster than the TLM cycle accurate. This means that TLM untimed simulations are 4 times faster than TLM Cycle accurate, enabling a enhanced space design exploration. The case studies performed showed that MaLOC can be useful to identify solutions that satisfy the performance required reducing the power consumption (reducing activities across the bus). Also, a system parameter was defined using the methodology (memory banks). These two situations indicate the MaLOC potential to design several CS types and SoC configuration parameters.
59

Self-adaptive QOS at communication and computation levels for many-core system-on-chip

Ruaro, Marcelo 16 March 2018 (has links)
Submitted by PPG Ci?ncia da Computa??o (ppgcc@pucrs.br) on 2018-04-03T14:37:48Z No. of bitstreams: 1 MARCELO_RUARO_TES.pdf: 4683751 bytes, checksum: 6eb242e44efbbffa6fa556ea81cdeace (MD5) / Approved for entry into archive by Tatiana Lopes (tatiana.lopes@pucrs.br) on 2018-04-13T17:30:40Z (GMT) No. of bitstreams: 1 MARCELO_RUARO_TES.pdf: 4683751 bytes, checksum: 6eb242e44efbbffa6fa556ea81cdeace (MD5) / Made available in DSpace on 2018-04-13T17:37:13Z (GMT). No. of bitstreams: 1 MARCELO_RUARO_TES.pdf: 4683751 bytes, checksum: 6eb242e44efbbffa6fa556ea81cdeace (MD5) Previous issue date: 2018-03-16 / Sistemas multi-n?cleos intra-chip s?o o estado-da-arte em termos de poder computacional, alcan?ando de d?zias a milhares de elementos de processamentos (PE) em um ?nico circuito integrado. Sistemas multi-n?cleos de prop?sito geral assumem uma admiss?o din?mica de aplica??es, onde o conjunto de aplica??es n?o ? conhecido em tempo de projeto e as aplica??es podem iniciar sua execu??o a qualquer momento. Algumas aplica??es podem ter requisitos de tempo real, requisitando n?veis de qualidade de servi?o (QoS) do sistema. Devido ao alto grau de imprevisibilidade do uso dos recursos e o grande n?mero de componentes para se gerenciar, propriedades autoadaptativas tornam-se fundamentais para dar suporte a QoS em tempo de execu??o. A literatura fornece diversas propostas de QoS autoadaptativo, focado em recursos de comunica??o (ex., redes intra-chip), ou computa??o (ex., CPU). Contudo, para fornecer um suporte de QoS completo, ? fundamental uma autoconsci?ncia abrangente dos recursos do sistema, e assumir t?cnicas adaptativas que permitem agir em ambos os n?veis de comunica??o e computa??o para atender os requisitos das aplica??es. Para suprir essas demandas, essa Tese prop?e uma infraestrutura e t?cnicas de gerenciamento de QoS autoadaptativo, cobrindo ambos os n?veis de computa??o e comunica??o. No n?vel de computa??o, a infraestrutura para QoS consiste em um escalonador din?mico de tarefas de tempo real e um protocolo de migra??o de tarefas de baixo custo. Estas t?cnicas fornecem QoS de computa??o, devido ao gerenciamento da utiliza??o e aloca??o da CPU. A novidade do escalonador de tarefas ? o suporte a requisitos de tempo real din?micos, o que gera mais flexibilidade para as tarefas em explorar a CPU de acordo com uma carga de trabalho vari?vel. A novidade do protocolo de migra??o de tarefas ? o baixo custo no tempo de execu??o comparado a trabalhos do estado-da-arte. No n?vel de comunica??o, a t?cnica proposta ? um chaveamento por circuito (CS) baseado em redes definidas por software (SDN). O paradigma SDN para NoCs ? uma inova??o desta Tese, e ? alcan?ado atrav?s de uma arquitetura gen?rica de software e hardware. Para QoS de comunica??o, SDN ? usado para definir caminhos CS em tempo de execu??o. Essas infraestruturas de QoS s?o gerenciadas de uma forma integrada por um gerenciamento de QoS autoadaptativo, o qual segue o paradigma ODA (Observar, Decidir, Agir), implementando um la?o fechado de adapta??es em tempo de execu??o. O gerenciamento de QoS ? autoconsciente dos recursos do sistema e das aplica??es em execu??o, e pode decidir por adapta??es no n?vel de computa??o ou comunica??o, baseado em notifica??es das tarefas, monitoramento do ambiente, e monitoramento de atendimento de QoS. A autoadapta??o decide reativamente assim como proativamente. Uma t?cnica de aprendizagem do perfil das aplica??es ? proposta para tra?ar o comportamento das tarefas de tempo real, possibilitando a??es proativas. Resultados gerais mostram que o gerenciamento de QoS autoadaptativo proposto pode restaurar os n?veis de QoS para as aplica??es com um baixo custo no tempo de execu??o das aplica??es. Uma avalia??o abrangente, assumindo diversos benchmarks mostra que, mesmo sob diversas interfer?ncias de QoS nos n?veis de computa??o e comunica??o, o tempo de execu??o das aplica??es ? restaurado pr?ximo ao cen?rio ?timo, como 99,5% das viola??es de deadlines mitigadas. / Many-core systems-on-chip are the state-of-the-art in processing power, reaching from a dozen to thousands of processing elements (PE) in a single integrated circuit. General purpose many-cores assume a dynamic application admission, where the application set is unknown at design-time and applications may start their execution at any moment, inducing interference between them. Some applications may have real-time constraints to fulfill, requiring levels of quality of service (QoS) from the system. Due to the high degree of resource?s utilization unpredictability and the number of components to manage, self-adaptive properties become fundamental to support QoS at run-time. The literature provides several self-adaptive QoS proposals, targeting either communication (e.g., Network-on-Chip) or computation resources (e.g., CPU). However, to offer a complete QoS support, it is fundamental to provide a comprehensive self-awareness of the system?s resources, assuming adaptive techniques enabling to act simultaneously at the communication and computation levels to meet the applications' constraints. To cope with these requirements, this Thesis proposes a self-adaptive QoS infrastructure and management techniques, covering both the computation and communication levels. At the computation level, the QoS-driven infrastructure comprises a dynamic real-time task scheduler and a low overhead task migration protocol. These techniques ensure computation QoS by managing the CPU utilization and allocation. The novelty of the task scheduler is the support for dynamic real time constraints, which leverage more flexibility to tasks to explore the CPU according to a variable workload. The novelty of the task migration protocol is its low execution time overhead compared to the state-of-the-art. At the communication level, the proposed technique is a Circuit-Switching (CS) approach based on the Software Defined Networking (SDN) paradigm. The SDN paradigm for NoCs is an innovation of this Thesis and is achieved through a generic software and hardware architecture. For communication QoS, SDN is used to define CS paths at run-time. A self-adaptive QoS management following the ODA (Observe Decide Act) paradigm controls these QoS-driven infrastructures in an integrated way, implementing a closed loop for run time adaptations. The QoS management is self-aware of the system and running applications and can decide to take adaptations at computation or communication levels based on the task feedbacks, environment monitoring, and QoS fulfillment monitoring. The self-adaptation decides reactively as well as proactively. An online application profile learning technique is proposed to trace the behavior of the RT tasks and enabling the proactive actions. Results show that the proposed self-adaptive QoS management can restore the QoS level for the applications with a low overhead over the applications execution time. A broad evaluation, using known benchmarks, shows that even under severe QoS disturbances at computation and communication levels, the execution time of the application is restored near to the optimal scenario, mitigating 99.5% of deadline misses.
60

Convolutional neural network reliability on an APSoC platform a traffic-sign recognition case study / Confiabilidade de uma rede neural convolucional em uma plataforma APSoC: um estudo para reconhecimento de placas de trânsito

Lopes, Israel da Costa January 2017 (has links)
O aprendizado profundo tem inúmeras aplicações na visão computacional, reconhecimento de fala, processamento de linguagem natural e outras aplicações de interesse comercial. A visão computacional, por sua vez, possui muitas aplicações em áreas distintas, indo desde o entretenimento à aplicações relevantes e críticas. O reconhecimento e manipulação de faces (Snapchat), e a descrição de objetos em fotos (OneDrive) são exemplos de aplicações no entretenimento. Ao passo que, a inspeção industrial, o diagnóstico médico, o reconhecimento de objetos em imagens capturadas por satélites (usadas em missões de resgate e defesa), os carros autônomos e o Sistema Avançado de Auxílio ao Motorista (SAAM) são exemplos de aplicações relevantes e críticas. Algumas das empresas de circuitos integrados mais importantes do mundo, como Xilinx, Intel e Nvidia estão apostando em plataformas dedicadas para acelerar o treinamento e a implementação de algoritmos de aprendizado profundo e outras alternativas de visão computacional para carros autônomos e SAAM devido às suas altas necessidades computacionais. Assim, implementar sistemas de aprendizado profundo que alcançam alto desempenho com o custo de baixa utilização de área e dissipação de potência é um grande desafio. Além do mais, os circuitos eletrônicos para a indústria automotiva devem ser confiáveis mesmo sob efeitos da radiação, defeitos de fabricação e efeitos do envelhecimento. Assim, um gerador automático de VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL) para Redes Neurais Convolucionais (RNC) foi desenvolvido para reduzir o tempo associado a implementação de algoritmos de aprendizado profundo em hardware. Como estudo de caso, uma RNC foi treinada pela ferramenta Convolutional Architecture for Fast Feature Embedding (Caffe), de modo a classificar 6 classes de placas de trânsito, alcançando uma precisão de cerca de 89,8% no conjunto de dados German Traffic-Sign Recognition Benchmark (GTSRB), que contém imagens de placas de trânsito em cenários complexos. Essa RNC foi implementada num All-Programmable System-on- Chip (APSoC) Zynq-7000, resultando em 313 Frames Por Segundo (FPS) em imagens normalizadas para 32x32, com o APSoC dissipando uma potência de somente 2.057 W, enquanto uma Graphics Processing Unit (GPU) embarcada, em seu modo de operação mínimo, dissipa 10 W. A confiabilidade da RNC proposta foi investigada por injeções de falhas acumuladas e aleatórias por emulação nos bits de configuração da Lógica Programável (LP) do APSoC, alcançando uma confiabilidade de 80,5% sob Single-Bit-Upset (SBU) onde foram considerados ambos os Dados Corrompidos Silenciosos (DCSs) críticos e os casos em que o sistema não respondeu no tempo esperado (time-outs). Em relação às falhas múltiplas, a confiabilidade da RNC decresce exponencialmente com o número de falhas acumuladas. Em vista disso, a confiabilidade da RNC proposta deve ser aumentada através do uso de técnicas de proteção durante o fluxo de projeto. / Deep learning has a plethora of applications in computer vision, speech recognition, natural language processing and other applications of commercial interest. Computer vision, in turn, has many applications in distinct areas, ranging from entertainment applications to relevant and critical applications. Face recognition and manipulation (Snapchat), and object description in pictures (OneDrive) are examples of entertainment applications. Industrial inspection, medical diagnostics, object recognition in images captured by satellites (used in rescue and defense missions), autonomous cars and Advanced Driver-Assistance System (ADAS) are examples of relevant and critical applications. Some of the most important integrated circuit companies around the world, such as Xilinx, Intel and Nvidia are waging in dedicated platforms for accelerating the training and deployment of deep learning and other computer vision algorithms for autonomous cars and ADAS due to their high computational requirement. Thus, implementing a deep learning system that achieves high performance with low area utilization and power consumption costs is a big challenge. Besides, electronic equipment for automotive industry must be reliable even under radiation effects, manufacturing defects and aging effects, inasmuch as if a system failure occurs, a car accident can happen. Thus, a Convolutional Neural Network (CNN) VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL) automatic generator was developed to reduce the design time associated to the implementation of deep learning algorithms in hardware. As a case study, a CNN was trained by the Convolutional Architecture for Fast Feature Embedding (Caffe) framework, in order to classify 6 traffic-sign classes, achieving an average accuracy of about 89.8% on the German Traffic-Sign Recognition Benchmark (GTSRB) dataset, which contains trafficsigns images in complex scenarios. This CNN was implemented on a Zynq-7000 All- Programmable System-on-Chip (APSoC), achieving about 313 Frames Per Second (FPS) on 32x32-normalized images, with the APSoC consuming only 2.057W, while an embedded Graphics Processing Unit (GPU), in its minimum operation mode, consumes 10W. The proposed CNN reliability was investigated by random piled-up fault injection by emulation in the Programming Logic (PL) configuration bits of the APSoC, achieving 80.5% of reliability under Single-Bit-Upset (SBU) where both critical Silent Data Corruptions (SDCs) and time-outs were considered. Regarding the multiple faults, the proposed CNN reliability exponentially decreases with the number of piled-up faults. Hence, the proposed CNN reliability must be increased by using hardening techniques during the design flow.

Page generated in 0.0609 seconds